From owner-freebsd-performance@FreeBSD.ORG Sun Aug 19 00:58:11 2012
From: "Gezeala M. Bacuño II" <gezeala@gmail.com>
Date: Sat, 18 Aug 2012 17:57:50 -0700
To: Alan Cox
Cc: alc@freebsd.org, freebsd-performance@freebsd.org, Andrey Zonov, kib@freebsd.org
Subject: Re: vm.kmem_size_max and vm.kmem_size capped at 329853485875 (~307GB)
On Sat, Aug 18, 2012 at 12:14 PM, Alan Cox wrote:
> On 08/17/2012 17:08, Gezeala M. Bacuño II wrote:
>> On Fri, Aug 17, 2012 at 1:58 PM, Alan Cox wrote:
>>> vm.kmem_size controls the maximum size of the kernel's heap, i.e., the
>>> region where the kernel's slab and malloc()-like memory allocators obtain
>>> their memory.  While this heap may occupy the largest portion of the
>>> kernel's virtual address space, it cannot occupy the entirety of the
>>> address space.  There are other things that must be given space within
>>> the kernel's address space, for example, the file system buffer map.
>>>
>>> ZFS does not, however, use the regular file system buffer cache.  The ARC
>>> takes its place, and the ARC abuses the kernel's heap like nothing else.
>>> So, if you are running a machine that only makes trivial use of a non-ZFS
>>> file system, like you boot from UFS, but store all of your data in ZFS,
>>> then you can dramatically reduce the size of the buffer map via boot
>>> loader tuneables and proportionately increase vm.kmem_size.
>>>
>>> Any further increases in the kernel virtual address space size will,
>>> however, require code changes.  Small changes, but changes nonetheless.
>>>
>>> Alan
>>>
>> <>
>>
>>>> Additional Info:
>>>> 1] Installed using PCBSD-9 Release amd64.
>>>>
>>>> 2] uname -a
>>>> FreeBSD fmt-iscsi-stg1.musicreports.com 9.0-RELEASE FreeBSD
>>>> 9.0-RELEASE #3: Tue Dec 27 14:14:29 PST 2011
>>>> root@build9x64.pcbsd.org:/usr/obj/builds/amd64/pcbsd-build90/fbsd-source/9.0/sys/GENERIC
>>>> amd64
>>>>
>>>> 3] first few lines from /var/run/dmesg.boot:
>>>> FreeBSD 9.0-RELEASE #3: Tue Dec 27 14:14:29 PST 2011
>>>> root@build9x64.pcbsd.org:/usr/obj/builds/amd64/pcbsd-build90/fbsd-source/9.0/sys/GENERIC
>>>> amd64
>>>> CPU: Intel(R) Xeon(R) CPU E7- 8837 @ 2.67GHz (2666.82-MHz K8-class CPU)
>>>>   Origin = "GenuineIntel"  Id = 0x206f2  Family = 6  Model = 2f  Stepping = 2
>>>>   Features=0xbfebfbff
>>>>   Features2=0x29ee3ff
>>>>   AMD Features=0x2c100800
>>>>   AMD Features2=0x1
>>>>   TSC: P-state invariant, performance statistics
>>>> real memory  = 549755813888 (524288 MB)
>>>> avail memory = 530339893248 (505771 MB)
>>>> Event timer "LAPIC" quality 600
>>>> ACPI APIC Table:
>>>> FreeBSD/SMP: Multiprocessor System Detected: 64 CPUs
>>>> FreeBSD/SMP: 8 package(s) x 8 core(s)
>>>>
>>>> 4] relevant sysctl's with manual tuning:
>>>> kern.maxusers: 384
>>>> kern.maxvnodes: 8222162
>>>> vfs.numvnodes: 675740
>>>> vfs.freevnodes: 417524
>>>> kern.ipc.somaxconn: 128
>>>> kern.openfiles: 5238
>>>> vfs.zfs.arc_max: 428422987776
>>>> vfs.zfs.arc_min: 53552873472
>>>> vfs.zfs.arc_meta_used: 3167391088
>>>> vfs.zfs.arc_meta_limit: 107105746944
>>>> vm.kmem_size_max: 429496729600  ==>> manually tuned
>>>> vm.kmem_size: 429496729600  ==>> manually tuned
>>>> vm.kmem_map_free: 107374727168
>>>> vm.kmem_map_size: 144625156096
>>>> vfs.wantfreevnodes: 2055540
>>>> kern.minvnodes: 2055540
>>>> kern.maxfiles: 197248  ==>> manually tuned
>>>> vm.vmtotal:
>>>> System wide totals computed every five seconds: (values in kilobytes)
>>>> ===============================================
>>>> Processes: (RUNQ: 1 Disk Wait: 1 Page Wait: 0 Sleep: 150)
>>>> Virtual Memory: (Total: 1086325716K Active: 12377876K)
>>>> Real Memory: (Total: 144143408K Active: 803432K)
>>>> Shared Virtual Memory: (Total: 81384K Active: 37560K)
>>>> Shared Real Memory: (Total: 32224K Active: 27548K)
>>>> Free Memory Pages: 365565564K
>>>>
>>>> hw.availpages: 134170294
>>>> hw.physmem: 549561524224
>>>> hw.usermem: 391395241984
>>>> hw.realmem: 551836188672
>>>> vm.kmem_size_scale: 1
>>>> kern.ipc.nmbclusters: 2560000  ==>> manually tuned
>>>> kern.ipc.maxsockbuf: 2097152
>>>> net.inet.tcp.sendbuf_max: 2097152
>>>> net.inet.tcp.recvbuf_max: 2097152
>>>> kern.maxfilesperproc: 18000
>>>> net.inet.ip.intr_queue_maxlen: 256
>>>> kern.maxswzone: 33554432
>>>> kern.ipc.shmmax: 10737418240  ==>> manually tuned
>>>> kern.ipc.shmall: 2621440  ==>> manually tuned
>>>> vfs.zfs.write_limit_override: 0
>>>> vfs.zfs.prefetch_disable: 0
>>>> hw.pagesize: 4096
>>>> hw.availpages: 134170294
>>>> kern.ipc.maxpipekva: 8586895360
>>>> kern.ipc.shm_use_phys: 1  ==>> manually tuned
>>>> vfs.vmiodirenable: 1
>>>> debug.numcache: 632148
>>>> vfs.ncsizefactor: 2
>>>> vm.kvm_size: 549755809792
>>>> vm.kvm_free: 54456741888
>>>> kern.ipc.semmni: 256
>>>> kern.ipc.semmns: 512
>>>> kern.ipc.semmnu: 256
>>>>
>> Thanks. It will be mainly used for PostgreSQL and Java. We have a huge
>> db (3TB and growing) and we need to have as much of it as we can in
>> ZFS's ARC. All data resides on zpools while root is on UFS. On our 8.2
>> and 9 machines vm.kmem_size is always auto-tuned to almost the same
>> size as our installed RAM. What I've tuned on those machines is to
>> lower vfs.zfs.arc_max to 50% or 75% of vm.kmem_size; that has worked
>> well for us, and the machines do not swap out. Now on this machine, I
>> do think that I need to adjust my formula for tuning vfs.zfs.arc_max;
>> 25% for other stuff is probably overkill.
>>
>> We were able to successfully bump vm.kmem_size_max and vm.kmem_size to
>> 400GB:
>> vm.kmem_size_max: 429496729600  ==>> manually tuned
>> vm.kmem_size: 429496729600  ==>> manually tuned
>> vfs.zfs.arc_max: 428422987776  ==>> auto-tuned (vm.kmem_size - 1G)
>> vfs.zfs.arc_min: 53552873472  ==>> auto-tuned
>>
>> Which other tuneables do I need to set in /boot/loader.conf so we can
>> boot the machine with vm.kmem_size > 400G? As I don't know which part
>> of the boot-up process is failing with vm.kmem_size/_max set to 450G
>> or 500G, I have no idea what to tune next.
>
> Your objective should be to reduce the value of "sysctl vfs.maxbufspace".
> You can do this by setting the loader.conf tuneable "kern.maxbcache" to
> the desired value.
>
> What does your machine currently report for "sysctl vfs.maxbufspace"?
>
Here you go:
vfs.maxbufspace: 54967025664
kern.maxbcache: 0

Other (probably) relevant values:
vfs.hirunningspace: 16777216
vfs.lorunningspace: 11206656
vfs.bufdefragcnt: 0
vfs.buffreekvacnt: 2
vfs.bufreusecnt: 320149
vfs.hibufspace: 54966370304
vfs.lobufspace: 54966304768
vfs.maxmallocbufspace: 2748318515
vfs.bufmallocspace: 0
vfs.bufspace: 10490478592
vfs.runningbufspace: 0

Let me know if you need other tuneables or sysctl values. Thanks a lot
for looking into this.
From owner-freebsd-performance@FreeBSD.ORG Mon Aug 20 15:22:32 2012
From: Alan Cox <alc@rice.edu>
Date: Mon, 20 Aug 2012 10:22:28 -0500
To: "Gezeala M. Bacuño II"
Cc: alc@freebsd.org, freebsd-performance@freebsd.org, Andrey Zonov, kib@freebsd.org
Subject: Re: vm.kmem_size_max and vm.kmem_size capped at 329853485875 (~307GB)

On 08/18/2012 19:57, Gezeala M. Bacuño II wrote:
> On Sat, Aug 18, 2012 at 12:14 PM, Alan Cox wrote:
>> <>
>>
>> Your objective should be to reduce the value of "sysctl vfs.maxbufspace".
>> You can do this by setting the loader.conf tuneable "kern.maxbcache" to
>> the desired value.
>>
>> What does your machine currently report for "sysctl vfs.maxbufspace"?
>>
> Here you go:
> vfs.maxbufspace: 54967025664
> kern.maxbcache: 0

Try setting kern.maxbcache to two billion and adding 50 billion to the
setting of vm.kmem_size{,_max}.

> Other (probably) relevant values:
> vfs.hirunningspace: 16777216
> vfs.lorunningspace: 11206656
> vfs.bufdefragcnt: 0
> vfs.buffreekvacnt: 2
> vfs.bufreusecnt: 320149
> vfs.hibufspace: 54966370304
> vfs.lobufspace: 54966304768
> vfs.maxmallocbufspace: 2748318515
> vfs.bufmallocspace: 0
> vfs.bufspace: 10490478592
> vfs.runningbufspace: 0
>
> Let me know if you need other tuneables or sysctl values. Thanks a lot
> for looking into this.
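Spelled out with the numbers from this thread (a sketch only; "two billion" and "50 billion" are the suggestions for this particular 512 GB machine, not a general formula), the retuning amounts to:

```shell
# Sketch of the suggested retuning, using this thread's numbers only:
# cap the buffer map via kern.maxbcache and hand the freed kernel
# address space back to the kernel heap (vm.kmem_size{,_max}).

KMEM_OLD=429496729600                 # vm.kmem_size{,_max} currently in loader.conf (400 GiB)
MAXBCACHE=2000000000                  # suggested kern.maxbcache, in bytes
KMEM_NEW=$((KMEM_OLD + 50000000000))  # "adding 50 billion"

# The corresponding /boot/loader.conf lines:
printf 'kern.maxbcache="%s"\n'   "$MAXBCACHE"
printf 'vm.kmem_size="%s"\n'     "$KMEM_NEW"
printf 'vm.kmem_size_max="%s"\n' "$KMEM_NEW"
```

This yields vm.kmem_size{,_max} of 479496729600 bytes, which is the value the thread later reports booting successfully with.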
From owner-freebsd-performance@FreeBSD.ORG Mon Aug 20 16:07:34 2012
From: "Gezeala M. Bacuño II" <gezeala@gmail.com>
Date: Mon, 20 Aug 2012 09:07:12 -0700
To: Alan Cox
Cc: alc@freebsd.org, freebsd-performance@freebsd.org, Andrey Zonov, kib@freebsd.org
Subject: Re: vm.kmem_size_max and vm.kmem_size capped at 329853485875 (~307GB)

On Mon, Aug 20, 2012 at 8:22 AM, Alan Cox wrote:
> On 08/18/2012 19:57, Gezeala M. Bacuño II wrote:
>> <>
>>
>> Here you go:
>> vfs.maxbufspace: 54967025664
>> kern.maxbcache: 0
>
> Try setting kern.maxbcache to two billion and adding 50 billion to the
> setting of vm.kmem_size{,_max}.
>
Thank you. We'll try this and post back results.

From owner-freebsd-performance@FreeBSD.ORG Tue Aug 21 01:27:19 2012
From: "Gezeala M. Bacuño II" <gezeala@gmail.com>
Date: Mon, 20 Aug 2012 18:26:58 -0700
To: Alan Cox
Cc: alc@freebsd.org, freebsd-performance@freebsd.org, Andrey Zonov, kib@freebsd.org
Subject: Re: vm.kmem_size_max and vm.kmem_size capped at 329853485875 (~307GB)

On Mon, Aug 20, 2012 at 9:07 AM, Gezeala M. Bacuño II wrote:
> On Mon, Aug 20, 2012 at 8:22 AM, Alan Cox wrote:
>> <>
>>
>> Try setting kern.maxbcache to two billion and adding 50 billion to the
>> setting of vm.kmem_size{,_max}.
>>

2 : 50 ==>> is this the ratio for further tuning
kern.maxbcache:vm.kmem_size? Is kern.maxbcache also in bytes?

After setting the following as instructed, the machine started
successfully with 446GB for vm.kmem_size/_max:

kern.maxbcache: 2000000000
vm.kmem_size_max: 479496729600
vm.kmem_size: 479496729600  ## auto-tuned
vfs.maxbufspace: 1999994880
...
vfs.hirunningspace: 16777216
vfs.lorunningspace: 11206656
vfs.bufdefragcnt: 0
vfs.buffreekvacnt: 2
vfs.bufreusecnt: 11511
vfs.hibufspace: 1999339520
vfs.lobufspace: 1999273984
vfs.maxmallocbufspace: 99966976
vfs.bufmallocspace: 0
vfs.bufspace: 377028608
vfs.runningbufspace: 0

## additional manual tuning
vfs.zfs.arc_max: 455521893120
vfs.zfs.arc_min: 227760946560
kern.ipc.semmni: 256
kern.ipc.semmns: 512
kern.ipc.semmnu: 256
kern.ipc.shm_use_phys: 1
kern.ipc.shmmax: 24000000000
kern.ipc.shmall: 5859375
kern.ipc.nmbclusters: 2560000
kern.maxfiles: 197248

We'll do some further tests and report back if there are any issues.
Thanks a lot!!

From owner-freebsd-performance@FreeBSD.ORG Tue Aug 21 09:00:54 2012
From: Ivan Voras
Date: Tue, 21 Aug 2012 11:00:39 +0200
To: freebsd-performance@freebsd.org
Subject: Re: vm.kmem_size_max and vm.kmem_size capped at 329853485875 (~307GB)

On 20/08/2012 17:22, Alan Cox wrote:
> Try setting kern.maxbcache to two billion and adding 50 billion to the
> setting of vm.kmem_size{,_max}.

Just as a side note: unless it has some side effects, it is probably
worth increasing these tunables by default, as RAM is very cheap again.
512 GB in a machine can be bought for less than $10,000.
From owner-freebsd-performance@FreeBSD.ORG Tue Aug 21 23:24:55 2012
Message-ID: <503418C0.5000901@rice.edu>
Date: Tue, 21 Aug 2012 18:24:48 -0500
From: Alan Cox <alc@rice.edu>
To: "Gezeala M. Bacuño II"
Cc: alc@freebsd.org, freebsd-performance@freebsd.org, Andrey Zonov, kib@freebsd.org
Subject: Re: vm.kmem_size_max and vm.kmem_size capped at 329853485875 (~307GB)

On 8/20/2012 8:26 PM, Gezeala M. Bacuño II wrote:
> On Mon, Aug 20, 2012 at 9:07 AM, Gezeala M. Bacuño II wrote:
>> On Mon, Aug 20, 2012 at 8:22 AM, Alan Cox wrote:
>>> On 08/18/2012 19:57, Gezeala M. Bacuño II wrote:
>>>> On Sat, Aug 18, 2012 at 12:14 PM, Alan Cox wrote:
>> <>
>>>>> Your objective should be to reduce the value of "sysctl vfs.maxbufspace".
>>>>> You can do this by setting the loader.conf tuneable "kern.maxbcache" to
>>>>> the desired value.
>>>>>
>>>>> What does your machine currently report for "sysctl vfs.maxbufspace"?
>>>>>
>>>> Here you go:
>>>> vfs.maxbufspace: 54967025664
>>>> kern.maxbcache: 0
>>>
>>> Try setting kern.maxbcache to two billion and adding 50 billion to the
>>> setting of vm.kmem_size{,_max}.
>>>
> 2 : 50 ==>> is this the ratio for further tuning
> kern.maxbcache:vm.kmem_size? Is kern.maxbcache also in bytes?
>
No, this is not a ratio.  Yes, kern.maxbcache is in bytes.  Basically, for
every byte that you subtract from vfs.maxbufspace, through setting
kern.maxbcache, you can add a byte to vm.kmem_size{,_max}.
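[Editorial note: the byte-for-byte tradeoff described above can be sketched numerically. This is an illustration, not tested settings: the vfs.maxbufspace value is the one reported in this thread and the two-billion kern.maxbcache figure is Alan's suggestion, but the printed loader.conf-style line is only a candidate.]

```shell
#!/bin/sh
# Sketch of the byte-for-byte tradeoff between the buffer map and the
# kernel heap.  Illustration only, not tested settings.

maxbufspace=54967025664     # current "sysctl vfs.maxbufspace" from the thread
new_maxbcache=2000000000    # proposed kern.maxbcache cap, in bytes

# Every byte subtracted from vfs.maxbufspace (via kern.maxbcache) can be
# added to vm.kmem_size{,_max}.
freed=$((maxbufspace - new_maxbcache))

echo "# candidate /boot/loader.conf line (illustrative only):"
echo "kern.maxbcache=\"$new_maxbcache\""
echo "# vm.kmem_size{,_max} may then grow by up to $freed bytes"
```

Note the direction of the tradeoff: the buffer map is shrunk first, and only the bytes actually freed from vfs.maxbufspace are handed to vm.kmem_size{,_max}.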
Alan

From owner-freebsd-performance@FreeBSD.ORG Wed Aug 22 17:09:37 2012
Date: Wed, 22 Aug 2012 10:09:15 -0700
From: "Gezeala M. Bacuño II" <gezeala@gmail.com>
To: Alan Cox
Cc: alc@freebsd.org, freebsd-performance@freebsd.org, Andrey Zonov, kib@freebsd.org
Subject: Re: vm.kmem_size_max and vm.kmem_size capped at 329853485875 (~307GB)

On Tue, Aug 21, 2012 at 4:24 PM, Alan Cox wrote:
> On 8/20/2012 8:26 PM, Gezeala M. Bacuño II wrote:
>> <>
>> 2 : 50 ==>> is this the ratio for further tuning
>> kern.maxbcache:vm.kmem_size? Is kern.maxbcache also in bytes?
>>
> No, this is not a ratio.  Yes, kern.maxbcache is in bytes.  Basically, for
> every byte that you subtract from vfs.maxbufspace, through setting
> kern.maxbcache, you can add a byte to vm.kmem_size{,_max}.
>
> Alan
>
Great! Thanks. Are there other sysctls aside from vfs.bufspace that I
should monitor for vfs.maxbufspace usage? I just want to make sure
that vfs.maxbufspace is sufficient for our needs.
From owner-freebsd-performance@FreeBSD.ORG Thu Aug 23 19:02:56 2012
Message-ID: <50367E5D.1020702@rice.edu>
Date: Thu, 23 Aug 2012 14:02:53 -0500
From: Alan Cox <alc@rice.edu>
To: "Gezeala M. Bacuño II"
Cc: alc@freebsd.org, freebsd-performance@freebsd.org, Andrey Zonov, kib@freebsd.org
Subject: Re: vm.kmem_size_max and vm.kmem_size capped at 329853485875 (~307GB)

On 08/22/2012 12:09, Gezeala M. Bacuño II wrote:
> <>
> Great! Thanks. Are there other sysctls aside from vfs.bufspace that I
> should monitor for vfs.maxbufspace usage? I just want to make sure
> that vfs.maxbufspace is sufficient for our needs.

You might keep an eye on "sysctl vfs.bufdefragcnt".  If it starts rapidly
increasing, you may want to increase vfs.maxbufspace.

Alan
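[Editorial note: the advice above about watching vfs.bufdefragcnt can be turned into a small rate check. In this sketch, only the sysctl name comes from the thread; the helper name, the sample values, and the 60-second interval are made-up illustrations. On the machine itself, the two samples would be collected with "sysctl -n vfs.bufdefragcnt".]

```shell
#!/bin/sh
# Sketch: given two samples of vfs.bufdefragcnt taken some seconds apart,
# report how fast the counter is growing.  A sustained nonzero rate would
# suggest vfs.maxbufspace is too small.

defrag_rate() {
    # $1 = earlier sample, $2 = later sample, $3 = seconds between samples
    echo $(( ($2 - $1) / $3 ))
}

# Illustrative samples: counter went from 120 to 720 over 60 seconds.
defrag_rate 120 720 60    # prints 10 (defragmentations per second)
```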