From owner-freebsd-fs@FreeBSD.ORG Sun Jul 17 00:26:20 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8E204106564A for ; Sun, 17 Jul 2011 00:26:20 +0000 (UTC) (envelope-from luke@digital-crocus.com) Received: from mail.digital-crocus.com (node2.digital-crocus.com [91.209.244.128]) by mx1.freebsd.org (Postfix) with ESMTP id 0B92B8FC0A for ; Sun, 17 Jul 2011 00:26:19 +0000 (UTC) DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=dkselector; d=hybrid-logic.co.uk; h=Received:Received:Subject:From:Reply-To:To:Cc:In-Reply-To:References:Content-Type:Organization:Date:Message-ID:Mime-Version:X-Mailer:Content-Transfer-Encoding:X-Spam-Score:X-Digital-Crocus-Maillimit:X-Authenticated-Sender:X-Complaints:X-Admin:X-Abuse; b=vKzog1iHu3dv0DkC2r6M6+FtxEMSe2x+WQNy2r3u8vATIqE14BrqnQiX2BGRP1zuqkhh53C11ymsjO0dq3lCF41B0GwC5DYX+sv3EL3iOFmUHaFyJru5YfDFPQadfh6L; Received: from luke by mail.digital-crocus.com with local (Exim 4.69 (FreeBSD)) (envelope-from ) id 1QiFAn-000FH8-Lr for freebsd-fs@freebsd.org; Sun, 17 Jul 2011 01:25:29 +0100 Received: from 127cr.net ([78.105.122.99] helo=[192.168.1.23]) by mail.digital-crocus.com with esmtpa (Exim 4.69 (FreeBSD)) (envelope-from ) id 1QiFAm-000FGu-Mg; Sun, 17 Jul 2011 01:25:29 +0100 From: Luke Marsden To: Martin Matuska In-Reply-To: <4E20B8F3.5060603@FreeBSD.org> References: <1310733049.26698.69.camel@behemoth> <4E20B8F3.5060603@FreeBSD.org> Content-Type: text/plain; charset="UTF-8" Organization: Hybrid Web Cluster Date: Sun, 17 Jul 2011 01:26:15 +0100 Message-ID: <1310862375.23429.57.camel@pow> Mime-Version: 1.0 X-Mailer: Evolution 2.32.2 Content-Transfer-Encoding: 7bit X-Spam-Score: -1.0 X-Digital-Crocus-Maillimit: done X-Authenticated-Sender: luke X-Complaints: abuse@digital-crocus.com X-Admin: admin@digital-crocus.com X-Abuse: abuse@digital-crocus.com (Please include full headers in abuse reports) Cc: freebsd-fs@freebsd.org, tech@hybrid-logic.co.uk Subject: Re: Experiences with ZFS v28 - including deadlock X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: luke@hybrid-logic.co.uk List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 17 Jul 2011 00:26:20 -0000 Hi Martin, Thank you for your email! On Sat, 2011-07-16 at 00:02 +0200, Martin Matuska wrote: > regarding the incremental receive, does the mount happen even if using > the "-u" option to the zfs receive command? > > The manpage for zfs (receive section) says: > > -u > File system that is associated with the received stream is not mounted. Thanks for the hint -- we're not using that option, good to know! It must be fairly new. Do you have any ideas about the deadlock? We saw much better performance on v28, so we're keen to put it into production, and that was the only deal-breaking issue we saw with it. If there's anything I can do to help track it down, let me know. -- Best Regards, Luke Marsden CTO, Hybrid Logic Ltd. 
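For readers following along, a minimal sketch of what such an incremental replication step looks like with -u is below; the pool, dataset and host names are invented for the example and are not taken from this thread:

  # send the delta between two snapshots and receive it on the target
  # without mounting the updated filesystem there
  zfs send -i tank/sites@snap1 tank/sites@snap2 | \
      ssh standby zfs receive -u tank/sites

The -u flag only suppresses the mount on the receiving side; the rest of the replication behaves as before.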
Web: http://www.hybrid-cluster.com/ Hybrid Web Cluster - cloud web hosting Phone: +447791750420 From owner-freebsd-fs@FreeBSD.ORG Mon Jul 18 11:02:42 2011 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 94AF6106566B for ; Mon, 18 Jul 2011 11:02:42 +0000 (UTC) (envelope-from universite@ukr.net) Received: from otrada.od.ua (universite-1-pt.tunnel.tserv24.sto1.ipv6.he.net [IPv6:2001:470:27:140::2]) by mx1.freebsd.org (Postfix) with ESMTP id 1CB998FC18 for ; Mon, 18 Jul 2011 11:02:41 +0000 (UTC) Received: from [IPv6:2001:470:28:140:c1cb:7786:283:60de] ([IPv6:2001:470:28:140:c1cb:7786:283:60de]) (authenticated bits=0) by otrada.od.ua (8.14.4/8.14.4) with ESMTP id p6IB2Z8p070143 for ; Mon, 18 Jul 2011 14:02:35 +0300 (EEST) (envelope-from universite@ukr.net) Message-ID: <4E2412C2.5000202@ukr.net> Date: Mon, 18 Jul 2011 14:02:26 +0300 From: "Vladislav V. Prodan" User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; ru; rv:1.9.2.18) Gecko/20110616 Thunderbird/3.1.11 MIME-Version: 1.0 To: fs@freebsd.org Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-92.0 required=5.0 tests=FREEMAIL_FROM,FSL_RU_URL, RDNS_NONE,SPF_SOFTFAIL,TO_NO_BRKTS_DIRECT,T_TO_NO_BRKTS_FREEMAIL, USER_IN_WHITELIST autolearn=no version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mary-teresa.otrada.od.ua X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (otrada.od.ua [IPv6:2001:470:28:140::5]); Mon, 18 Jul 2011 14:02:40 +0300 (EEST) Cc: Subject: [ZFS] Prompt, which is lost space in the ZFS pool? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Jul 2011 11:02:42 -0000 FreeBSD 8.2-STABLE #0: Tue Jun 28 14:40:44 EEST 2011 amd64 # zpool list tank NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT tank 1,34T 1,08T 270G 80% 1.00x ONLINE - # zdb tank Cached configuration: version: 15 name: 'tank' state: 0 txg: 3933453 pool_guid: 15415411259239146062 hostid: 814717323 hostname: 'second.xxx.com' vdev_tree: type: 'root' id: 0 guid: 15415411259239146062 children[0]: type: 'mirror' id: 0 guid: 16020562126957161505 whole_disk: 0 metaslab_array: 23 metaslab_shift: 33 ashift: 9 asize: 1483117166592 is_log: 0 children[0]: type: 'disk' id: 0 guid: 11217068100198816386 path: '/dev/gpt/disk0' whole_disk: 0 DTL: 120 children[1]: type: 'disk' id: 1 guid: 4665162630340381592 path: '/dev/gpt/disk1' whole_disk: 0 DTL: 118 MOS Configuration: version: 15 name: 'tank' state: 0 txg: 3933453 pool_guid: 15415411259239146062 hostid: 814717323 hostname: 'second.xxx.com' vdev_tree: type: 'root' id: 0 guid: 15415411259239146062 children[0]: type: 'mirror' id: 0 guid: 16020562126957161505 whole_disk: 0 metaslab_array: 23 metaslab_shift: 33 ashift: 9 asize: 1483117166592 is_log: 0 children[0]: type: 'disk' id: 0 guid: 11217068100198816386 path: '/dev/gpt/disk0' whole_disk: 0 DTL: 120 children[1]: type: 'disk' id: 1 guid: 4665162630340381592 path: '/dev/gpt/disk1' whole_disk: 0 DTL: 118 #df -h Filesystem Size Used Avail Capacity Mounted on tank 249G 846M 249G 0% / devfs 1.0k 1.0k 0B 100% /dev tank/backup 646G 397G 249G 61% /backup tank/backup/router 256G 7.9G 249G 3% /backup/router tank/backup/third-server 626G 377G 249G 60% /backup/third-server tank/home 252G 3.2G 249G 1% /home tank/tmp 249G 28M 249G 0% 
/tmp tank/usr 251G 2.5G 249G 1% /usr tank/usr/home 249G 29k 249G 0% /usr/home tank/usr/ports 250G 1.1G 249G 0% /usr/ports tank/usr/src 249G 314M 249G 0% /usr/src tank/var 249G 9.6M 249G 0% /var tank/var/crash 249G 21k 249G 0% /var/crash tank/var/db 249G 158M 249G 0% /var/db tank/mysql 250G 1.1G 249G 0% /var/db/mysql tank/mysql/ibdata 252G 3.2G 249G 1% /var/db/mysql/ibdata tank/mysql/iblogs 249G 10M 249G 0% /var/db/mysql/iblogs tank/var/log 249G 258M 249G 0% /var/log tank/var/mail 249G 734k 249G 0% /var/mail tank/var/run 249G 106k 249G 0% /var/run tank/var/tmp 249G 14M 249G 0% /var/tmp tank/www 270G 21G 249G 8% /www zfs list -t snapshot shows that the snapshots occupy no more than 30GB in total. Where did about 600GB of free space go? -- Vladislav V. Prodan VVP24-UANIC +380[67]4584408 +380[99]4060508 vlad11@jabber.ru From owner-freebsd-fs@FreeBSD.ORG Mon Jul 18 11:07:02 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 852201065711 for ; Mon, 18 Jul 2011 11:07:02 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 697278FC16 for ; Mon, 18 Jul 2011 11:07:02 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p6IB72YW026769 for ; Mon, 18 Jul 2011 11:07:02 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p6IB71am026767 for freebsd-fs@FreeBSD.org; Mon, 18 Jul 2011 11:07:01 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 18 Jul 2011 11:07:01 GMT Message-Id: <201107181107.p6IB71am026767@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-fs@FreeBSD.org Cc: Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Jul 2011 11:07:02 -0000 Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. Description -------------------------------------------------------------------------------- o kern/158839 fs [zfs] ZFS Bootloader Fails if there is a Dead Disk o kern/158802 fs [amd] amd(8) ICMP storm and unkillable process.
o kern/158711 fs [ffs] [panic] panic in ffs_blkfree and ffs_valloc o kern/158231 fs [nullfs] panic on unmounting nullfs mounted over ufs o f kern/157929 fs [nfs] NFS slow read o kern/157728 fs [zfs] zfs (v28) incremental receive may leave behind t o kern/157722 fs [geli] unable to newfs a geli encrypted partition o kern/157399 fs [zfs] trouble with: mdconfig force delete && zfs strip o kern/157179 fs [zfs] zfs/dbuf.c: panic: solaris assert: arc_buf_remov o kern/156933 fs [zfs] ZFS receive after read on readonly=on filesystem o kern/156797 fs [zfs] [panic] Double panic with FreeBSD 9-CURRENT and o kern/156781 fs [zfs] zfs is losing the snapshot directory, p kern/156545 fs [ufs] mv could break UFS on SMP systems o kern/156193 fs [ufs] [hang] UFS snapshot hangs && deadlocks processes o kern/156168 fs [nfs] [panic] Kernel panic under concurrent access ove o kern/156039 fs [nullfs] [unionfs] nullfs + unionfs do not compose, re o kern/155615 fs [zfs] zfs v28 broken on sparc64 -current o kern/155587 fs [zfs] [panic] kernel panic with zfs o kern/155411 fs [regression] [8.2-release] [tmpfs]: mount: tmpfs : No o kern/155199 fs [ext2fs] ext3fs mounted as ext2fs gives I/O errors o bin/155104 fs [zfs][patch] use /dev prefix by default when importing o kern/154930 fs [zfs] cannot delete/unlink file from full volume -> EN o kern/154828 fs [msdosfs] Unable to create directories on external USB o kern/154491 fs [smbfs] smb_co_lock: recursive lock for object 1 o kern/154447 fs [zfs] [panic] Occasional panics - solaris assert somew p kern/154228 fs [md] md getting stuck in wdrain state o kern/153996 fs [zfs] zfs root mount error while kernel is not located o kern/153847 fs [nfs] [panic] Kernel panic from incorrect m_free in nf o kern/153753 fs [zfs] ZFS v15 - grammatical error when attempting to u o kern/153716 fs [zfs] zpool scrub time remaining is incorrect o kern/153695 fs [patch] [zfs] Booting from zpool created on 4k-sector o kern/153680 fs [xfs] 8.1 failing to mount XFS partitions o kern/153520 fs [zfs] Boot from GPT ZFS root on HP BL460c G1 unstable o kern/153418 fs [zfs] [panic] Kernel Panic occurred writing to zfs vol o kern/153351 fs [zfs] locking directories/files in ZFS o bin/153258 fs [patch][zfs] creating ZVOLs requires `refreservation' s kern/153173 fs [zfs] booting from a gzip-compressed dataset doesn't w o kern/153126 fs [zfs] vdev failure, zpool=peegel type=vdev.too_small p kern/152488 fs [tmpfs] [patch] mtime of file updated when only inode o kern/152022 fs [nfs] nfs service hangs with linux client [regression] o kern/151942 fs [zfs] panic during ls(1) zfs snapshot directory o kern/151905 fs [zfs] page fault under load in /sbin/zfs o kern/151845 fs [smbfs] [patch] smbfs should be upgraded to support Un o bin/151713 fs [patch] Bug in growfs(8) with respect to 32-bit overfl o kern/151648 fs [zfs] disk wait bug o kern/151629 fs [fs] [patch] Skip empty directory entries during name o kern/151330 fs [zfs] will unshare all zfs filesystem after execute a o kern/151326 fs [nfs] nfs exports fail if netgroups contain duplicate o kern/151251 fs [ufs] Can not create files on filesystem with heavy us o kern/151226 fs [zfs] can't delete zfs snapshot o kern/151111 fs [zfs] vnodes leakage during zfs unmount o kern/150503 fs [zfs] ZFS disks are UNAVAIL and corrupted after reboot o kern/150501 fs [zfs] ZFS vdev failure vdev.bad_label on amd64 o kern/150390 fs [zfs] zfs deadlock when arcmsr reports drive faulted o kern/150336 fs [nfs] mountd/nfsd became confused; refused to reload n o kern/150207 fs 
zpool(1): zpool import -d /dev tries to open weird dev o kern/149208 fs mksnap_ffs(8) hang/deadlock o kern/149173 fs [patch] [zfs] make OpenSolaris installa o kern/149015 fs [zfs] [patch] misc fixes for ZFS code to build on Glib o kern/149014 fs [zfs] [patch] declarations in ZFS libraries/utilities o kern/149013 fs [zfs] [patch] make ZFS makefiles use the libraries fro o kern/148504 fs [zfs] ZFS' zpool does not allow replacing drives to be o kern/148490 fs [zfs]: zpool attach - resilver bidirectionally, and re o kern/148368 fs [zfs] ZFS hanging forever on 8.1-PRERELEASE o bin/148296 fs [zfs] [loader] [patch] Very slow probe in /usr/src/sys o kern/148204 fs [nfs] UDP NFS causes overload o kern/148138 fs [zfs] zfs raidz pool commands freeze o kern/147903 fs [zfs] [panic] Kernel panics on faulty zfs device o kern/147881 fs [zfs] [patch] ZFS "sharenfs" doesn't allow different " o kern/147790 fs [zfs] zfs set acl(mode|inherit) fails on existing zfs o kern/147560 fs [zfs] [boot] Booting 8.1-PRERELEASE raidz system take o kern/147420 fs [ufs] [panic] ufs_dirbad, nullfs, jail panic (corrupt o kern/146941 fs [zfs] [panic] Kernel Double Fault - Happens constantly o kern/146786 fs [zfs] zpool import hangs with checksum errors o kern/146708 fs [ufs] [panic] Kernel panic in softdep_disk_write_compl o kern/146528 fs [zfs] Severe memory leak in ZFS on i386 o kern/146502 fs [nfs] FreeBSD 8 NFS Client Connection to Server s kern/145712 fs [zfs] cannot offline two drives in a raidz2 configurat o kern/145411 fs [xfs] [panic] Kernel panics shortly after mounting an o bin/145309 fs bsdlabel: Editing disk label invalidates the whole dev o kern/145272 fs [zfs] [panic] Panic during boot when accessing zfs on o kern/145246 fs [ufs] dirhash in 7.3 gratuitously frees hashes when it o kern/145238 fs [zfs] [panic] kernel panic on zpool clear tank o kern/145229 fs [zfs] Vast differences in ZFS ARC behavior between 8.0 o kern/145189 fs [nfs] nfsd performs abysmally under load o kern/144929 fs [ufs] [lor] vfs_bio.c + ufs_dirhash.c p kern/144447 fs [zfs] sharenfs fsunshare() & fsshare_main() non functi o kern/144416 fs [panic] Kernel panic on online filesystem optimization s kern/144415 fs [zfs] [panic] kernel panics on boot after zfs crash o kern/144234 fs [zfs] Cannot boot machine with recent gptzfsboot code o kern/143825 fs [nfs] [panic] Kernel panic on NFS client o bin/143572 fs [zfs] zpool(1): [patch] The verbose output from iostat o kern/143212 fs [nfs] NFSv4 client strange work ... o kern/143184 fs [zfs] [lor] zfs/bufwait LOR o kern/142914 fs [zfs] ZFS performance degradation over time o kern/142878 fs [zfs] [vfs] lock order reversal o kern/142597 fs [ext2fs] ext2fs does not work on filesystems with real o kern/142489 fs [zfs] [lor] allproc/zfs LOR o kern/142466 fs Update 7.2 -> 8.0 on Raid 1 ends with screwed raid [re o kern/142306 fs [zfs] [panic] ZFS drive (from OSX Leopard) causes two o kern/142068 fs [ufs] BSD labels are got deleted spontaneously o kern/141897 fs [msdosfs] [panic] Kernel panic. 
msdofs: file name leng o kern/141463 fs [nfs] [panic] Frequent kernel panics after upgrade fro o kern/141305 fs [zfs] FreeBSD ZFS+sendfile severe performance issues ( o kern/141091 fs [patch] [nullfs] fix panics with DIAGNOSTIC enabled o kern/141086 fs [nfs] [panic] panic("nfs: bioread, not dir") on FreeBS o kern/141010 fs [zfs] "zfs scrub" fails when backed by files in UFS2 o kern/140888 fs [zfs] boot fail from zfs root while the pool resilveri o kern/140661 fs [zfs] [patch] /boot/loader fails to work on a GPT/ZFS- o kern/140640 fs [zfs] snapshot crash o kern/140068 fs [smbfs] [patch] smbfs does not allow semicolon in file o kern/139725 fs [zfs] zdb(1) dumps core on i386 when examining zpool c o kern/139715 fs [zfs] vfs.numvnodes leak on busy zfs p bin/139651 fs [nfs] mount(8): read-only remount of NFS volume does n o kern/139597 fs [patch] [tmpfs] tmpfs initializes va_gen but doesn't u o kern/139564 fs [zfs] [panic] 8.0-RC1 - Fatal trap 12 at end of shutdo o kern/139407 fs [smbfs] [panic] smb mount causes system crash if remot o kern/138662 fs [panic] ffs_blkfree: freeing free block o kern/138421 fs [ufs] [patch] remove UFS label limitations o kern/138202 fs mount_msdosfs(1) see only 2Gb o kern/136968 fs [ufs] [lor] ufs/bufwait/ufs (open) o kern/136945 fs [ufs] [lor] filedesc structure/ufs (poll) o kern/136944 fs [ffs] [lor] bufwait/snaplk (fsync) o kern/136873 fs [ntfs] Missing directories/files on NTFS volume o kern/136865 fs [nfs] [patch] NFS exports atomic and on-the-fly atomic p kern/136470 fs [nfs] Cannot mount / in read-only, over NFS o kern/135546 fs [zfs] zfs.ko module doesn't ignore zpool.cache filenam o kern/135469 fs [ufs] [panic] kernel crash on md operation in ufs_dirb o kern/135050 fs [zfs] ZFS clears/hides disk errors on reboot o kern/134491 fs [zfs] Hot spares are rather cold... 
o kern/133676 fs [smbfs] [panic] umount -f'ing a vnode-based memory dis o kern/133174 fs [msdosfs] [patch] msdosfs must support multibyte inter o kern/132960 fs [ufs] [panic] panic:ffs_blkfree: freeing free frag o kern/132397 fs reboot causes filesystem corruption (failure to sync b o kern/132331 fs [ufs] [lor] LOR ufs and syncer o kern/132237 fs [msdosfs] msdosfs has problems to read MSDOS Floppy o kern/132145 fs [panic] File System Hard Crashes o kern/131441 fs [unionfs] [nullfs] unionfs and/or nullfs not combineab o kern/131360 fs [nfs] poor scaling behavior of the NFS server under lo o kern/131342 fs [nfs] mounting/unmounting of disks causes NFS to fail o bin/131341 fs makefs: error "Bad file descriptor" on the mount poin o kern/130920 fs [msdosfs] cp(1) takes 100% CPU time while copying file o kern/130210 fs [nullfs] Error by check nullfs f kern/130133 fs [panic] [zfs] 'kmem_map too small' caused by make clea o kern/129760 fs [nfs] after 'umount -f' of a stale NFS share FreeBSD l o kern/129488 fs [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c: o kern/129231 fs [ufs] [patch] New UFS mount (norandom) option - mostly o kern/129152 fs [panic] non-userfriendly panic when trying to mount(8) o kern/127787 fs [lor] [ufs] Three LORs: vfslock/devfs/vfslock, ufs/vfs f kern/127375 fs [zfs] If vm.kmem_size_max>"1073741823" then write spee o bin/127270 fs fsck_msdosfs(8) may crash if BytesPerSec is zero o kern/127029 fs [panic] mount(8): trying to mount a write protected zi f kern/126703 fs [panic] [zfs] _mtx_lock_sleep: recursed on non-recursi o kern/126287 fs [ufs] [panic] Kernel panics while mounting an UFS file o kern/125895 fs [ffs] [panic] kernel: panic: ffs_blkfree: freeing free s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS o kern/123939 fs [msdosfs] corrupts new files f sparc/123566 fs [zfs] zpool import issue: EOVERFLOW o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash o bin/122172 fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386, o bin/121898 fs [nullfs] pwd(1)/getcwd(2) fails with Permission denied o bin/121366 fs [zfs] [patch] Automatic disk scrubbing from periodic(8 o bin/121072 fs [smbfs] mount_smbfs(8) cannot normally convert the cha o kern/120483 fs [ntfs] [patch] NTFS filesystem locking changes o kern/120482 fs [ntfs] [patch] Sync style changes between NetBSD and F f kern/120210 fs [zfs] [panic] reboot after panic: solaris assert: arc_ o kern/118912 fs [2tb] disk sizing/geometry problem with large array o kern/118713 fs [minidump] [patch] Display media size required for a k o bin/118249 fs [ufs] mv(1): moving a directory changes its mtime o kern/118126 fs [nfs] [patch] Poor NFS server write performance o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N o kern/117954 fs [ufs] dirhash on very large directories blocks the mac o bin/117315 fs [smbfs] mount_smbfs(8) and related options can't mount o kern/117314 fs [ntfs] Long-filename only NTFS fs'es cause kernel pani o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on o bin/116980 fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f o conf/116931 fs lack of fsck_cd9660 prevents mounting iso images with o kern/116583 fs [ffs] [hang] System freezes for short time when using o bin/115361 fs [zfs] mount(8) gets into a state where it won't set/un o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala o kern/114676 fs [ufs] snapshot creation panics: 
snapacct_ufs2: bad blo o bin/114468 fs [patch] [request] add -d option to umount(8) to detach o kern/113852 fs [smbfs] smbfs does not properly implement DFS referral o bin/113838 fs [patch] [request] mount(8): add support for relative p o bin/113049 fs [patch] [request] make quot(8) use getopt(3) and show o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b o kern/111843 fs [msdosfs] Long Names of files are incorrectly created o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems s bin/111146 fs [2tb] fsck(8) fails on 6T filesystem o kern/109024 fs [msdosfs] [iconv] mount_msdosfs: msdosfs_iconv: Operat o kern/109010 fs [msdosfs] can't mv directory within fat32 file system o bin/107829 fs [2TB] fdisk(8): invalid boundary checking in fdisk / w o kern/106107 fs [ufs] left-over fsck_snapshot after unfinished backgro o kern/104406 fs [ufs] Processes get stuck in "ufs" state under persist o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems o kern/103035 fs [ntfs] Directories in NTFS mounted disc images appear o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s o kern/99290 fs [ntfs] mount_ntfs ignorant of cluster sizes s bin/97498 fs [request] newfs(8) has no option to clear the first 12 o kern/97377 fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c o kern/95222 fs [cd9660] File sections on ISO9660 level 3 CDs ignored o kern/94849 fs [ufs] rename on UFS filesystem is not atomic o bin/94810 fs fsck(8) incorrectly reports 'file system marked clean' o kern/94769 fs [ufs] Multiple file deletions on multi-snapshotted fil o kern/94733 fs [smbfs] smbfs may cause double unlock o kern/93942 fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D o kern/92272 fs [ffs] [hang] Filling a filesystem while creating a sna o kern/91134 fs [smbfs] [patch] Preserve access and modification time a kern/90815 fs [smbfs] [patch] SMBFS with character conversions somet o kern/88657 fs [smbfs] windows client hang when browsing a samba shar o kern/88555 fs [panic] ffs_blkfree: freeing free frag on AMD 64 o kern/88266 fs [smbfs] smbfs does not implement UIO_NOCOPY and sendfi o bin/87966 fs [patch] newfs(8): introduce -A flag for newfs to enabl o kern/87859 fs [smbfs] System reboot while umount smbfs. o kern/86587 fs [msdosfs] rm -r /PATH fails with lots of small files o bin/85494 fs fsck_ffs: unchecked use of cg_inosused macro etc. 
o kern/80088 fs [smbfs] Incorrect file time setting on NTFS mounted vi o bin/74779 fs Background-fsck checks one filesystem twice and omits o kern/73484 fs [ntfs] Kernel panic when doing `ls` from the client si o bin/73019 fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino o kern/71774 fs [ntfs] NTFS cannot "see" files on a WinXP filesystem o bin/70600 fs fsck(8) throws files away when it can't grow lost+foun o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange o kern/65901 fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr o kern/61503 fs [smbfs] mount_smbfs does not work as non-root o kern/55617 fs [smbfs] Accessing an nsmb-mounted drive via a smb expo o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc o kern/51583 fs [nullfs] [patch] allow to work with devices and socket o kern/36566 fs [smbfs] System reboot with dead smb mount and umount o kern/33464 fs [ufs] soft update inconsistencies after system crash o bin/27687 fs fsck(8) wrapper is not properly passing options to fsc o kern/18874 fs [2TB] 32bit NFS servers export wrong negative values t 234 problems total. From owner-freebsd-fs@FreeBSD.ORG Mon Jul 18 12:00:27 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DDA4C1065686 for ; Mon, 18 Jul 2011 12:00:27 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 21CF78FC13 for ; Mon, 18 Jul 2011 12:00:27 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p6IC0QpO077846 for ; Mon, 18 Jul 2011 12:00:26 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p6IC0QgX077845; Mon, 18 Jul 2011 12:00:26 GMT (envelope-from gnats) Date: Mon, 18 Jul 2011 12:00:26 GMT Message-Id: <201107181200.p6IC0QgX077845@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Martin Matuska Cc: Subject: Re: kern/156933: [zfs] ZFS receive after read on readonly=on filesystem is corrupted without warning X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Martin Matuska List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Jul 2011 12:00:28 -0000 The following reply was made to PR kern/156933; it has been noted by GNATS. From: Martin Matuska To: bug-followup@FreeBSD.org, org_freebsd@L93.com Cc: Subject: Re: kern/156933: [zfs] ZFS receive after read on readonly=on filesystem is corrupted without warning Date: Mon, 18 Jul 2011 13:52:21 +0200 I am unable to reproduce this on ZFS v28. Can you give me detailed instructions and necessary files? 
-- Martin Matuska FreeBSD committer http://blog.vx.sk From owner-freebsd-fs@FreeBSD.ORG Mon Jul 18 12:00:30 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F34C610656FB for ; Mon, 18 Jul 2011 12:00:30 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id C9D868FC1A for ; Mon, 18 Jul 2011 12:00:30 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p6IC0UEv077876 for ; Mon, 18 Jul 2011 12:00:30 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p6IC0UsR077875; Mon, 18 Jul 2011 12:00:30 GMT (envelope-from gnats) Date: Mon, 18 Jul 2011 12:00:30 GMT Message-Id: <201107181200.p6IC0UsR077875@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Martin Matuska Cc: Subject: Re: kern/150503: [zfs] ZFS disks are UNAVAIL and corrupted after reboot X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Martin Matuska List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Jul 2011 12:00:31 -0000 The following reply was made to PR kern/150503; it has been noted by GNATS. From: Martin Matuska To: bug-followup@FreeBSD.org, william.franck@oceasys.net Cc: Subject: Re: kern/150503: [zfs] ZFS disks are UNAVAIL and corrupted after reboot Date: Mon, 18 Jul 2011 13:53:55 +0200 Any news on this issue? Does it also happen with ZFS v28 and latest 8-STABLE / 9-CURRENT? -- Martin Matuska FreeBSD committer http://blog.vx.sk From owner-freebsd-fs@FreeBSD.ORG Mon Jul 18 12:00:34 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 35C46106578D for ; Mon, 18 Jul 2011 12:00:34 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 0C6D28FC16 for ; Mon, 18 Jul 2011 12:00:34 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p6IC0XOc077920 for ; Mon, 18 Jul 2011 12:00:33 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p6IC0XEb077919; Mon, 18 Jul 2011 12:00:33 GMT (envelope-from gnats) Date: Mon, 18 Jul 2011 12:00:33 GMT Message-Id: <201107181200.p6IC0XEb077919@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Martin Matuska Cc: Subject: Re: kern/150501: [zfs] ZFS vdev failure vdev.bad_label on amd64 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Martin Matuska List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Jul 2011 12:00:34 -0000 The following reply was made to PR kern/150501; it has been noted by GNATS. From: Martin Matuska To: bug-followup@FreeBSD.org, william.franck@oceasys.net Cc: Subject: Re: kern/150501: [zfs] ZFS vdev failure vdev.bad_label on amd64 Date: Mon, 18 Jul 2011 13:54:41 +0200 Any news on this issue? Does is still happen with v28 and latest 8-STABLE or 9-CURRENT? 
-- Martin Matuska FreeBSD committer http://blog.vx.sk From owner-freebsd-fs@FreeBSD.ORG Mon Jul 18 12:00:44 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1F80A10656D0 for ; Mon, 18 Jul 2011 12:00:44 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id EA0118FC15 for ; Mon, 18 Jul 2011 12:00:43 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p6IC0h1Z078314 for ; Mon, 18 Jul 2011 12:00:43 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p6IC0hsr078312; Mon, 18 Jul 2011 12:00:43 GMT (envelope-from gnats) Date: Mon, 18 Jul 2011 12:00:43 GMT Message-Id: <201107181200.p6IC0hsr078312@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Martin Matuska Cc: Subject: Re: kern/142914: [zfs] ZFS performance degradation over time X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Martin Matuska List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Jul 2011 12:00:44 -0000 The following reply was made to PR kern/142914; it has been noted by GNATS. From: Martin Matuska To: bug-followup@FreeBSD.org, miks.mikelsons@gmail.com Cc: Subject: Re: kern/142914: [zfs] ZFS performance degradation over time Date: Mon, 18 Jul 2011 13:57:00 +0200 Any news on this PR? Can we close it? Is the problem still present with ZFS v28 and latest 8-STABLE or 9-CURRENT? -- Martin Matuska FreeBSD committer http://blog.vx.sk From owner-freebsd-fs@FreeBSD.ORG Mon Jul 18 19:37:38 2011 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4CE0B1065670 for ; Mon, 18 Jul 2011 19:37:38 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-wy0-f182.google.com (mail-wy0-f182.google.com [74.125.82.182]) by mx1.freebsd.org (Postfix) with ESMTP id C6D638FC0A for ; Mon, 18 Jul 2011 19:37:37 +0000 (UTC) Received: by wyg24 with SMTP id 24so3078980wyg.13 for ; Mon, 18 Jul 2011 12:37:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=2D1RO8lADdd8mMs2DALLtYy+C7cySWVPXZ//1LrFVFA=; b=cCYbZkwXPXsZT7Iwr7dayVKFhvBE6e31iTjiQgc3zeAxMbAjlUccKprVPdh2YMd7Ry JJv1V45ZkweUCbyY3AqMmgL5zo3V0O3quxEq4rKQVEcqfFXj9yqJxIxi7UV4H5EPKMm1 uR0vBiNcS1yarsadsF4MkKTq9/BunIZGL8FIk= MIME-Version: 1.0 Received: by 10.216.235.95 with SMTP id t73mr5777895weq.10.1311016363411; Mon, 18 Jul 2011 12:12:43 -0700 (PDT) Sender: artemb@gmail.com Received: by 10.216.46.18 with HTTP; Mon, 18 Jul 2011 12:12:43 -0700 (PDT) In-Reply-To: <4E2412C2.5000202@ukr.net> References: <4E2412C2.5000202@ukr.net> Date: Mon, 18 Jul 2011 12:12:43 -0700 X-Google-Sender-Auth: mpCXMuqNQH1OX3bz1kFV0NJF7HQ Message-ID: From: Artem Belevich To: "Vladislav V. Prodan" Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: fs@freebsd.org Subject: Re: [ZFS] Prompt, which is lost space in the ZFS pool? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Jul 2011 19:37:38 -0000 On Mon, Jul 18, 2011 at 4:02 AM, Vladislav V. Prodan w= rote: > FreeBSD 8.2-STABLE #0: Tue Jun 28 14:40:44 EEST 2011 amd64 > > # zpool list tank > NAME =A0 SIZE =A0ALLOC =A0 FREE =A0 =A0CAP =A0DEDUP =A0HEALTH =A0ALTROOT > tank =A01,34T =A01,08T =A0 270G =A0 =A080% =A01.00x =A0ONLINE =A0- > > # zdb tank > > Cached configuration: > =A0 =A0 =A0 =A0version: 15 > =A0 =A0 =A0 =A0name: 'tank' > =A0 =A0 =A0 =A0state: 0 > =A0 =A0 =A0 =A0txg: 3933453 > =A0 =A0 =A0 =A0pool_guid: 15415411259239146062 > =A0 =A0 =A0 =A0hostid: 814717323 > =A0 =A0 =A0 =A0hostname: 'second.xxx.com' > =A0 =A0 =A0 =A0vdev_tree: > =A0 =A0 =A0 =A0 =A0 =A0type: 'root' > =A0 =A0 =A0 =A0 =A0 =A0id: 0 > =A0 =A0 =A0 =A0 =A0 =A0guid: 15415411259239146062 > =A0 =A0 =A0 =A0 =A0 =A0children[0]: > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0type: 'mirror' > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0id: 0 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0guid: 16020562126957161505 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0whole_disk: 0 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0metaslab_array: 23 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0metaslab_shift: 33 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0ashift: 9 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0asize: 1483117166592 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0is_log: 0 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0children[0]: > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0type: 'disk' > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0id: 0 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0guid: 11217068100198816386 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0path: '/dev/gpt/disk0' > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0whole_disk: 0 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0DTL: 120 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0children[1]: > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0type: 'disk' > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0id: 1 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0guid: 4665162630340381592 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0path: '/dev/gpt/disk1' > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0whole_disk: 0 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0DTL: 118 > > MOS Configuration: > =A0 =A0 =A0 =A0version: 15 > =A0 =A0 =A0 =A0name: 'tank' > =A0 =A0 =A0 =A0state: 0 > =A0 =A0 =A0 =A0txg: 3933453 > =A0 =A0 =A0 =A0pool_guid: 15415411259239146062 > =A0 =A0 =A0 =A0hostid: 814717323 > =A0 =A0 =A0 =A0hostname: 'second.xxx.com' > =A0 =A0 =A0 =A0vdev_tree: > =A0 =A0 =A0 =A0 =A0 =A0type: 'root' > =A0 =A0 =A0 =A0 =A0 =A0id: 0 > =A0 =A0 =A0 =A0 =A0 =A0guid: 15415411259239146062 > =A0 =A0 =A0 =A0 =A0 =A0children[0]: > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0type: 'mirror' > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0id: 0 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0guid: 16020562126957161505 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0whole_disk: 0 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0metaslab_array: 23 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0metaslab_shift: 33 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0ashift: 9 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0asize: 1483117166592 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0is_log: 0 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0children[0]: > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0type: 'disk' > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0id: 0 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0guid: 11217068100198816386 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0path: '/dev/gpt/disk0' > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0whole_disk: 0 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0DTL: 120 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0children[1]: > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 
=A0type: 'disk' > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0id: 1 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0guid: 4665162630340381592 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0path: '/dev/gpt/disk1' > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0whole_disk: 0 > =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0DTL: 118 > > #df -h > Filesystem =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0Size =A0 =A0Used =A0 Avail = Capacity =A0Mounted on > tank =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0249G =A0 =A0846M =A0 = =A0249G =A0 =A0 0% =A0 =A0/ > devfs =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 1.0k =A0 =A01.0k =A0 = =A0 =A00B =A0 100% =A0 =A0/dev > tank/backup =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 646G =A0 =A0397G =A0 =A0249G = =A0 =A061% =A0 =A0/backup > tank/backup/router =A0 =A0 =A0 =A0 =A0256G =A0 =A07.9G =A0 =A0249G =A0 = =A0 3% =A0 =A0/backup/router > tank/backup/third-server =A0 =A0626G =A0 =A0377G =A0 =A0249G =A0 =A060% /= backup/third-server > tank/home =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 252G =A0 =A03.2G =A0 =A0249= G =A0 =A0 1% =A0 =A0/home > tank/tmp =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0249G =A0 =A0 28M =A0 =A02= 49G =A0 =A0 0% =A0 =A0/tmp > tank/usr =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0251G =A0 =A02.5G =A0 =A02= 49G =A0 =A0 1% =A0 =A0/usr > tank/usr/home =A0 =A0 =A0 =A0 =A0 =A0 =A0 249G =A0 =A0 29k =A0 =A0249G = =A0 =A0 0% =A0 =A0/usr/home > tank/usr/ports =A0 =A0 =A0 =A0 =A0 =A0 =A0250G =A0 =A01.1G =A0 =A0249G = =A0 =A0 0% =A0 =A0/usr/ports > tank/usr/src =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0249G =A0 =A0314M =A0 =A0249G = =A0 =A0 0% =A0 =A0/usr/src > tank/var =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0249G =A0 =A09.6M =A0 =A02= 49G =A0 =A0 0% =A0 =A0/var > tank/var/crash =A0 =A0 =A0 =A0 =A0 =A0 =A0249G =A0 =A0 21k =A0 =A0249G = =A0 =A0 0% =A0 =A0/var/crash > tank/var/db =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 249G =A0 =A0158M =A0 =A0249G = =A0 =A0 0% =A0 =A0/var/db > tank/mysql =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0250G =A0 =A01.1G =A0 =A0249= G =A0 =A0 0% =A0 =A0/var/db/mysql > tank/mysql/ibdata =A0 =A0 =A0 =A0 =A0 252G =A0 =A03.2G =A0 =A0249G =A0 = =A0 1% /var/db/mysql/ibdata > tank/mysql/iblogs =A0 =A0 =A0 =A0 =A0 249G =A0 =A0 10M =A0 =A0249G =A0 = =A0 0% /var/db/mysql/iblogs > tank/var/log =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0249G =A0 =A0258M =A0 =A0249G = =A0 =A0 0% =A0 =A0/var/log > tank/var/mail =A0 =A0 =A0 =A0 =A0 =A0 =A0 249G =A0 =A0734k =A0 =A0249G = =A0 =A0 0% =A0 =A0/var/mail > tank/var/run =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0249G =A0 =A0106k =A0 =A0249G = =A0 =A0 0% =A0 =A0/var/run > tank/var/tmp =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0249G =A0 =A0 14M =A0 =A0249G = =A0 =A0 0% =A0 =A0/var/tmp > tank/www =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0270G =A0 =A0 21G =A0 =A02= 49G =A0 =A0 8% =A0 =A0/www > > zfs list-t snapshot shows the place occupied by no more than 30GB Try "zfs list -r -tall -o space tankl". It should give you detailed space usage for all filesystems and snapshots. > > Where lost about 600GB of free space? ... > tank/backup 646G 397G 249G 61% /backup > tank/backup/third-server 626G 377G 249G 60% /backup/third-ser= ver Perhaps here ^^^^^? --Artem > > -- > Vladislav V. 
Prodan > VVP24-UANIC > +380[67]4584408 > +380[99]4060508 > vlad11@jabber.ru > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@FreeBSD.ORG Mon Jul 18 21:04:01 2011 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D9E1E106564A; Mon, 18 Jul 2011 21:04:01 +0000 (UTC) (envelope-from universite@ukr.net) Received: from otrada.od.ua (universite-1-pt.tunnel.tserv24.sto1.ipv6.he.net [IPv6:2001:470:27:140::2]) by mx1.freebsd.org (Postfix) with ESMTP id 31BC38FC15; Mon, 18 Jul 2011 21:04:00 +0000 (UTC) Received: from [IPv6:2001:470:28:140:3998:7877:6bb6:79a8] ([IPv6:2001:470:28:140:3998:7877:6bb6:79a8]) (authenticated bits=0) by otrada.od.ua (8.14.4/8.14.4) with ESMTP id p6IL3sBl088203; Tue, 19 Jul 2011 00:03:54 +0300 (EEST) (envelope-from universite@ukr.net) Message-ID: <4E249FAF.4050500@ukr.net> Date: Tue, 19 Jul 2011 00:03:43 +0300 From: "Vladislav V. Prodan" User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; ru; rv:1.9.2.18) Gecko/20110616 Thunderbird/3.1.11 MIME-Version: 1.0 To: Artem Belevich References: <4E2412C2.5000202@ukr.net> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-95.5 required=5.0 tests=FREEMAIL_FROM,FSL_RU_URL, RDNS_NONE,SPF_SOFTFAIL,USER_IN_WHITELIST autolearn=no version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mary-teresa.otrada.od.ua X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (otrada.od.ua [IPv6:2001:470:28:140::5]); Tue, 19 Jul 2011 00:03:59 +0300 (EEST) Cc: fs@freebsd.org Subject: Re: [ZFS] Prompt, which is lost space in the ZFS pool? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Jul 2011 21:04:01 -0000 18.07.2011 22:12, Artem Belevich writes: > # zpool list tank >> NAME SIZE ALLOC FREE CAP DEDUP HEALTH ALTROOT >> tank 1,34T 1,08T 270G 80% 1.00x ONLINE - >> > tank/backup 646G 397G 249G 61% /backup >> > tank/backup/third-server 626G 377G 249G 60% /backup/third-server > Perhaps here ^^^^^? All 1340GB /backup 397GB /backup/third-server 377GB /www 21GB zfs snapshots 45GB free 249GB Sorry, I did not calculate that properly before. But the question still stands: where did the remaining 250GB go? It is also unclear how much free space there actually is: # zpool list tank says 270GB # df -h says 249GB free -- Vladislav V.
Prodan VVP24-UANIC +380[67]4584408 +380[99]4060508 vlad11@jabber.ru From owner-freebsd-fs@FreeBSD.ORG Mon Jul 18 21:29:04 2011 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 14A5C106564A for ; Mon, 18 Jul 2011 21:29:04 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-wy0-f182.google.com (mail-wy0-f182.google.com [74.125.82.182]) by mx1.freebsd.org (Postfix) with ESMTP id A26518FC08 for ; Mon, 18 Jul 2011 21:29:03 +0000 (UTC) Received: by wyg24 with SMTP id 24so3157289wyg.13 for ; Mon, 18 Jul 2011 14:29:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=haNDSiUY30SIHkf/45WFjKczW02nRMmYnNV2tw/HvFs=; b=DHZaeR3JXGknSGeaNX28nsnNi3tI1eR3RVQ1JerQDpxevsU1eOlXt8l8tq9yZRBIp1 aZ/WPmXHACPWz47Hl4fwuc3kRVnCv0mmysrTxwlaymzzkKw9J+7XaIrfb2UaDyrRceYw tCK34llq0UrdgKx3r9c8Yf1VfI8hNQfca9xpU= MIME-Version: 1.0 Received: by 10.217.6.79 with SMTP id x57mr50981wes.10.1311024542497; Mon, 18 Jul 2011 14:29:02 -0700 (PDT) Sender: artemb@gmail.com Received: by 10.216.46.18 with HTTP; Mon, 18 Jul 2011 14:29:02 -0700 (PDT) In-Reply-To: <4E249FAF.4050500@ukr.net> References: <4E2412C2.5000202@ukr.net> <4E249FAF.4050500@ukr.net> Date: Mon, 18 Jul 2011 14:29:02 -0700 X-Google-Sender-Auth: 12cByfHh-v82-MlnQIKGzF_QFdg Message-ID: From: Artem Belevich To: "Vladislav V. Prodan" Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: fs@freebsd.org Subject: Re: [ZFS] Prompt, which is lost space in the ZFS pool? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Jul 2011 21:29:04 -0000 On Mon, Jul 18, 2011 at 2:03 PM, Vladislav V. Prodan w= rote: > 18.07.2011 22:12, Artem Belevich =D0=BF=D0=B8=D1=88=D0=B5=D1=82: >> >> # zpool list tank >>> >>> =C2=A0NAME =C2=A0 SIZE =C2=A0ALLOC =C2=A0 FREE =C2=A0 =C2=A0CAP =C2=A0D= EDUP =C2=A0HEALTH =C2=A0ALTROOT >>> =C2=A0tank =C2=A01,34T =C2=A01,08T =C2=A0 270G =C2=A0 =C2=A080% =C2=A01= .00x =C2=A0ONLINE =C2=A0- > >>> > =C2=A0tank/backup =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 646G =C2=A0 =C2=A0397G =C2=A0 =C2=A0249G =C2=A0 =C2=A061% =C2=A0 =C2= =A0/backup >>> > =C2=A0tank/backup/third-server =C2=A0 =C2=A0626G =C2=A0 =C2=A0377G = =C2=A0 =C2=A0249G =C2=A0 =C2=A060% >>> > /backup/third-server >> >> Perhaps here ^^^^^? > > All =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= 1340GB > /backup =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 397GB > /backup/third-server =C2=A0 =C2=A0377GB > /www =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A021GB > zfs snaphost =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A045GB > free =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0249GB > > Sorry, did not properly calculated. > But the question is relevant. Where are the 250GB? You've started with 600GB missing. Now it's 250GB. Looks like we've just found 350GB. :-) What does "zfs list -r -tall -o space tank" show? df does not give you complete picture. > > It is also unclear, as is the free space. > # zpool list tank says 270GB zpool gives you raw numbers. 
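(As an aside, the suggested command appears to run the -t flag and its argument together; the intended invocation is presumably the following, using the poster's pool name:

  zfs list -r -t all -o space tank

The "space" column set breaks each dataset's USED figure down into the usedbysnapshots, usedbydataset, usedbychildren and usedbyrefreservation components, which is usually where space that df cannot account for turns up.)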
That's what filesystem layer uses to store its= data > # df -h says 249GB free "zfs list" or df would give you amount of usable disk space that filesystem would make available to user. My guess is that this number would be more relevant in most cases than the number zpool gives you. --Artem > > > -- > Vladislav V. Prodan > VVP24-UANIC > +380[67]4584408 > +380[99]4060508 > vlad11@jabber.ru > From owner-freebsd-fs@FreeBSD.ORG Mon Jul 18 21:37:33 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 865F91065670; Mon, 18 Jul 2011 21:37:33 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 5EB458FC0A; Mon, 18 Jul 2011 21:37:33 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p6ILbXfH030974; Mon, 18 Jul 2011 21:37:33 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p6ILbXPd030970; Mon, 18 Jul 2011 21:37:33 GMT (envelope-from linimon) Date: Mon, 18 Jul 2011 21:37:33 GMT Message-Id: <201107182137.p6ILbXPd030970@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/159010: [zfs] zfsv28 serves bad device nodes via nfs on FreeBSD 8.2 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Jul 2011 21:37:33 -0000 Old Synopsis: zfsv28 serves bad device nodes via nfs on FreeBSD 8.2 New Synopsis: [zfs] zfsv28 serves bad device nodes via nfs on FreeBSD 8.2 Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Mon Jul 18 21:37:15 UTC 2011 Responsible-Changed-Why: Over to maintainer(s). http://www.freebsd.org/cgi/query-pr.cgi?pr=159010 From owner-freebsd-fs@FreeBSD.ORG Mon Jul 18 22:50:18 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 81F67106566B for ; Mon, 18 Jul 2011 22:50:18 +0000 (UTC) (envelope-from mike@votesmart.org) Received: from mail.votesmart.org (smokey.vote-smart.org [12.32.42.180]) by mx1.freebsd.org (Postfix) with SMTP id 11A1C8FC16 for ; Mon, 18 Jul 2011 22:50:16 +0000 (UTC) Received: (qmail 65197 invoked by uid 98); 18 Jul 2011 16:23:35 -0600 Received: from 192.168.255.27 (mike@192.168.255.27) by mallo.votesmart.org (envelope-from , uid 82) with qmail-scanner-2.01 (clamdscan: 0.97/13111. spamassassin: 3.3.1. Clear:RC:1(192.168.255.27):. Processed in 0.029277 secs); 18 Jul 2011 22:23:35 -0000 Received: from unknown (HELO ?192.168.255.27?) 
(mike@192.168.255.27) by mail.votesmart.org with SMTP; 18 Jul 2011 16:23:35 -0600 Message-ID: <4E24B266.9050108@votesmart.org> Date: Mon, 18 Jul 2011 16:23:34 -0600 From: Mike Shultz Organization: Project Vote Smart User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.18) Gecko/20110621 Thunderbird/3.1.11 MIME-Version: 1.0 To: freebsd-fs@freebsd.org X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: Clinton Adams Subject: nfsd server cache flooded, try to increase nfsrc_floodlevel X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Jul 2011 22:50:18 -0000 I ran into an issue today of our server thinking that it was being flooded and locking our nfs users out. Got a LOT of these messages: Jul 12 16:08:22 xxxxx kernel: nfsd server cache flooded, try to increase nfsrc_floodlevel Our server(`uname -a`): FreeBSD xxxxx 8.2-RELEASE-p2 FreeBSD 8.2-RELEASE-p2 #0: Tue Jun 21 16:52:27 MDT 2011 yyy@xxxxx:/usr/obj/usr/src/sys/XXXXX amd64 I could find no information on nfsrc_floodlevel other than source code which didn't explain too much about it. I don't know if it's a kernel config var, or what. `nfsstat -e` did show this: CacheSize TCPPeak 16385 16385 So I'm guessing that that is the current cache limit. The source code and this output suggest that we're just running into the limit. However, a comment in that source does suggest that "The cache will still function over flood level" but that doesn't seem to be the case. I ended up having to revoke the clients and restarting nfsd to get it operational again. I would appreciate anyone that could clarify what nfsrc_floodlevel is and how to go about changing it. -- Mike Shultz Information Technology Assistant Project Vote Smart Phone: 406-859-8683 Toll Free: 1-888-VOTE-SMART Jabber/Gtalk: shultzm@gmail.com Key Server: pgp.mit.edu From owner-freebsd-fs@FreeBSD.ORG Tue Jul 19 01:50:10 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3F06E106566B for ; Tue, 19 Jul 2011 01:50:10 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 243E58FC22 for ; Tue, 19 Jul 2011 01:50:10 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p6J1oA7a061440 for ; Tue, 19 Jul 2011 01:50:10 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p6J1o9Nc061439; Tue, 19 Jul 2011 01:50:09 GMT (envelope-from gnats) Date: Tue, 19 Jul 2011 01:50:09 GMT Message-Id: <201107190150.p6J1o9Nc061439@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Xin LI Cc: Subject: Re: kern/159010: [zfs] zfsv28 serves bad device nodes via nfs on FreeBSD 8.2 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Xin LI List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Jul 2011 01:50:10 -0000 The following reply was made to PR kern/159010; it has been noted by GNATS. 
From: Xin LI To: bug-followup@FreeBSD.org, gerrit.kuehn@aei.mpg.de Cc: Pawel Jakub Dawidek , Martin Matuska Subject: Re: kern/159010: [zfs] zfsv28 serves bad device nodes via nfs on FreeBSD 8.2 Date: Mon, 18 Jul 2011 18:43:10 -0700 This is a multi-part message in MIME format. --------------080303040908050700010603 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA256 Hi, Would you please try if the attached patch fixes the problem? Cheers, - -- Xin LI https://www.delphij.net/ FreeBSD - The Power to Serve! Live free or die -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (FreeBSD) iQEcBAEBCAAGBQJOJOEuAAoJEATO+BI/yjfB1jwH/3qM8a6z7mAcV4kDOT9Y02zb Z7ETklaUY47HeLaYYd/Rf9xfqHufJ3Uh8XZRKYN2VFDHxSEoDDfKWqLm3RNzXISn UHZFcZwuW2Cxj7s3PVAYx6a/3jcTuT+0gxyLh+u3bSCnH5Y/6gqrNY7czRXDb7Nq 4oatwM8cE1wvMgTFVfKgloA3yFld9B2ppCLBez3kMtf8moR61eBgTb5mdXQj4Gc+ 221MPTMMI0DmbWID8e5dJbMALlZa5Y6UnkBJFAZVkSMnQ6subzHXLHelJIecyJUP U4otUuzItXIA1mBRTjqQ6Rh5YXOKalQvkUb4Cn+S+w6QFOY3zsvm+Cp+FqgPh/o= =0RK6 -----END PGP SIGNATURE----- --------------080303040908050700010603 Content-Type: text/plain; name="zfs-dev.diff" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="zfs-dev.diff" SW5kZXg6IHN5cy9jZGRsL2NvbnRyaWIvb3BlbnNvbGFyaXMvdXRzL2NvbW1vbi9mcy96ZnMv emZzX3pub2RlLmMKPT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PQotLS0gc3lzL2NkZGwvY29udHJpYi9vcGVuc29s YXJpcy91dHMvY29tbW9uL2ZzL3pmcy96ZnNfem5vZGUuYwkocmV2aXNpb24gMjI0MTc0KQor Kysgc3lzL2NkZGwvY29udHJpYi9vcGVuc29sYXJpcy91dHMvY29tbW9uL2ZzL3pmcy96ZnNf em5vZGUuYwkod29ya2luZyBjb3B5KQpAQCAtNzAwLDYgKzcwMCwxNyBAQCB6ZnNfem5vZGVf YWxsb2MoemZzdmZzX3QgKnpmc3ZmcywgZG11X2J1Zl90ICpkYiwgaQogCWNhc2UgVkRJUjoK IAkJenAtPnpfem5fcHJlZmV0Y2ggPSBCX1RSVUU7IC8qIHpfcHJlZmV0Y2ggZGVmYXVsdCBp cyBlbmFibGVkICovCiAJCWJyZWFrOworCWNhc2UgVkJMSzoKKwljYXNlIFZDSFI6CisJCXsK KwkJCXVpbnQ2NF90IHJkZXY7CisJCQlWRVJJRlkoc2FfbG9va3VwKHpwLT56X3NhX2hkbCwg U0FfWlBMX1JERVYoemZzdmZzKSwKKwkJCSAgICAmcmRldiwgc2l6ZW9mIChyZGV2KSkgPT0g MCk7CisKKwkJCXpwLT56X3JkZXYgPSB6ZnNfY21wbGRldihyZGV2KTsKKwkJfQorLy8JCXZw LT52X29wID0gJnpmc19mdm5vZGVvcHM7CisJCWJyZWFrOwogCWNhc2UgVkZJRk86CiAJCXZw LT52X29wID0gJnpmc19maWZvb3BzOwogCQlicmVhazsKSW5kZXg6IHN5cy9jZGRsL2NvbnRy aWIvb3BlbnNvbGFyaXMvdXRzL2NvbW1vbi9mcy96ZnMvemZzX3Zub3BzLmMKPT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PQotLS0gc3lzL2NkZGwvY29udHJpYi9vcGVuc29sYXJpcy91dHMvY29tbW9uL2ZzL3pm cy96ZnNfdm5vcHMuYwkocmV2aXNpb24gMjI0MTc0KQorKysgc3lzL2NkZGwvY29udHJpYi9v cGVuc29sYXJpcy91dHMvY29tbW9uL2ZzL3pmcy96ZnNfdm5vcHMuYwkod29ya2luZyBjb3B5 KQpAQCAtMjY5NCw3ICsyNjk0LDcgQEAgemZzX2dldGF0dHIodm5vZGVfdCAqdnAsIHZhdHRy X3QgKnZhcCwgaW50IGZsYWdzLAogCXZhcC0+dmFfbmxpbmsgPSBNSU4obGlua3MsIFVJTlQz Ml9NQVgpOwkvKiBubGlua190IGxpbWl0ISAqLwogCXZhcC0+dmFfc2l6ZSA9IHpwLT56X3Np emU7CiAJdmFwLT52YV9mc2lkID0gdnAtPnZfbW91bnQtPm1udF9zdGF0LmZfZnNpZC52YWxb MF07Ci0vLwl2YXAtPnZhX3JkZXYgPSB6ZnNfY21wbGRldihwenAtPnpwX3JkZXYpOworCXZh cC0+dmFfcmRldiA9IHpwLT56X3JkZXY7CiAJdmFwLT52YV9zZXEgPSB6cC0+el9zZXE7CiAJ dmFwLT52YV9mbGFncyA9IDA7CS8qIEZyZWVCU0Q6IFJlc2V0IGNoZmxhZ3MoMikgZmxhZ3Mu ICovCiAKSW5kZXg6IHN5cy9jZGRsL2NvbnRyaWIvb3BlbnNvbGFyaXMvdXRzL2NvbW1vbi9m cy96ZnMvc3lzL3pmc196bm9kZS5oCj09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09 PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT09PT0KLS0tIHN5cy9jZGRsL2NvbnRy aWIvb3BlbnNvbGFyaXMvdXRzL2NvbW1vbi9mcy96ZnMvc3lzL3pmc196bm9kZS5oCShyZXZp c2lvbiAyMjQxNzQpCisrKyBzeXMvY2RkbC9jb250cmliL29wZW5zb2xhcmlzL3V0cy9jb21t 
b24vZnMvemZzL3N5cy96ZnNfem5vZGUuaAkod29ya2luZyBjb3B5KQpAQCAtMjA5LDYgKzIw OSw3IEBAIHR5cGVkZWYgc3RydWN0IHpub2RlIHsKIAlib29sZWFuX3QJel9pc19zYTsJLyog YXJlIHdlIG5hdGl2ZSBzYT8gKi8KIAkvKiBGcmVlQlNELXNwZWNpZmljIGZpZWxkLiAqLwog CXN0cnVjdCB0YXNrCXpfdGFzazsKKwlkZXZfdAkJel9yZGV2OwogfSB6bm9kZV90OwogCiAK --------------080303040908050700010603-- From owner-freebsd-fs@FreeBSD.ORG Tue Jul 19 10:20:52 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7563D1065676 for ; Tue, 19 Jul 2011 10:20:52 +0000 (UTC) (envelope-from ady@ady.ro) Received: from mail-ey0-f176.google.com (mail-ey0-f176.google.com [209.85.215.176]) by mx1.freebsd.org (Postfix) with ESMTP id 152078FC1E for ; Tue, 19 Jul 2011 10:20:51 +0000 (UTC) Received: by eya28 with SMTP id 28so3241468eya.21 for ; Tue, 19 Jul 2011 03:20:51 -0700 (PDT) Received: by 10.14.127.194 with SMTP id d42mr2612651eei.161.1311070850248; Tue, 19 Jul 2011 03:20:50 -0700 (PDT) MIME-Version: 1.0 Sender: ady@ady.ro Received: by 10.14.28.16 with HTTP; Tue, 19 Jul 2011 03:20:30 -0700 (PDT) In-Reply-To: <20110617034547.GA97087@icarus.home.lan> References: <4DFAB27B.7030402@jlauser.net> <20110617034547.GA97087@icarus.home.lan> From: Adrian Penisoara Date: Tue, 19 Jul 2011 12:20:30 +0200 X-Google-Sender-Auth: -JiNwFJLF8Q9kTbkqPdcU1sbVgs Message-ID: To: Jeremy Chadwick Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: Another zfs sharenfs issue X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Jul 2011 10:20:52 -0000 Hi, Sorry for coming back to an old thread, but I just wanted to underline the idea below. On Fri, Jun 17, 2011 at 5:45 AM, Jeremy Chadwick wrote: > On Thu, Jun 16, 2011 at 09:48:43PM -0400, James L. Lauser wrote: [...] >> Any insight would be appreciated, though seeing as how I only >> normally reboot the server about 4 times per year, this isn't >> exactly a very high priority issue. > > On our FreeBSD (RELENG_8-based) NFS filer for our local network, we > never bothered with the "sharenfs" attribute of the filesystems because, > simply put, it didn't seem to work reliably. =A0We use /etc/exports > natively and everything Just Works(tm). =A0We've had literally zero > problems over the years with this method, and have rebooted the filer > numerous times without any repercussions on the client side. > > Given that this is the 2nd "sharenfs is wonky" thread in the past few > hours, I'm left wondering why people bother with it and don't just use > /etc/exports. Think about operational maintenance -- e.g. modifying or destroying a NFS-shared ZFS dataset would immediately adjust sharing for it and if you have a lot of datasets or do a lot of operations like these it saves time not to mangle with /etc/exports every time. It's more elegant this way. Same things should apply for the sharesmb property. And I'm pretty sure the (Open)Solaris folks thought of other usage scenarios. 
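As a rough illustration of that workflow (the dataset name and export options below are made up, not taken from the thread), the share simply follows the dataset itself:

  # share the dataset over NFS with ordinary exports(5)-style options
  zfs set sharenfs="-maproot=root -network 192.168.1.0 -mask 255.255.255.0" tank/export
  # see what is currently shared
  zfs get -r sharenfs tank
  # stop sharing it again
  zfs set sharenfs=off tank/export

On FreeBSD the property value is, as far as I know, handed to mountd(8) much like a line in /etc/exports, so the usual export options should work, but that is worth verifying on the release in question.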
Regards, Adrian Penisoara EnterpriseBSD.com From owner-freebsd-fs@FreeBSD.ORG Tue Jul 19 12:29:00 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CE26D106566B; Tue, 19 Jul 2011 12:29:00 +0000 (UTC) (envelope-from D.Forsyth@ru.ac.za) Received: from d.mail.ru.ac.za (d.mail.ru.ac.za [IPv6:2001:4200:1010::25:4]) by mx1.freebsd.org (Postfix) with ESMTP id DE4518FC1A; Tue, 19 Jul 2011 12:28:59 +0000 (UTC) Received: from iwr.ru.ac.za ([146.231.64.249]:59788) by d.mail.ru.ac.za with esmtp (Exim 4.75 (FreeBSD)) (envelope-from ) id 1Qj9Q1-000GDB-Hn; Tue, 19 Jul 2011 14:28:57 +0200 Received: from iwdf-5.iwr.ru.ac.za ([146.231.64.28]) by iwr.ru.ac.za with esmtp (Exim 4.76 (FreeBSD)) (envelope-from ) id 1Qj9Q1-000BBB-G1; Tue, 19 Jul 2011 14:28:57 +0200 From: "DA Forsyth" Organization: IWR To: freebsd-fs@freebsd.org Date: Tue, 19 Jul 2011 14:28:57 +0200 MIME-Version: 1.0 Message-ID: <4E257889.12343.39F99213@d.forsyth.ru.ac.za> Priority: normal X-mailer: Pegasus Mail for Windows (4.52) Content-type: text/plain; charset=US-ASCII Content-transfer-encoding: 7BIT Content-description: Mail message body X-Virus-Scanned: d.mail.ru.ac.za (146.231.129.36) Cc: freebsd-questions@freebsd.org Subject: How to fix bad superblock on UFS2? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: d.forsyth@ru.ac.za List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Jul 2011 12:29:00 -0000 Hi all I had a drive let out smoke, it was part of a 4 drive RAID5 array on an intel Matrix motherboard controller (device ar). Having fought various battles just to get the machine to boot again (had to upgrade to 7.4 from 7.2 to do it because of a panic in ataraid.c) I now have some partitions reporting superblock problems. Havign googled around this topic for some hours now, and having tried copying one or more of the backup superblocks to the primary and secondary (at block 160), I still get.... ** /dev/ar0s1f BAD SUPER BLOCK: VALUES IN SUPER BLOCK DISAGREE WITH THOSE IN FIRST ALTERNATE it then asks if it must look for alternates but claims 32 is not one and stops. All my partitions are UFS2 so why doesn't it look at block 160, which 'newfs -N' finds correctly as the next superblock copy? So, how do I fix this? Also, why does fsck_ufs prompt to update the primary superblock when you give it an alternate with '-b xx', and then not do it? I have now tried booting from the FreeBSD 8.2 live CD in the hopes that the most recent fsck will actually fix this, but it does not. One thing I found in my web searching is that there is confusion over block sizes. 'newfs -N' appears to report sector counts as block addresses. Doing dd if=/dev/ar0s1fbs=512 skip=160 count=16 | hd -v | grep "54 19" bears this out as the output does indeed contain the correct magic number. 'fsck_ufs -b 160 /dev...' also works as expected, but then you try 'fsck /dev/...' and it will report the bad superblock, and then fail to find any backup superblocks, which newfs managed just fine, and this might be because the disk thinks blocks are 16384 in size. 
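Spelled out (same device, 512-byte sectors, superblock copy at sector 160 as reported by newfs), the checks I am relying on are:

# list the backup superblock locations without writing anything to the disk
newfs -N /dev/ar0s1f
# the UFS2 magic number really is present at sector 160
dd if=/dev/ar0s1f bs=512 skip=160 count=16 | hd -v | grep "54 19"
# and fsck accepts that copy when pointed at it explicitly
fsck_ufs -b 160 /dev/ar0s1f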
Thanks -- DA Fo rsyth Network Supervisor Principal Technical Officer -- Institute for Water Research http://www.ru.ac.za/institutes/iwr/ From owner-freebsd-fs@FreeBSD.ORG Tue Jul 19 13:15:28 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C3DD3106564A for ; Tue, 19 Jul 2011 13:15:28 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 81BBC8FC08 for ; Tue, 19 Jul 2011 13:15:28 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ap4EACuDJU6DaFvO/2dsb2JhbABUG4Qvo3iIfLE9kSqBK4QCgQ8EkmeILYhJ X-IronPort-AV: E=Sophos;i="4.67,228,1309752000"; d="scan'208";a="127719159" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-annu-pri.mail.uoguelph.ca with ESMTP; 19 Jul 2011 09:15:27 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 9D6C9B3F85; Tue, 19 Jul 2011 09:15:27 -0400 (EDT) Date: Tue, 19 Jul 2011 09:15:27 -0400 (EDT) From: Rick Macklem To: Mike Shultz Message-ID: <752938116.734332.1311081327632.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <4E24B266.9050108@votesmart.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: freebsd-fs@freebsd.org, Clinton Adams Subject: Re: nfsd server cache flooded, try to increase nfsrc_floodlevel X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Jul 2011 13:15:28 -0000 Mike Shultz wrote: > I ran into an issue today of our server thinking that it was being > flooded and locking our nfs users out. Got a LOT of these messages: > > Jul 12 16:08:22 xxxxx kernel: nfsd server cache flooded, try to > increase > nfsrc_floodlevel > > Our server(`uname -a`): FreeBSD xxxxx 8.2-RELEASE-p2 FreeBSD > 8.2-RELEASE-p2 #0: Tue Jun 21 16:52:27 MDT 2011 > yyy@xxxxx:/usr/obj/usr/src/sys/XXXXX amd64 > > I could find no information on nfsrc_floodlevel other than source code > which didn't explain too much about it. I don't know if it's a kernel > config var, or what. > > `nfsstat -e` did show this: > > CacheSize TCPPeak > 16385 16385 > > So I'm guessing that that is the current cache limit. > > The source code and this output suggest that we're just running into > the > limit. However, a comment in that source does suggest that "The cache > will still function over flood level" but that doesn't seem to be the > case. I ended up having to revoke the clients and restarting nfsd to > get > it operational again. > Since you were seeing the messages "...try increasing flood level" it means that at least some of your client(s) are using NFSv4. For NFSv4, the client will get NFS4ERR_RESOURCE back as a reply at this point. My guess is that the client(s) just kept sending retries of the RPCs and, since the cache size didn't decrease, just kept getting NFS4ERR_RESOURCE. The real question becomes "how did it hit the flood level?". 
Hmm, there was a recent SMP related cache problem that is fixed by this patch: http://people.freebsd.org/~rmacklem/cache.patch I'd suggest you try this patch and see if the problem occurs again. It seems unlikely you would hit the flood level unless there is a bug (or very weird client behaviour), but it's conceivable. You can increase it by editting sys/fs/nfs/nfs.h and increasing the value, then rebuilding a kernel/modules. Also, what client(s) are mounting the server and how many/how busy are they? Hopefully the SMP patch will fix this for you, although it's hard to predict what behaviour could be observed without the patch. (Btw, the patch is in head and stable/8, but not releng 8.2.) I only have single core hardware for testing, so I'd never see these kinds of bugs myself. > I would appreciate anyone that could clarify what nfsrc_floodlevel is > and how to go about changing it. > This is mostly a "sanity check" and it's hard to imagine hitting the limit of 16K. To hit this without some sort of bug or client/server interoperability issue would take something like 4000 TCP mounts against the server. However, the only issue with respect to increasing it is running out of mbufs. You can try increasing it, as above, and then use `nfssstat -e -s` to monitor how it grows. If it keeps growing, then there is definitely a bug or interoperability problem. If it just seems to peak at some level, I`d like to hear what that level is and what kind of load the server would have at that time. The only other thing that I can think of that might result in hitting the limit is a TCP stack which allows a large amount of unACKed data, since a cache entry is discarded when the client's TCP layer ACKs past the TCP seq# for the reply sent to the client.) Good luck with it and please let me know how it goes, rick From owner-freebsd-fs@FreeBSD.ORG Tue Jul 19 14:29:29 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AAA81106566C for ; Tue, 19 Jul 2011 14:29:29 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 6ABD78FC14 for ; Tue, 19 Jul 2011 14:29:29 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ap4EAJSTJU6DaFvO/2dsb2JhbABUG4Qvo3m7H5EygSuEAoEPBJJniC2ISQ X-IronPort-AV: E=Sophos;i="4.67,228,1309752000"; d="scan'208";a="127732162" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-annu-pri.mail.uoguelph.ca with ESMTP; 19 Jul 2011 10:29:28 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 88D1CB3F95; Tue, 19 Jul 2011 10:29:28 -0400 (EDT) Date: Tue, 19 Jul 2011 10:29:28 -0400 (EDT) From: Rick Macklem To: Mike Shultz Message-ID: <2043464565.740270.1311085768541.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <752938116.734332.1311081327632.JavaMail.root@erie.cs.uoguelph.ca> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: freebsd-fs@freebsd.org, Clinton Adams Subject: Re: nfsd server cache flooded, try to increase nfsrc_floodlevel X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: 
List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Jul 2011 14:29:29 -0000 replying to my own message: > Mike Shultz wrote: > > I ran into an issue today of our server thinking that it was being > > flooded and locking our nfs users out. Got a LOT of these messages: > > > > Jul 12 16:08:22 xxxxx kernel: nfsd server cache flooded, try to > > increase > > nfsrc_floodlevel > > > > Our server(`uname -a`): FreeBSD xxxxx 8.2-RELEASE-p2 FreeBSD > > 8.2-RELEASE-p2 #0: Tue Jun 21 16:52:27 MDT 2011 > > yyy@xxxxx:/usr/obj/usr/src/sys/XXXXX amd64 > > > > I could find no information on nfsrc_floodlevel other than source > > code > > which didn't explain too much about it. I don't know if it's a > > kernel > > config var, or what. > > > > `nfsstat -e` did show this: > > > > CacheSize TCPPeak > > 16385 16385 > > > > So I'm guessing that that is the current cache limit. > > > > The source code and this output suggest that we're just running into > > the > > limit. However, a comment in that source does suggest that "The > > cache > > will still function over flood level" but that doesn't seem to be > > the > > case. I ended up having to revoke the clients and restarting nfsd to > > get > > it operational again. > > > Since you were seeing the messages "...try increasing flood level" it > means > that at least some of your client(s) are using NFSv4. For NFSv4, the > client > will get NFS4ERR_RESOURCE back as a reply at this point. My guess is > that > the client(s) just kept sending retries of the RPCs and, since the > cache size > didn't decrease, just kept getting NFS4ERR_RESOURCE. > > The real question becomes "how did it hit the flood level?". > > Hmm, there was a recent SMP related cache problem that is fixed by > this patch: > http://people.freebsd.org/~rmacklem/cache.patch > > I'd suggest you try this patch and see if the problem occurs again. > > It seems unlikely you would hit the flood level unless there is a bug > (or very > weird client behaviour), but it's conceivable. > > You can increase it by editting sys/fs/nfs/nfs.h and increasing the > value, then > rebuilding a kernel/modules. > > Also, what client(s) are mounting the server and how many/how busy are > they? > > Hopefully the SMP patch will fix this for you, although it's hard to > predict > what behaviour could be observed without the patch. (Btw, the patch is > in head > and stable/8, but not releng 8.2.) I only have single core hardware > for testing, > so I'd never see these kinds of bugs myself. > > > I would appreciate anyone that could clarify what nfsrc_floodlevel > > is > > and how to go about changing it. > > > This is mostly a "sanity check" and it's hard to imagine hitting the > limit > of 16K. To hit this without some sort of bug or client/server > interoperability > issue would take something like 4000 TCP mounts against the server. Oops, my error here. For NFSv4 (I was thinking NFSv3 over TCP mounts), the cache can grow much larger per mount, because it needs to hold onto the last reply for each open_owner (this depends upon the client, but if you think of an open_owner as a process that opened at least one file on the server, that would be the FreeBSD NFSv4 client). Do an "nfsstat -e -s" and compare the "Cachesize" with the "OpenOwner" count. If they are about the same, I suspect you may be hitting the limit because of this and you'll need to bump it up. The OpenOwners don't go away for a while (that's a whole other topic), so the cached last replies are stuck for a while, too. 
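In other words, something along these lines on the server (the new value below is only an example):

# compare the reply cache size with the number of open owners
nfsstat -e -s | grep -A 1 OpenOwner
nfsstat -e -s | grep -A 1 CacheSize
# if they track each other, bump the limit in sys/fs/nfs/nfs.h, e.g.
#   #define NFSRVCACHE_FLOODLEVEL 32768   (the stock value is 16K)
# and then rebuild the kernel/modules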
In summary: Since you are using NFSv4 mounts, you could hit the 16K flood level with far fewer mounts than the 4000 mentioned above. I can see 200+ cached entries for each NFSv4 mount for some situations. (For NFSv3 over TCP, I guessed 4, so you can see how big the difference is.) A patch to add a sysctl to change the flood level would be easy to add, if it becomes apparent that bumping this up is something you need to do routinely. (Changing it "on the fly" should be safe, but you'd need to use a debugger to do that now.) rick ps: If it just needs increasing then "Congratulations, you are the first site I know of to use NFSv4 a significant amount".:-) From owner-freebsd-fs@FreeBSD.ORG Tue Jul 19 14:40:11 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A18C1106566B for ; Tue, 19 Jul 2011 14:40:11 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 904A88FC0C for ; Tue, 19 Jul 2011 14:40:11 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p6JEeBtl008216 for ; Tue, 19 Jul 2011 14:40:11 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p6JEeBJR008215; Tue, 19 Jul 2011 14:40:11 GMT (envelope-from gnats) Date: Tue, 19 Jul 2011 14:40:11 GMT Message-Id: <201107191440.p6JEeBJR008215@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Gerrit =?ISO-8859-1?Q?K=FChn?= Cc: Subject: Re: kern/159010: [zfs] zfsv28 serves bad device nodes via nfs on FreeBSD 8.2 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Gerrit =?ISO-8859-1?Q?K=FChn?= List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Jul 2011 14:40:11 -0000 The following reply was made to PR kern/159010; it has been noted by GNATS. From: Gerrit =?ISO-8859-1?Q?K=FChn?= To: bug-followup@FreeBSD.org Cc: Xin LI Subject: Re: kern/159010: [zfs] zfsv28 serves bad device nodes via nfs on FreeBSD 8.2 Date: Tue, 19 Jul 2011 16:20:07 +0200 Hi, The patch was rejected by patch(1) (was it against -stable or -current?), but I changed the relevant files by hand and recompiled the zfs kernel module. Good news: everything appears to be back to normal now. The dev nodes look ok on the filesystem now, and I can boot the diskless linux clients from zfs volumes over nfs again. If I have a voice, this patch should go into -stable and -current asap. :-) Thank you very much for your quick and professional support! 
cu Gerrit From owner-freebsd-fs@FreeBSD.ORG Tue Jul 19 18:58:29 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 552A6106567A for ; Tue, 19 Jul 2011 18:58:29 +0000 (UTC) (envelope-from giffunip@tutopia.com) Received: from nm2-vm0.bullet.mail.sp2.yahoo.com (nm2-vm0.bullet.mail.sp2.yahoo.com [98.139.91.248]) by mx1.freebsd.org (Postfix) with SMTP id 311168FC1A for ; Tue, 19 Jul 2011 18:58:29 +0000 (UTC) Received: from [98.139.91.63] by nm2.bullet.mail.sp2.yahoo.com with NNFMP; 19 Jul 2011 18:46:03 -0000 Received: from [98.139.91.5] by tm3.bullet.mail.sp2.yahoo.com with NNFMP; 19 Jul 2011 18:46:03 -0000 Received: from [127.0.0.1] by omp1005.mail.sp2.yahoo.com with NNFMP; 19 Jul 2011 18:46:03 -0000 X-Yahoo-Newman-Property: ymail-3 X-Yahoo-Newman-Id: 500442.38442.bm@omp1005.mail.sp2.yahoo.com Received: (qmail 66151 invoked by uid 60001); 19 Jul 2011 18:46:02 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1311101162; bh=e0FtQc+CZuWIL4WzkMkL2bcDX7ZBDUe5ip34sKUeOF4=; h=X-YMail-OSG:Received:X-RocketYMMF:X-Mailer:Message-ID:Date:From:Reply-To:Subject:To:MIME-Version:Content-Type; b=zT+Mmbh529N919s+HqZXuPPGKWLvLjwJVHU5DJwWOmpqZz2H2RPkY/y0VRKBecM4UwrEVlwCUIMajKdfhOQsYqtYMa45zfjTn/hwrYHzFRPf2T/nWF9XIaCLcskfJ7+gcFXN6Vo0u5ljrk+DnbostkR19zU6Z9Mex+qp90GxeJc= X-YMail-OSG: kItUDzsVM1kCVs6NobcN.UJD7nsk7BiLC8GKq3ngAVsN48Z k6EBzF8SLVRA0H_V0qMuqjj1d5_9burDD_jh8XqY8gv7sZGmfCsOZJjlI7IE rldprNA33R0aJBbsYf6dZ4hY4JPn7NtVzTecQV2Q0p1oUnRrjOLxw21CX9Hs wI799wApvZ98yMyYYEov5__2ZR5inKH0vcqgRDy0YXo5gXurtEikGydnZKA. ejGG9gUw9UU5wN69tPBJKLxCAbHb.AuPm8pZSXmr8h2vbusezki0JozbcHec Z.N3iqdfmoJ4l9KR3.RFLWXWITPs3_WFDRCMQjHmzwxktEmvY1J_MdIddxrI 6sCs6X6Eb.18CNWlf4RQDrmAE.OPPrFQqPXgqPWovylpO.cH.4xLsMqZSXgx H3xGIHtrqbfpMVJpcSed8lj_WsFJBsfV30qZDGQTeaSFNpifof8.szF0w7zX K4lodLiAXbl0Qe3FcRd6vtPFMIpVDpwUbcewD47FgmqiL2RgoHF.y.YXIxm1 QegmiYJMxkZM935Uxawh_8nCs Received: from [190.157.142.22] by web113515.mail.gq1.yahoo.com via HTTP; Tue, 19 Jul 2011 11:46:02 PDT X-RocketYMMF: giffunip X-Mailer: YahooMailClassic/14.0.3 YahooMailWebService/0.8.112.307740 Message-ID: <1311101162.65947.YahooMailClassic@web113515.mail.gq1.yahoo.com> Date: Tue, 19 Jul 2011 11:46:02 -0700 (PDT) From: "Pedro F. Giffuni" To: freebsd-fs@FreeBSD.org MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Subject: tcplay: an 100% compatible BSD implementation of TrueCrypt. X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: giffunip@tutopia.com List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Jul 2011 18:58:29 -0000 Hi, Just couldn't resist mentioning this that I find extremely cool: Alex Hornung seems to have done an awesome job writing his own implementation of TrueCrypt under a BSD license for DragonflyBSD: http://leaf.dragonflybsd.org/mailarchive/kernel/2011-07/msg00028.html Hopefully some of our geom wizards will find inspiration and bring it to FreeBSD too! :) cheers, Pedro. 
From owner-freebsd-fs@FreeBSD.ORG Tue Jul 19 21:06:28 2011 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4EB3F1065672; Tue, 19 Jul 2011 21:06:28 +0000 (UTC) (envelope-from universite@ukr.net) Received: from otrada.od.ua (universite-1-pt.tunnel.tserv24.sto1.ipv6.he.net [IPv6:2001:470:27:140::2]) by mx1.freebsd.org (Postfix) with ESMTP id B05D88FC12; Tue, 19 Jul 2011 21:06:27 +0000 (UTC) Received: from [IPv6:2001:470:28:140:48b7:5b69:1e72:c170] ([IPv6:2001:470:28:140:48b7:5b69:1e72:c170]) (authenticated bits=0) by otrada.od.ua (8.14.4/8.14.4) with ESMTP id p6JL6LIM076408; Wed, 20 Jul 2011 00:06:21 +0300 (EEST) (envelope-from universite@ukr.net) Message-ID: <4E25F1C0.6060404@ukr.net> Date: Wed, 20 Jul 2011 00:06:08 +0300 From: "Vladislav V. Prodan" User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; ru; rv:1.9.2.18) Gecko/20110616 Thunderbird/3.1.11 MIME-Version: 1.0 To: Artem Belevich References: <4E2412C2.5000202@ukr.net> <4E249FAF.4050500@ukr.net> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-95.5 required=5.0 tests=FREEMAIL_FROM,FSL_RU_URL, RDNS_NONE,SPF_SOFTFAIL,USER_IN_WHITELIST autolearn=no version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mary-teresa.otrada.od.ua X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (otrada.od.ua [IPv6:2001:470:28:140::5]); Wed, 20 Jul 2011 00:06:26 +0300 (EEST) Cc: fs@freebsd.org Subject: Re: [ZFS] Prompt, which is lost space in the ZFS pool? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Jul 2011 21:06:28 -0000 19.07.2011 0:29, Artem Belevich wrote: > What does "zfs list -r -tall -o space tank" show? df does not give you > complete picture. Sorry, /backup was to blame. I am now rewriting the /backup hierarchy with dedup=on under ZFSv28; maybe that will save some space. -- Vladislav V.
Prodan VVP24-UANIC +380[67]4584408 +380[99]4060508 vlad11@jabber.ru From owner-freebsd-fs@FreeBSD.ORG Tue Jul 19 23:09:41 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0D490106566B for ; Tue, 19 Jul 2011 23:09:41 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id BD2C88FC17 for ; Tue, 19 Jul 2011 23:09:40 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ap4EAHINJk6DaFvO/2dsb2JhbABUG4QvpAi7OpEtgSuEAoEPBJJoiC2ISQ X-IronPort-AV: E=Sophos;i="4.67,231,1309752000"; d="scan'208";a="131626738" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 19 Jul 2011 19:09:39 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id A6BB4B3F1F; Tue, 19 Jul 2011 19:09:39 -0400 (EDT) Date: Tue, 19 Jul 2011 19:09:39 -0400 (EDT) From: Rick Macklem To: Mike Shultz Message-ID: <732996961.772204.1311116979640.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <4E24B266.9050108@votesmart.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: freebsd-fs@freebsd.org, Clinton Adams Subject: Re: nfsd server cache flooded, try to increase nfsrc_floodlevel X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 19 Jul 2011 23:09:41 -0000 It's me again: > I ran into an issue today of our server thinking that it was being > flooded and locking our nfs users out. Got a LOT of these messages: > > Jul 12 16:08:22 xxxxx kernel: nfsd server cache flooded, try to > increase > nfsrc_floodlevel > > Our server(`uname -a`): FreeBSD xxxxx 8.2-RELEASE-p2 FreeBSD > 8.2-RELEASE-p2 #0: Tue Jun 21 16:52:27 MDT 2011 > yyy@xxxxx:/usr/obj/usr/src/sys/XXXXX amd64 > > I could find no information on nfsrc_floodlevel other than source code > which didn't explain too much about it. I don't know if it's a kernel > config var, or what. > > `nfsstat -e` did show this: > > CacheSize TCPPeak > 16385 16385 > > So I'm guessing that that is the current cache limit. > > The source code and this output suggest that we're just running into > the > limit. However, a comment in that source does suggest that "The cache > will still function over flood level" but that doesn't seem to be the > case. I ended up having to revoke the clients and restarting nfsd to > get > it operational again. > I've created a patch that gets rid of open_owners (and their associated cached replies) agressively when the count hits about 90% of nfsrc_floodlevel. A quick test here indicates it allows the server to "recover" without hitting the flood gate. Note that this is allowed by the NFSv4.0 RFC, so the change doesn't break the protocol. Please try the patch, which is at: http://people.freebsd.org/~rmacklem/noopen.patch (This patch is against the file in -current, so patch may not like it, but it should be easy to do by hand, if patch fails.) 
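Roughly, assuming the patch paths are relative to the top of your source tree (adjust as needed for stable/8):

cd /usr/src
fetch http://people.freebsd.org/~rmacklem/noopen.patch
patch -C < noopen.patch   # dry run; if it rejects, apply the hunks by hand
patch < noopen.patch
# then rebuild and install the kernel/modules as usual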
Again, good luck with it and please let me know how it goes, rick From owner-freebsd-fs@FreeBSD.ORG Wed Jul 20 00:04:34 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0AC3F106564A; Wed, 20 Jul 2011 00:04:34 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id D01ED8FC0C; Wed, 20 Jul 2011 00:04:33 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p6K04Xt4030440; Wed, 20 Jul 2011 00:04:33 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p6K04Xb6030413; Wed, 20 Jul 2011 00:04:33 GMT (envelope-from linimon) Date: Wed, 20 Jul 2011 00:04:33 GMT Message-Id: <201107200004.p6K04Xb6030413@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/159048: [smbfs] smb mount corrupts large files X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Jul 2011 00:04:34 -0000 Synopsis: [smbfs] smb mount corrupts large files Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Wed Jul 20 00:04:27 UTC 2011 Responsible-Changed-Why: Over to maintainer(s). http://www.freebsd.org/cgi/query-pr.cgi?pr=159048 From owner-freebsd-fs@FreeBSD.ORG Wed Jul 20 04:34:09 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DD0A61065675; Wed, 20 Jul 2011 04:34:09 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id B64868FC12; Wed, 20 Jul 2011 04:34:09 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p6K4Y9ZF080912; Wed, 20 Jul 2011 04:34:09 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p6K4Y9nZ080908; Wed, 20 Jul 2011 04:34:09 GMT (envelope-from linimon) Date: Wed, 20 Jul 2011 04:34:09 GMT Message-Id: <201107200434.p6K4Y9nZ080908@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/159045: [zfs] [hang] ZFS scrub freezes system X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Jul 2011 04:34:10 -0000 Old Synopsis: ZFS scrub freezes system New Synopsis: [zfs] [hang] ZFS scrub freezes system Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Wed Jul 20 04:33:51 UTC 2011 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=159045 From owner-freebsd-fs@FreeBSD.ORG Wed Jul 20 06:38:41 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 64DBB106566B; Wed, 20 Jul 2011 06:38:41 +0000 (UTC) (envelope-from maxim.konovalov@gmail.com) Received: from mp2.macomnet.net (ipv6.irc.int.ru [IPv6:2a02:28:1:2::1b:2]) by mx1.freebsd.org (Postfix) with ESMTP id C5FC38FC0A; Wed, 20 Jul 2011 06:38:40 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by mp2.macomnet.net (8.14.4/8.14.3) with ESMTP id p6K6cb4V085372; Wed, 20 Jul 2011 10:38:38 +0400 (MSD) (envelope-from maxim.konovalov@gmail.com) Date: Wed, 20 Jul 2011 10:38:37 +0400 (MSD) From: Maxim Konovalov To: DA Forsyth In-Reply-To: <4E257889.12343.39F99213@d.forsyth.ru.ac.za> Message-ID: References: <4E257889.12343.39F99213@d.forsyth.ru.ac.za> User-Agent: Alpine 2.00 (BSF 1167 2008-08-23) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Cc: freebsd-fs@freebsd.org, freebsd-questions@freebsd.org Subject: Re: How to fix bad superblock on UFS2? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Jul 2011 06:38:41 -0000 Try to use tools/tools/find-sb to locate superblocks. -- Maxim Konovalov From owner-freebsd-fs@FreeBSD.ORG Wed Jul 20 08:54:59 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 413ED106566C; Wed, 20 Jul 2011 08:54:59 +0000 (UTC) (envelope-from D.Forsyth@ru.ac.za) Received: from d.mail.ru.ac.za (d.mail.ru.ac.za [IPv6:2001:4200:1010::25:4]) by mx1.freebsd.org (Postfix) with ESMTP id 534578FC0C; Wed, 20 Jul 2011 08:54:58 +0000 (UTC) Received: from iwr.ru.ac.za ([146.231.64.249]:59225) by d.mail.ru.ac.za with esmtp (Exim 4.75 (FreeBSD)) (envelope-from ) id 1QjSYS-0003Xm-4P; Wed, 20 Jul 2011 10:54:56 +0200 Received: from iwdf-5.iwr.ru.ac.za ([146.231.64.28]) by iwr.ru.ac.za with esmtp (Exim 4.76 (FreeBSD)) (envelope-from ) id 1QjSYS-0003iO-2F; Wed, 20 Jul 2011 10:54:56 +0200 From: "DA Forsyth" Organization: IWR To: freebsd-fs@freebsd.org, freebsd-questions@freebsd.org Date: Wed, 20 Jul 2011 10:54:56 +0200 MIME-Version: 1.0 Message-ID: <4E2697E0.27289.3E5BF8BB@d.forsyth.ru.ac.za> Priority: normal In-reply-to: References: <4E257889.12343.39F99213@d.forsyth.ru.ac.za>, X-mailer: Pegasus Mail for Windows (4.52) Content-type: text/plain; charset=US-ASCII Content-transfer-encoding: 7BIT Content-description: Mail message body X-Virus-Scanned: d.mail.ru.ac.za (146.231.129.36) Cc: Subject: Re: How to fix bad superblock on UFS2? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: d.forsyth@ru.ac.za List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Jul 2011 08:54:59 -0000 On 20 Jul 2011 , Maxim Konovalov entreated about "Re: How to fix bad superblock on UFS2?": > Try to use tools/tools/find-sb to locate superblocks. Thankyou Maxim I may yet need to use that on another partition, but last night I achieved some success by hacking fsck_ffs to display what it is doing. By doing this I found that it considers the 'first alternate' superblock to be the one in the LAST cylinder group. 
So, by using dd to copy a working superblock to block 128 and to the last one listed by 'newfs -N', fsck_ffs could then actually recover some files. Since I probably broke more things on this partition than were broken by the 'disk smoke event', I was not surprised when only about half the drives files showed up in lost+found and the primary folder is now empty (the whole drive is a Samba share with quotas, so I create a folder to share so that users cannot mess with the quota.* files). Not a problem for this partition as I have a full level 0 dump. I now have 2 more partitions to resurrect.... both report 'Cannot find file system superblock' though 'newfs -N' shows a sensible list of them, so I have hope. But, further thanks to you for pointing out find-sb, because in googling for that I found various other very useful things, including http://www.chakraborty.ch/tag/raid-filesystem-partition-recovery-ufs- freebsd/ which at the least, points out things to avoid doing (-: thanks -- DA Fo rsyth Network Supervisor Principal Technical Officer -- Institute for Water Research http://www.ru.ac.za/institutes/iwr/ From owner-freebsd-fs@FreeBSD.ORG Wed Jul 20 09:30:14 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9F441106564A for ; Wed, 20 Jul 2011 09:30:14 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 8EEAB8FC12 for ; Wed, 20 Jul 2011 09:30:14 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p6K9UEXG081030 for ; Wed, 20 Jul 2011 09:30:14 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p6K9UEIw081027; Wed, 20 Jul 2011 09:30:14 GMT (envelope-from gnats) Date: Wed, 20 Jul 2011 09:30:14 GMT Message-Id: <201107200930.p6K9UEIw081027@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Martin Matuska Cc: Subject: Re: kern/159010: [zfs] zfsv28 serves bad device nodes via nfs on FreeBSD 8.2 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Martin Matuska List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Jul 2011 09:30:14 -0000 The following reply was made to PR kern/159010; it has been noted by GNATS. From: Martin Matuska To: d@delphij.net Cc: Xin LI , bug-followup@FreeBSD.org, gerrit.kuehn@aei.mpg.de, Pawel Jakub Dawidek Subject: Re: kern/159010: [zfs] zfsv28 serves bad device nodes via nfs on FreeBSD 8.2 Date: Wed, 20 Jul 2011 11:29:34 +0200 -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 I agree to Xin's patch and confirm it working. Dňa 19.07.2011 03:43, Xin LI wrote / napísal(a): > Hi, > > Would you please try if the attached patch fixes the problem? 
> > Cheers, - -- Martin Matuska FreeBSD committer http://blog.vx.sk -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/ iEYEARECAAYFAk4mn/gACgkQp2uLA0JhsNEIhgCgiHPUphuzKUm6yysbvoH9z9au jMYAniPlHGloF+ubr05eGGdpYOUeEsJ/ =SVT7 -----END PGP SIGNATURE----- From owner-freebsd-fs@FreeBSD.ORG Wed Jul 20 13:03:48 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4AB5D1065672 for ; Wed, 20 Jul 2011 13:03:48 +0000 (UTC) (envelope-from clinton.adams@gmail.com) Received: from mail-ey0-f176.google.com (mail-ey0-f176.google.com [209.85.215.176]) by mx1.freebsd.org (Postfix) with ESMTP id D112B8FC22 for ; Wed, 20 Jul 2011 13:03:47 +0000 (UTC) Received: by eya28 with SMTP id 28so912688eya.21 for ; Wed, 20 Jul 2011 06:03:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; bh=A7O1VTOuo9u8CoSIHbcmjAWumgv3Xovhi7XtH+3my+Y=; b=keD3EvRdRHJSP4cev7K/wKEDqxRKSuvK+dtVr3a+Y+ppZuzWg8m/dkNDMDs3JuN4/o 0yOJbPO0Ldh67WVSZbxzjb3BoG6p+sHsToYwZ+JV7aw/RrsMLIoSPXI4xe/BspHwrSch wXyGVqmBgdBC6rlS1jKIJafq2bd/yWG+TBC3Y= MIME-Version: 1.0 Received: by 10.14.22.11 with SMTP id s11mr893171ees.195.1311165229591; Wed, 20 Jul 2011 05:33:49 -0700 (PDT) Received: by 10.14.22.76 with HTTP; Wed, 20 Jul 2011 05:33:49 -0700 (PDT) In-Reply-To: <732996961.772204.1311116979640.JavaMail.root@erie.cs.uoguelph.ca> References: <4E24B266.9050108@votesmart.org> <732996961.772204.1311116979640.JavaMail.root@erie.cs.uoguelph.ca> Date: Wed, 20 Jul 2011 14:33:49 +0200 Message-ID: From: Clinton Adams To: Rick Macklem Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: nfsd server cache flooded, try to increase nfsrc_floodlevel X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Jul 2011 13:03:48 -0000 On Wed, Jul 20, 2011 at 1:09 AM, Rick Macklem wrote: > Please try the patch, which is at: > =A0 http://people.freebsd.org/~rmacklem/noopen.patch > (This patch is against the file in -current, so patch may not like it, bu= t > =A0it should be easy to do by hand, if patch fails.) > > Again, good luck with it and please let me know how it goes, rick > Thanks for your help with this, trying the patches now. Tests with one client look good so far, values for OpenOwner and CacheSize are more in line, we'll test with more clients later today. We were hitting the nfsrc_floodlevel with just three clients before, all using nfs4 mounted home and other directories. Clients are running Ubuntu 10.04.2 LTS. Usage has been general desktop usage, nothing unusual that we could see. 
Relevant snippet of /proc/mounts on client (rsize,wsize are being automatically negotiated, not specified in the automount options): pez.votesmart.org:/public /export/public nfs4 rw,relatime,vers=3D4,rsize=3D65536,wsize=3D65536,namlen=3D255,hard,proto=3D= tcp,timeo=3D600,retrans=3D2,sec=3Dkrb5,clientaddr=3D192.168.255.112,minorve= rsion=3D0,addr=3D192.168.255.25 0 0 pez.votesmart.org:/home/clinton /home/clinton nfs4 rw,relatime,vers=3D4,rsize=3D65536,wsize=3D65536,namlen=3D255,hard,proto=3D= tcp,timeo=3D600,retrans=3D2,sec=3Dkrb5,clientaddr=3D192.168.255.112,minorve= rsion=3D0,addr=3D192.168.255.25 0 0 nfsstat -e -s, with patches, after some stress testing: Server Info: Getattr Setattr Lookup Readlink Read Write Create Re= move 95334 1 28004 50 297125 2 0 = 0 Rename Link Symlink Mkdir Rmdir Readdir RdirPlus Ac= cess 0 0 0 0 0 1242 0 = 1444 Mknod Fsstat Fsinfo PathConf Commit LookupP SetClId SetCl= IdCf 0 0 0 0 2 0 4 = 4 Open OpenAttr OpenDwnGr OpenCfrm DelePurge DeleRet GetFH = Lock 176735 0 0 21175 0 0 49171 = 0 LockT LockU Close Verify NVerify PutFH PutPubFH PutRo= otFH 0 0 21184 0 0 549853 0 = 17 Renew RestoreFH SaveFH Secinfo RelLckOwn V4Create 0 21186 176735 0 0 0 Server: Retfailed Faults Clients 0 0 1 OpenOwner Opens LockOwner Locks Delegs 291 2 0 0 0 Server Cache Stats: Inprog Idem Non-idem Misses CacheSize TCPPeak 0 0 0 549969 291 2827 nfsstat -e -s, prior to patches, general usage: Server Info: Getattr Setattr Lookup Readlink Read Write Create Re= move 2813477 62661 382636 1419 837492 2115422 0 3= 3976 Rename Link Symlink Mkdir Rmdir Readdir RdirPlus Ac= cess 31164 1310 0 0 0 15678 10 30= 7236 Mknod Fsstat Fsinfo PathConf Commit LookupP SetClId SetCl= IdCf 0 0 2 1 144550 0 43 = 43 Open OpenAttr OpenDwnGr OpenCfrm DelePurge DeleRet GetFH = Lock 1462595 0 595 11267 0 0 550761 28= 0674 LockT LockU Close Verify NVerify PutFH PutPubFH PutRo= otFH 155 212299 286615 0 0 6651006 0 = 1234 Renew RestoreFH SaveFH Secinfo RelLckOwn V4Create 256784 320761 1495805 0 0 738 Server: Retfailed Faults Clients 0 0 3 OpenOwner Opens LockOwner Locks Delegs 6 178 8012 2 0 Server Cache Stats: Inprog Idem Non-idem Misses CacheSize TCPPeak 0 0 96 6876610 8084 13429 > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@FreeBSD.ORG Wed Jul 20 13:29:48 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7A0F6106567B for ; Wed, 20 Jul 2011 13:29:48 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 3492C8FC17 for ; Wed, 20 Jul 2011 13:29:47 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ap8EAB7XJk6DaFvO/2dsb2JhbABKCRuEL6QPiHyvWZETgSuCA4IAgQ8Ekm6IMIhJ X-IronPort-AV: E=Sophos;i="4.67,235,1309752000"; d="scan'208";a="131677485" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 20 Jul 2011 09:29:46 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id E86D5B3F27; Wed, 20 Jul 2011 09:29:46 -0400 (EDT) Date: Wed, 20 Jul 2011 09:29:46 -0400 (EDT) From: Rick Macklem To: Clinton Adams Message-ID: 
<1005230198.785386.1311168586930.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: freebsd-fs@freebsd.org Subject: Re: nfsd server cache flooded, try to increase nfsrc_floodlevel X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Jul 2011 13:29:48 -0000 Clinton Adams wrote: > On Wed, Jul 20, 2011 at 1:09 AM, Rick Macklem > wrote: > > Please try the patch, which is at: > > =C2=A0 http://people.freebsd.org/~rmacklem/noopen.patch > > (This patch is against the file in -current, so patch may not like > > it, but > > =C2=A0it should be easy to do by hand, if patch fails.) > > > > Again, good luck with it and please let me know how it goes, rick > > >=20 > Thanks for your help with this, trying the patches now. Tests with one > client look good so far, values for OpenOwner and CacheSize are more > in line, we'll test with more clients later today. We were hitting the > nfsrc_floodlevel with just three clients before, all using nfs4 > mounted home and other directories. Clients are running Ubuntu 10.04.2 > LTS. Usage has been general desktop usage, nothing unusual that we > could see. >=20 > Relevant snippet of /proc/mounts on client (rsize,wsize are being > automatically negotiated, not specified in the automount options): > pez.votesmart.org:/public /export/public nfs4 > rw,relatime,vers=3D4,rsize=3D65536,wsize=3D65536,namlen=3D255,hard,proto= =3Dtcp,timeo=3D600,retrans=3D2,sec=3Dkrb5,clientaddr=3D192.168.255.112,mino= rversion=3D0,addr=3D192.168.255.25 > 0 0 > pez.votesmart.org:/home/clinton /home/clinton nfs4 > rw,relatime,vers=3D4,rsize=3D65536,wsize=3D65536,namlen=3D255,hard,proto= =3Dtcp,timeo=3D600,retrans=3D2,sec=3Dkrb5,clientaddr=3D192.168.255.112,mino= rversion=3D0,addr=3D192.168.255.25 > 0 0 >=20 > nfsstat -e -s, with patches, after some stress testing: > Server Info: > Getattr Setattr Lookup Readlink Read Write Create Remove > 95334 1 28004 50 297125 2 0 0 > Rename Link Symlink Mkdir Rmdir Readdir RdirPlus Access > 0 0 0 0 0 1242 0 1444 > Mknod Fsstat Fsinfo PathConf Commit LookupP SetClId SetClIdCf > 0 0 0 0 2 0 4 4 > Open OpenAttr OpenDwnGr OpenCfrm DelePurge DeleRet GetFH Lock > 176735 0 0 21175 0 0 49171 0 > LockT LockU Close Verify NVerify PutFH PutPubFH PutRootFH > 0 0 21184 0 0 549853 0 17 > Renew RestoreFH SaveFH Secinfo RelLckOwn V4Create > 0 21186 176735 0 0 0 > Server: > Retfailed Faults Clients > 0 0 1 > OpenOwner Opens LockOwner Locks Delegs > 291 2 0 0 0 > Server Cache Stats: > Inprog Idem Non-idem Misses CacheSize TCPPeak > 0 0 0 549969 291 2827 >=20 Yes, these stats look reasonable. 
(and sorry if the mail system I use munged the whitespace) > nfsstat -e -s, prior to patches, general usage: >=20 > Server Info: > Getattr Setattr Lookup Readlink Read Write Create Remove > 2813477 62661 382636 1419 837492 2115422 0 33976 > Rename Link Symlink Mkdir Rmdir Readdir RdirPlus Access > 31164 1310 0 0 0 15678 10 307236 > Mknod Fsstat Fsinfo PathConf Commit LookupP SetClId SetClIdCf > 0 0 2 1 144550 0 43 43 > Open OpenAttr OpenDwnGr OpenCfrm DelePurge DeleRet GetFH Lock > 1462595 0 595 11267 0 0 550761 280674 > LockT LockU Close Verify NVerify PutFH PutPubFH PutRootFH > 155 212299 286615 0 0 6651006 0 1234 > Renew RestoreFH SaveFH Secinfo RelLckOwn V4Create > 256784 320761 1495805 0 0 738 > Server: > Retfailed Faults Clients > 0 0 3 > OpenOwner Opens LockOwner Locks Delegs > 6 178 8012 2 0 > Server Cache Stats: > Inprog Idem Non-idem Misses CacheSize TCPPeak > 0 0 96 6876610 8084 13429 >=20 Hmm. LockOwners have the same property as OpenOwners in that the server is required to hold onto the last reply in the cache until the Open/Lock Owner is released. Unfortunately, a server can't release a LockOwner until either the client issues a ReleaseLockOwner operation to tell the server that it will no longer use the LockOwner or the open is closed. These stats suggest that the client tried to do byte range locking over 8000 times with different LockOwners (I don't know how the Linux client decided to use a different LockOwner?), for file(s) that were still open. (When I test using the Fedora15 client, I do see ReleaseLockOwner operations, but usually just before a close. I don't know how recently that was added to the Linux client. ReleaseLockOwner was added just before the RFC was published to try and deal with a situation where the client uses a lot of LockOwners that the server must hold onto until the file is closed. If this is legitimate, all that can be done is increase NFSRVCACHE_FLOODLEVEL and hope that you can find a value large enough that the clients don't bump into it without exhausting mbufs. (I'd increase "kern.ipc.nmbclusters" to something larger than what you set NFSRVCACHE_FLOODLEEVEL to.) However, I suspect the 8084 LockOwners is a result of some other problem. Fingers and toes crossed that it was a side effect of the cache SMP bugs fixed by cache.patch. (noopen.patch won't help for this case, because it appears to be lockowners and not openowners that are holding the cached entries, but it won't do any harm, either.) If you see very large LockOwner counts again, with the patched kernel, all I can suggest is doing a packet capture and emailing it to me. "tcpdump -s 0 -w xxx" run for a short enough time=20 that "xxx" isn't huge when run on the server might catch some issue (like the client retrying a lock over and over and over again). A packet capture might also show if the Ubuntu client is doing ReleaseLockOwner operations. (Btw, you can look at the trace using wireshark, which knows about NFSv4.) 
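A minimal capture would look something like this (the interface name is just a placeholder; 2049 is the usual NFS port):

# run on the server for a few minutes while the clients are busy
tcpdump -s 0 -w nfs4-trace.pcap -i em0 port 2049
# then open nfs4-trace.pcap in wireshark and look for LOCK and
# RELEASE_LOCKOWNER operations from the Ubuntu clients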
In summary, It'll be interesting to see how this goes, rick ps: Sorry about the long winded reply, but this is nfsv4 after all:-) From owner-freebsd-fs@FreeBSD.ORG Wed Jul 20 16:54:46 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 69789106566C; Wed, 20 Jul 2011 16:54:46 +0000 (UTC) (envelope-from delphij@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 4126F8FC14; Wed, 20 Jul 2011 16:54:46 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p6KGskd6096263; Wed, 20 Jul 2011 16:54:46 GMT (envelope-from delphij@freefall.freebsd.org) Received: (from delphij@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p6KGsjpc096259; Wed, 20 Jul 2011 16:54:45 GMT (envelope-from delphij) Date: Wed, 20 Jul 2011 16:54:45 GMT Message-Id: <201107201654.p6KGsjpc096259@freefall.freebsd.org> To: gerrit.kuehn@aei.mpg.de, delphij@FreeBSD.org, freebsd-fs@FreeBSD.org, delphij@FreeBSD.org From: delphij@FreeBSD.org Cc: Subject: Re: kern/159010: [zfs] zfsv28 serves bad device nodes via nfs on FreeBSD 8.2 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Jul 2011 16:54:46 -0000 Synopsis: [zfs] zfsv28 serves bad device nodes via nfs on FreeBSD 8.2 State-Changed-From-To: open->patched State-Changed-By: delphij State-Changed-When: Wed Jul 20 16:53:38 UTC 2011 State-Changed-Why: A fix have been committed to -HEAD (slightly different from the patch in reply). Pending MFC reminder. Responsible-Changed-From-To: freebsd-fs->delphij Responsible-Changed-By: delphij Responsible-Changed-When: Wed Jul 20 16:53:38 UTC 2011 Responsible-Changed-Why: Take. 
http://www.freebsd.org/cgi/query-pr.cgi?pr=159010 From owner-freebsd-fs@FreeBSD.ORG Wed Jul 20 18:48:41 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6B10C10656D4 for ; Wed, 20 Jul 2011 18:48:41 +0000 (UTC) (envelope-from zack.kirsch@isilon.com) Received: from seaxch10.isilon.com (seaxch10.isilon.com [74.85.160.26]) by mx1.freebsd.org (Postfix) with ESMTP id D24DE8FC1F for ; Wed, 20 Jul 2011 18:48:11 +0000 (UTC) X-MimeOLE: Produced By Microsoft Exchange V6.5 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Date: Wed, 20 Jul 2011 11:36:06 -0700 Message-ID: <476FC2247D6C7843A4814ED64344560C04443EAA@seaxch10.desktop.isilon.com> In-Reply-To: <1005230198.785386.1311168586930.JavaMail.root@erie.cs.uoguelph.ca> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: nfsd server cache flooded, try to increase nfsrc_floodlevel Thread-Index: AcxG4Sis9jifkBKqRV6pDckLMfdCIwAKM+Ug References: <1005230198.785386.1311168586930.JavaMail.root@erie.cs.uoguelph.ca> From: "Zack Kirsch" To: "Rick Macklem" , "Clinton Adams" Cc: freebsd-fs@freebsd.org Subject: RE: nfsd server cache flooded, try to increase nfsrc_floodlevel X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Jul 2011 18:48:41 -0000 SnVzdCB3YW50ZWQgdG8gYWRkIGEgYml0IG9mIElzaWxvbiBjb2xvci4gV2UndmUgaGl0IHRoaXMg bGltaXQgYmVmb3JlLCBidXQgSSBiZWxpZXZlIGl0IHdhcyBtb3N0bHkgZHVlIHRvIHN0cmFuZ2Ug Y2xpZW50IGJlaGF2aW9yIG9mIDEpIFVzaW5nIGEgbmV3IGxvY2tvd25lciBmb3IgZWFjaCBsb2Nr IGFuZCAyKSBVc2luZyBhIG5ldyBUQ1AgY29ubmVjdGlvbiBmb3IgZWFjaCAndGVzdCBydW4nLiBB cyBmYXIgYXMgSSBrbm93LCB3ZSBoYXZlbid0IGhpdCB0aGlzIGluIHRoZSBmaWVsZC4NCg0KV2Un dmUgZG9uZSBhIGZldyB0aGluZ3MgdG8gY29tYmF0IHRoaXMgcHJvYmxlbToNCjEpIFdlIGluY3Jl YXNlZCB0aGUgZmxvb2RsZXZlbCB0byA2NTUzNi4NCjIpIFdlIG1hZGUgdGhlIGZsb29kbGV2ZWwg Y29uZmlndXJhYmxlIHZpYSBzeXNjdGwuIA0KMykgV2UgbWFkZSBzaWduaWZpY2FudCBjaGFuZ2Vz IHRvIHRoZSByZXBsYXkgY2FjaGUgaXRzZWxmLiBTcGVjaWZpYyBnYWlucyB3ZXJlIGRyYXN0aWMg cGVyZm9ybWFuY2UgaW1wcm92ZW1lbnRzIGFuZCBmcmVlaW5nIG9mIGNhY2hlIGVudHJpZXMgZnJv bSBzdGFsZSBUQ1AgY29ubmVjdGlvbnMuDQoNCkknZCBsaWtlIHRvIHVwc3RyZWFtIGFsbCBvZiB0 aGlzLCBidXQgaXQgd2lsbCB0YWtlIHNvbWUgdGltZSwgYW5kIG9idmlvdXNseSB3b24ndCBoYXBw ZW4gdW50aWwgc3RhYmxlOSBicmFuY2hlcy4NCg0KWmFjaw0KDQotLS0tLU9yaWdpbmFsIE1lc3Nh Z2UtLS0tLQ0KRnJvbTogb3duZXItZnJlZWJzZC1mc0BmcmVlYnNkLm9yZyBbbWFpbHRvOm93bmVy LWZyZWVic2QtZnNAZnJlZWJzZC5vcmddIE9uIEJlaGFsZiBPZiBSaWNrIE1hY2tsZW0NClNlbnQ6 IFdlZG5lc2RheSwgSnVseSAyMCwgMjAxMSA2OjMwIEFNDQpUbzogQ2xpbnRvbiBBZGFtcw0KQ2M6 IGZyZWVic2QtZnNAZnJlZWJzZC5vcmcNClN1YmplY3Q6IFJlOiBuZnNkIHNlcnZlciBjYWNoZSBm bG9vZGVkLCB0cnkgdG8gaW5jcmVhc2UgbmZzcmNfZmxvb2RsZXZlbA0KDQpDbGludG9uIEFkYW1z IHdyb3RlOg0KPiBPbiBXZWQsIEp1bCAyMCwgMjAxMSBhdCAxOjA5IEFNLCBSaWNrIE1hY2tsZW0g PHJtYWNrbGVtQHVvZ3VlbHBoLmNhPg0KPiB3cm90ZToNCj4gPiBQbGVhc2UgdHJ5IHRoZSBwYXRj aCwgd2hpY2ggaXMgYXQ6DQo+ID4gwqAgaHR0cDovL3Blb3BsZS5mcmVlYnNkLm9yZy9+cm1hY2ts ZW0vbm9vcGVuLnBhdGNoDQo+ID4gKFRoaXMgcGF0Y2ggaXMgYWdhaW5zdCB0aGUgZmlsZSBpbiAt Y3VycmVudCwgc28gcGF0Y2ggbWF5IG5vdCBsaWtlIA0KPiA+IGl0LCBidXQNCj4gPiDCoGl0IHNo b3VsZCBiZSBlYXN5IHRvIGRvIGJ5IGhhbmQsIGlmIHBhdGNoIGZhaWxzLikNCj4gPg0KPiA+IEFn YWluLCBnb29kIGx1Y2sgd2l0aCBpdCBhbmQgcGxlYXNlIGxldCBtZSBrbm93IGhvdyBpdCBnb2Vz 
LCByaWNrDQo+ID4NCj4gDQo+IFRoYW5rcyBmb3IgeW91ciBoZWxwIHdpdGggdGhpcywgdHJ5aW5n IHRoZSBwYXRjaGVzIG5vdy4gVGVzdHMgd2l0aCBvbmUgDQo+IGNsaWVudCBsb29rIGdvb2Qgc28g ZmFyLCB2YWx1ZXMgZm9yIE9wZW5Pd25lciBhbmQgQ2FjaGVTaXplIGFyZSBtb3JlIA0KPiBpbiBs aW5lLCB3ZSdsbCB0ZXN0IHdpdGggbW9yZSBjbGllbnRzIGxhdGVyIHRvZGF5LiBXZSB3ZXJlIGhp dHRpbmcgdGhlIA0KPiBuZnNyY19mbG9vZGxldmVsIHdpdGgganVzdCB0aHJlZSBjbGllbnRzIGJl Zm9yZSwgYWxsIHVzaW5nIG5mczQgDQo+IG1vdW50ZWQgaG9tZSBhbmQgb3RoZXIgZGlyZWN0b3Jp ZXMuIENsaWVudHMgYXJlIHJ1bm5pbmcgVWJ1bnR1IDEwLjA0LjIgDQo+IExUUy4gVXNhZ2UgaGFz IGJlZW4gZ2VuZXJhbCBkZXNrdG9wIHVzYWdlLCBub3RoaW5nIHVudXN1YWwgdGhhdCB3ZSANCj4g Y291bGQgc2VlLg0KPiANCj4gUmVsZXZhbnQgc25pcHBldCBvZiAvcHJvYy9tb3VudHMgb24gY2xp ZW50IChyc2l6ZSx3c2l6ZSBhcmUgYmVpbmcgDQo+IGF1dG9tYXRpY2FsbHkgbmVnb3RpYXRlZCwg bm90IHNwZWNpZmllZCBpbiB0aGUgYXV0b21vdW50IG9wdGlvbnMpOg0KPiBwZXoudm90ZXNtYXJ0 Lm9yZzovcHVibGljIC9leHBvcnQvcHVibGljIG5mczQNCj4gcncscmVsYXRpbWUsdmVycz00LHJz aXplPTY1NTM2LHdzaXplPTY1NTM2LG5hbWxlbj0yNTUsaGFyZCxwcm90bz10Y3AsdA0KPiBpbWVv PTYwMCxyZXRyYW5zPTIsc2VjPWtyYjUsY2xpZW50YWRkcj0xOTIuMTY4LjI1NS4xMTIsbWlub3J2 ZXJzaW9uPTAsDQo+IGFkZHI9MTkyLjE2OC4yNTUuMjUNCj4gMCAwDQo+IHBlei52b3Rlc21hcnQu b3JnOi9ob21lL2NsaW50b24gL2hvbWUvY2xpbnRvbiBuZnM0DQo+IHJ3LHJlbGF0aW1lLHZlcnM9 NCxyc2l6ZT02NTUzNix3c2l6ZT02NTUzNixuYW1sZW49MjU1LGhhcmQscHJvdG89dGNwLHQNCj4g aW1lbz02MDAscmV0cmFucz0yLHNlYz1rcmI1LGNsaWVudGFkZHI9MTkyLjE2OC4yNTUuMTEyLG1p bm9ydmVyc2lvbj0wLA0KPiBhZGRyPTE5Mi4xNjguMjU1LjI1DQo+IDAgMA0KPiANCj4gbmZzc3Rh dCAtZSAtcywgd2l0aCBwYXRjaGVzLCBhZnRlciBzb21lIHN0cmVzcyB0ZXN0aW5nOg0KPiBTZXJ2 ZXIgSW5mbzoNCj4gR2V0YXR0ciBTZXRhdHRyIExvb2t1cCBSZWFkbGluayBSZWFkIFdyaXRlIENy ZWF0ZSBSZW1vdmUNCj4gOTUzMzQgMSAyODAwNCA1MCAyOTcxMjUgMiAwIDANCj4gUmVuYW1lIExp bmsgU3ltbGluayBNa2RpciBSbWRpciBSZWFkZGlyIFJkaXJQbHVzIEFjY2Vzcw0KPiAwIDAgMCAw IDAgMTI0MiAwIDE0NDQNCj4gTWtub2QgRnNzdGF0IEZzaW5mbyBQYXRoQ29uZiBDb21taXQgTG9v a3VwUCBTZXRDbElkIFNldENsSWRDZg0KPiAwIDAgMCAwIDIgMCA0IDQNCj4gT3BlbiBPcGVuQXR0 ciBPcGVuRHduR3IgT3BlbkNmcm0gRGVsZVB1cmdlIERlbGVSZXQgR2V0RkggTG9jaw0KPiAxNzY3 MzUgMCAwIDIxMTc1IDAgMCA0OTE3MSAwDQo+IExvY2tUIExvY2tVIENsb3NlIFZlcmlmeSBOVmVy aWZ5IFB1dEZIIFB1dFB1YkZIIFB1dFJvb3RGSA0KPiAwIDAgMjExODQgMCAwIDU0OTg1MyAwIDE3 DQo+IFJlbmV3IFJlc3RvcmVGSCBTYXZlRkggU2VjaW5mbyBSZWxMY2tPd24gVjRDcmVhdGUNCj4g MCAyMTE4NiAxNzY3MzUgMCAwIDANCj4gU2VydmVyOg0KPiBSZXRmYWlsZWQgRmF1bHRzIENsaWVu dHMNCj4gMCAwIDENCj4gT3Blbk93bmVyIE9wZW5zIExvY2tPd25lciBMb2NrcyBEZWxlZ3MNCj4g MjkxIDIgMCAwIDANCj4gU2VydmVyIENhY2hlIFN0YXRzOg0KPiBJbnByb2cgSWRlbSBOb24taWRl bSBNaXNzZXMgQ2FjaGVTaXplIFRDUFBlYWsNCj4gMCAwIDAgNTQ5OTY5IDI5MSAyODI3DQo+IA0K WWVzLCB0aGVzZSBzdGF0cyBsb29rIHJlYXNvbmFibGUuDQooYW5kIHNvcnJ5IGlmIHRoZSBtYWls IHN5c3RlbSBJIHVzZSBtdW5nZWQgdGhlIHdoaXRlc3BhY2UpDQoNCj4gbmZzc3RhdCAtZSAtcywg cHJpb3IgdG8gcGF0Y2hlcywgZ2VuZXJhbCB1c2FnZToNCj4gDQo+IFNlcnZlciBJbmZvOg0KPiBH ZXRhdHRyIFNldGF0dHIgTG9va3VwIFJlYWRsaW5rIFJlYWQgV3JpdGUgQ3JlYXRlIFJlbW92ZQ0K PiAyODEzNDc3IDYyNjYxIDM4MjYzNiAxNDE5IDgzNzQ5MiAyMTE1NDIyIDAgMzM5NzYgUmVuYW1l IExpbmsgU3ltbGluayANCj4gTWtkaXIgUm1kaXIgUmVhZGRpciBSZGlyUGx1cyBBY2Nlc3MNCj4g MzExNjQgMTMxMCAwIDAgMCAxNTY3OCAxMCAzMDcyMzYNCj4gTWtub2QgRnNzdGF0IEZzaW5mbyBQ YXRoQ29uZiBDb21taXQgTG9va3VwUCBTZXRDbElkIFNldENsSWRDZg0KPiAwIDAgMiAxIDE0NDU1 MCAwIDQzIDQzDQo+IE9wZW4gT3BlbkF0dHIgT3BlbkR3bkdyIE9wZW5DZnJtIERlbGVQdXJnZSBE ZWxlUmV0IEdldEZIIExvY2sNCj4gMTQ2MjU5NSAwIDU5NSAxMTI2NyAwIDAgNTUwNzYxIDI4MDY3 NA0KPiBMb2NrVCBMb2NrVSBDbG9zZSBWZXJpZnkgTlZlcmlmeSBQdXRGSCBQdXRQdWJGSCBQdXRS b290RkgNCj4gMTU1IDIxMjI5OSAyODY2MTUgMCAwIDY2NTEwMDYgMCAxMjM0DQo+IFJlbmV3IFJl c3RvcmVGSCBTYXZlRkggU2VjaW5mbyBSZWxMY2tPd24gVjRDcmVhdGUNCj4gMjU2Nzg0IDMyMDc2 
MSAxNDk1ODA1IDAgMCA3MzgNCj4gU2VydmVyOg0KPiBSZXRmYWlsZWQgRmF1bHRzIENsaWVudHMN Cj4gMCAwIDMNCj4gT3Blbk93bmVyIE9wZW5zIExvY2tPd25lciBMb2NrcyBEZWxlZ3MNCj4gNiAx NzggODAxMiAyIDANCj4gU2VydmVyIENhY2hlIFN0YXRzOg0KPiBJbnByb2cgSWRlbSBOb24taWRl bSBNaXNzZXMgQ2FjaGVTaXplIFRDUFBlYWsNCj4gMCAwIDk2IDY4NzY2MTAgODA4NCAxMzQyOQ0K PiANCkhtbS4gTG9ja093bmVycyBoYXZlIHRoZSBzYW1lIHByb3BlcnR5IGFzIE9wZW5Pd25lcnMg aW4gdGhhdCB0aGUgc2VydmVyIGlzIHJlcXVpcmVkIHRvIGhvbGQgb250byB0aGUgbGFzdCByZXBs eSBpbiB0aGUgY2FjaGUgdW50aWwgdGhlIE9wZW4vTG9jayBPd25lciBpcyByZWxlYXNlZC4gVW5m b3J0dW5hdGVseSwgYSBzZXJ2ZXIgY2FuJ3QgcmVsZWFzZSBhIExvY2tPd25lciB1bnRpbCBlaXRo ZXIgdGhlIGNsaWVudCBpc3N1ZXMgYSBSZWxlYXNlTG9ja093bmVyIG9wZXJhdGlvbiB0byB0ZWxs IHRoZSBzZXJ2ZXIgdGhhdCBpdCB3aWxsIG5vIGxvbmdlciB1c2UgdGhlIExvY2tPd25lciBvciB0 aGUgb3BlbiBpcyBjbG9zZWQuDQoNClRoZXNlIHN0YXRzIHN1Z2dlc3QgdGhhdCB0aGUgY2xpZW50 IHRyaWVkIHRvIGRvIGJ5dGUgcmFuZ2UgbG9ja2luZyBvdmVyIDgwMDAgdGltZXMgd2l0aCBkaWZm ZXJlbnQgTG9ja093bmVycyAoSSBkb24ndCBrbm93IGhvdyB0aGUgTGludXggY2xpZW50IGRlY2lk ZWQgdG8gdXNlIGEgZGlmZmVyZW50IExvY2tPd25lcj8pLCBmb3IgZmlsZShzKSB0aGF0IHdlcmUg c3RpbGwgb3Blbi4gKFdoZW4gSSB0ZXN0IHVzaW5nIHRoZSBGZWRvcmExNSBjbGllbnQsIEkgZG8g c2VlIFJlbGVhc2VMb2NrT3duZXIgb3BlcmF0aW9ucywgYnV0IHVzdWFsbHkganVzdCBiZWZvcmUg YSBjbG9zZS4gSSBkb24ndCBrbm93IGhvdyByZWNlbnRseSB0aGF0IHdhcyBhZGRlZCB0byB0aGUg TGludXggY2xpZW50LiBSZWxlYXNlTG9ja093bmVyIHdhcyBhZGRlZCBqdXN0IGJlZm9yZSB0aGUg UkZDIHdhcyBwdWJsaXNoZWQgdG8gdHJ5IGFuZCBkZWFsIHdpdGggYSBzaXR1YXRpb24gd2hlcmUg dGhlIGNsaWVudCB1c2VzIGEgbG90IG9mIExvY2tPd25lcnMgdGhhdCB0aGUgc2VydmVyIG11c3Qg aG9sZCBvbnRvIHVudGlsIHRoZSBmaWxlIGlzIGNsb3NlZC4NCg0KSWYgdGhpcyBpcyBsZWdpdGlt YXRlLCBhbGwgdGhhdCBjYW4gYmUgZG9uZSBpcyBpbmNyZWFzZSBORlNSVkNBQ0hFX0ZMT09ETEVW RUwgYW5kIGhvcGUgdGhhdCB5b3UgY2FuIGZpbmQgYSB2YWx1ZSBsYXJnZSBlbm91Z2ggdGhhdCB0 aGUgY2xpZW50cyBkb24ndCBidW1wIGludG8gaXQgd2l0aG91dCBleGhhdXN0aW5nIG1idWZzLiAo SSdkIGluY3JlYXNlICJrZXJuLmlwYy5ubWJjbHVzdGVycyIgdG8gc29tZXRoaW5nIGxhcmdlciB0 aGFuIHdoYXQgeW91IHNldCBORlNSVkNBQ0hFX0ZMT09ETEVFVkVMIHRvLikNCg0KSG93ZXZlciwg SSBzdXNwZWN0IHRoZSA4MDg0IExvY2tPd25lcnMgaXMgYSByZXN1bHQgb2Ygc29tZSBvdGhlciBw cm9ibGVtLiBGaW5nZXJzIGFuZCB0b2VzIGNyb3NzZWQgdGhhdCBpdCB3YXMgYSBzaWRlIGVmZmVj dCBvZiB0aGUgY2FjaGUgU01QIGJ1Z3MgZml4ZWQgYnkgY2FjaGUucGF0Y2guIChub29wZW4ucGF0 Y2ggd29uJ3QgaGVscCBmb3IgdGhpcyBjYXNlLCBiZWNhdXNlIGl0IGFwcGVhcnMgdG8gYmUgbG9j a293bmVycyBhbmQgbm90IG9wZW5vd25lcnMgdGhhdCBhcmUgaG9sZGluZyB0aGUgY2FjaGVkIGVu dHJpZXMsIGJ1dCBpdCB3b24ndCBkbyBhbnkgaGFybSwgZWl0aGVyLikNCg0KSWYgeW91IHNlZSB2 ZXJ5IGxhcmdlIExvY2tPd25lciBjb3VudHMgYWdhaW4sIHdpdGggdGhlIHBhdGNoZWQga2VybmVs LCBhbGwgSSBjYW4gc3VnZ2VzdCBpcyBkb2luZyBhIHBhY2tldCBjYXB0dXJlIGFuZCBlbWFpbGlu ZyBpdCB0byBtZS4gInRjcGR1bXAgLXMgMCAtdyB4eHgiIHJ1biBmb3IgYSBzaG9ydCBlbm91Z2gg dGltZSB0aGF0ICJ4eHgiIGlzbid0IGh1Z2Ugd2hlbiBydW4gb24gdGhlIHNlcnZlciBtaWdodCBj YXRjaCBzb21lIGlzc3VlIChsaWtlIHRoZSBjbGllbnQgcmV0cnlpbmcgYSBsb2NrIG92ZXIgYW5k IG92ZXIgYW5kIG92ZXIgYWdhaW4pLiBBIHBhY2tldCBjYXB0dXJlIG1pZ2h0IGFsc28gc2hvdyBp ZiB0aGUgVWJ1bnR1IGNsaWVudCBpcyBkb2luZyBSZWxlYXNlTG9ja093bmVyIG9wZXJhdGlvbnMu IChCdHcsIHlvdSBjYW4gbG9vayBhdCB0aGUgdHJhY2UgdXNpbmcgd2lyZXNoYXJrLCB3aGljaCBr bm93cyBhYm91dCBORlN2NC4pDQoNCkluIHN1bW1hcnksIEl0J2xsIGJlIGludGVyZXN0aW5nIHRv IHNlZSBob3cgdGhpcyBnb2VzLCByaWNrDQpwczogU29ycnkgYWJvdXQgdGhlIGxvbmcgd2luZGVk IHJlcGx5LCBidXQgdGhpcyBpcyBuZnN2NCBhZnRlciBhbGw6LSkNCg0KX19fX19fX19fX19fX19f X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX18NCmZyZWVic2QtZnNAZnJlZWJzZC5vcmcg bWFpbGluZyBsaXN0DQpodHRwOi8vbGlzdHMuZnJlZWJzZC5vcmcvbWFpbG1hbi9saXN0aW5mby9m cmVlYnNkLWZzDQpUbyB1bnN1YnNjcmliZSwgc2VuZCBhbnkgbWFpbCB0byAiZnJlZWJzZC1mcy11 
bnN1YnNjcmliZUBmcmVlYnNkLm9yZyINCg== From owner-freebsd-fs@FreeBSD.ORG Wed Jul 20 19:13:48 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7F3931065670 for ; Wed, 20 Jul 2011 19:13:48 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 1CFF78FC0A for ; Wed, 20 Jul 2011 19:13:47 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ap8EAL0nJ06DaFvO/2dsb2JhbABKCRuEL6QUtxyRCIErggOCAIEPBJJuiDCISQ X-IronPort-AV: E=Sophos;i="4.67,236,1309752000"; d="scan'208";a="127910055" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-annu-pri.mail.uoguelph.ca with ESMTP; 20 Jul 2011 15:13:46 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id BAFB8B3F20; Wed, 20 Jul 2011 15:13:46 -0400 (EDT) Date: Wed, 20 Jul 2011 15:13:46 -0400 (EDT) From: Rick Macklem To: Clinton Adams Message-ID: <1487604530.809805.1311189226746.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: FreeBSD FS Subject: Re: nfsd server cache flooded, try to increase nfsrc_floodlevel X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Jul 2011 19:13:48 -0000 Clinton Adams wrote: > On Wed, Jul 20, 2011 at 3:29 PM, Rick Macklem > wrote: > > Clinton Adams wrote: > >> On Wed, Jul 20, 2011 at 1:09 AM, Rick Macklem > >> > >> wrote: > >> > Please try the patch, which is at: > >> > =C2=A0 http://people.freebsd.org/~rmacklem/noopen.patch > >> > (This patch is against the file in -current, so patch may not > >> > like > >> > it, but > >> > =C2=A0it should be easy to do by hand, if patch fails.) > >> > > >> > Again, good luck with it and please let me know how it goes, rick > >> > > >> > >> Thanks for your help with this, trying the patches now. Tests with > >> one > >> client look good so far, values for OpenOwner and CacheSize are > >> more > >> in line, we'll test with more clients later today. We were hitting > >> the > >> nfsrc_floodlevel with just three clients before, all using nfs4 > >> mounted home and other directories. Clients are running Ubuntu > >> 10.04.2 > >> LTS. Usage has been general desktop usage, nothing unusual that we > >> could see. 
> >> > >> Relevant snippet of /proc/mounts on client (rsize,wsize are being > >> automatically negotiated, not specified in the automount options): > >> pez.votesmart.org:/public /export/public nfs4 > >> rw,relatime,vers=3D4,rsize=3D65536,wsize=3D65536,namlen=3D255,hard,pro= to=3Dtcp,timeo=3D600,retrans=3D2,sec=3Dkrb5,clientaddr=3D192.168.255.112,mi= norversion=3D0,addr=3D192.168.255.25 > >> 0 0 > >> pez.votesmart.org:/home/clinton /home/clinton nfs4 > >> rw,relatime,vers=3D4,rsize=3D65536,wsize=3D65536,namlen=3D255,hard,pro= to=3Dtcp,timeo=3D600,retrans=3D2,sec=3Dkrb5,clientaddr=3D192.168.255.112,mi= norversion=3D0,addr=3D192.168.255.25 > >> 0 0 > >> > >> nfsstat -e -s, with patches, after some stress testing: > >> Server Info: > >> Getattr Setattr Lookup Readlink Read Write Create Remove > >> 95334 1 28004 50 297125 2 0 0 > >> Rename Link Symlink Mkdir Rmdir Readdir RdirPlus Access > >> 0 0 0 0 0 1242 0 1444 > >> Mknod Fsstat Fsinfo PathConf Commit LookupP SetClId SetClIdCf > >> 0 0 0 0 2 0 4 4 > >> Open OpenAttr OpenDwnGr OpenCfrm DelePurge DeleRet GetFH Lock > >> 176735 0 0 21175 0 0 49171 0 > >> LockT LockU Close Verify NVerify PutFH PutPubFH PutRootFH > >> 0 0 21184 0 0 549853 0 17 > >> Renew RestoreFH SaveFH Secinfo RelLckOwn V4Create > >> 0 21186 176735 0 0 0 > >> Server: > >> Retfailed Faults Clients > >> 0 0 1 > >> OpenOwner Opens LockOwner Locks Delegs > >> 291 2 0 0 0 > >> Server Cache Stats: > >> Inprog Idem Non-idem Misses CacheSize TCPPeak > >> 0 0 0 549969 291 2827 > >> > > Yes, these stats look reasonable. > > (and sorry if the mail system I use munged the whitespace) > > > >> nfsstat -e -s, prior to patches, general usage: > >> > >> Server Info: > >> Getattr Setattr Lookup Readlink Read Write Create Remove > >> 2813477 62661 382636 1419 837492 2115422 0 33976 > >> Rename Link Symlink Mkdir Rmdir Readdir RdirPlus Access > >> 31164 1310 0 0 0 15678 10 307236 > >> Mknod Fsstat Fsinfo PathConf Commit LookupP SetClId SetClIdCf > >> 0 0 2 1 144550 0 43 43 > >> Open OpenAttr OpenDwnGr OpenCfrm DelePurge DeleRet GetFH Lock > >> 1462595 0 595 11267 0 0 550761 280674 > >> LockT LockU Close Verify NVerify PutFH PutPubFH PutRootFH > >> 155 212299 286615 0 0 6651006 0 1234 > >> Renew RestoreFH SaveFH Secinfo RelLckOwn V4Create > >> 256784 320761 1495805 0 0 738 > >> Server: > >> Retfailed Faults Clients > >> 0 0 3 > >> OpenOwner Opens LockOwner Locks Delegs > >> 6 178 8012 2 0 > >> Server Cache Stats: > >> Inprog Idem Non-idem Misses CacheSize TCPPeak > >> 0 0 96 6876610 8084 13429 > >> > > Hmm. LockOwners have the same property as OpenOwners in that the > > server is required to hold onto the last reply in the cache until > > the Open/Lock Owner is released. Unfortunately, a server can't > > release a LockOwner until either the client issues a > > ReleaseLockOwner > > operation to tell the server that it will no longer use the > > LockOwner > > or the open is closed. > > > > These stats suggest that the client tried to do byte range locking > > over 8000 times with different LockOwners (I don't know how the > > Linux > > client decided to use a different LockOwner?), for file(s) that were > > still open. (When I test using the Fedora15 client, I do see > > ReleaseLockOwner operations, but usually just before a close. I > > don't > > know how recently that was added to the Linux client. 
> > ReleaseLockOwner > > was added just before the RFC was published to try and deal with a > > situation where the client uses a lot of LockOwners that the server > > must > > hold onto until the file is closed. > > > > If this is legitimate, all that can be done is increase > > NFSRVCACHE_FLOODLEVEL and hope that you can find a value large > > enough > > that the clients don't bump into it without exhausting mbufs. (I'd > > increase "kern.ipc.nmbclusters" to something larger than what you > > set NFSRVCACHE_FLOODLEEVEL to.) > > > > However, I suspect the 8084 LockOwners is a result of some other > > problem. Fingers and toes crossed that it was a side effect of the > > cache SMP bugs fixed by cache.patch. (noopen.patch won't help for > > this case, because it appears to be lockowners and not openowners > > that are holding the cached entries, but it won't do any harm, > > either.) > > > > If you see very large LockOwner counts again, with the patched > > kernel, all I can suggest is doing a packet capture and emailing > > it to me. "tcpdump -s 0 -w xxx" run for a short enough time > > that "xxx" isn't huge when run on the server > > might catch some issue (like the client retrying a lock over and > > over > > and over again). A packet capture might also show if the Ubuntu > > client > > is doing ReleaseLockOwner operations. (Btw, you can look at the > > trace > > using wireshark, which knows about NFSv4.) >=20 > Running four clients now and the LockOwners are steadily climbing, > nfsstat consistently reported it as 0 prior to users logging into the > nfsv4 test systems - my testing via ssh didn't show anything like > this. Attached tcpdump file is from when I first noticed the jump in > LockOwners from 0 to ~600. I tried wireshark on this and didn't see > any releaselockowner operations. >=20 > Server Info: > Getattr Setattr Lookup Readlink Read Write Create Remove > 1226807 47083 54617 175 1128558 806036 0 695 > Rename Link Symlink Mkdir Rmdir Readdir RdirPlus Access > 606 72 0 0 0 3189 0 13848 > Mknod Fsstat Fsinfo PathConf Commit LookupP SetClId SetClIdCf > 0 0 0 0 7645 0 9 9 > Open OpenAttr OpenDwnGr OpenCfrm DelePurge DeleRet GetFH Lock > 246079 0 22 73672 0 0 141287 7076 > LockT LockU Close Verify NVerify PutFH PutPubFH PutRootFH > 10 6218 89443 0 0 2516897 0 1836 > Renew RestoreFH SaveFH Secinfo RelLckOwn V4Create > 0 90421 246804 0 0 47 > Server: > Retfailed Faults Clients > 0 0 4 > OpenOwner Opens LockOwner Locks Delegs > 6 242 2481 22 0 > Server Cache Stats: > Inprog Idem Non-idem Misses CacheSize TCPPeak > 0 0 2 2518251 2502 4772 >=20 > Thanks again for your help on this >=20 Well, I looked at the packet trace and I'm afraid what I see is pretty well a worst case scenario for an NFSv4.0 server. At line #261 and #309, the client does a LOCK op with a different new lockowner for the same open/file. I see a few opens/closes, but no close for this open and no ReleaseLockOwner op. I suspect the file (fh CRC 0x1091fd96) is being kept open (until the user logs out?) and every now and again, a fresh lock_owner is created, followed by a few lock ops. If you look at the Lock ops at packets #261 and #309, you'll notice the sequence# as a56 and a57 respectively. This indicates that 2647 operations like these Locks have been done in the Open. Bottom line... The server can't throw the lock_owners away until the client closes the file and it looks like the # of lock_owners (each with a cached reply, which is also required by the protocol) is just gonna keep growing. 
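(For reference, those sequence numbers are hexadecimal, which is where the 2647 figure comes from: 0xa57 = 10*256 + 5*16 + 7 = 2647, so the a56/a57 values mean on the order of 2600 lock operations have already been issued under this single open.)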
I think you'll either need to figure out a way to get the file closed (user logging out and then back in, maybe?) and then increase the flood level enough that the users don't hit it. OR switch to a NFSv3 mount. rick ps: I hope you didn't mind me putting the mailing list back on the cc. From owner-freebsd-fs@FreeBSD.ORG Wed Jul 20 19:43:36 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 84255106564A for ; Wed, 20 Jul 2011 19:43:36 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id EA2708FC14 for ; Wed, 20 Jul 2011 19:43:35 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: AtYAAPkuJ06DaFvO/2dsb2JhbABHAwkbhC+TUpBCiHytX5EIgSuBewiCAIEPBJBjgguIMIhJIA X-IronPort-AV: E=Sophos;i="4.67,237,1309752000"; d="scan'208";a="127914402" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-annu-pri.mail.uoguelph.ca with ESMTP; 20 Jul 2011 15:43:34 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 064D4B3F0E; Wed, 20 Jul 2011 15:43:35 -0400 (EDT) Date: Wed, 20 Jul 2011 15:43:35 -0400 (EDT) From: Rick Macklem To: Zack Kirsch Message-ID: <1131900815.812133.1311191015011.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <476FC2247D6C7843A4814ED64344560C04443EAA@seaxch10.desktop.isilon.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: freebsd-fs@freebsd.org Subject: Re: nfsd server cache flooded, try to increase nfsrc_floodlevel X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Jul 2011 19:43:36 -0000 Zack Kirsch wrote: > Just wanted to add a bit of Isilon color. We've hit this limit before, > but I believe it was mostly due to strange client behavior of 1) Using > a new lockowner for each lock This is what appears to be happening from the packet trace they sent me. The client uses a new lock owner to do a few locks and then, later, repeats the same cycle for the same file without closing the associated Open. There also doesn't appear to be any ReleaseLockOwner operations being done. Unfor= tnately the protocol requires that the lock_owners (and the cached last reply for e= ach of them) be held until the close or releaselockowner. Btw, from what I've seen, both the Linux and Solaris10 clients use new lock= owners frequently and the FreeBSD client somewhat less so. A recent change to the = FreeBSD client makes it use more lock_owners, as required by a server that won't al= low re-use of the same lock_owner string for multiple files being locked concur= rently. (The RFC doesn't necessarily require a server to support multiple instances= of a lock_owner string, so the server seems to be within the specification.) > and 2) Using a new TCP connection for > each 'test run'. As I understand the protocol "same vs new TCP connection" isn't relevant. T= he state is tied to a ClientID and not TCP connection. 
If the client does a fr= esh SetClientID/SetClientIDConfirm after a new mount, that tells the server tha= t it can throw the old state (and cached replies) away. (As an historical aside,= in the early days of NFSv4 development, I suggested tying state to TCP connect= ions, but that got shot down.) > As far as I know, we haven't hit this in the field. >=20 First time I've seen it, although my understanding is that this was why the ReleaseLockOwner operation was added near the end of the NFSv4.0 RFC's development. (In case anyone is interested, the NFSv4.1 protocol avoids thi= s problem by using a mechanism called Sessions that sets a fixed upper bound on the reply cache--> one reply per slot, where the # of slots is fixed whe= n the session is set up.) > We've done a few things to combat this problem: > 1) We increased the floodlevel to 65536. > 2) We made the floodlevel configurable via sysctl. I've thought that it would be nice to define this as a fraction of what kern.ipc.nmbclusters is set to, but I haven't looked to see how often an mbuf cluster ends up being a part of the cached reply. The 16K was just a very conservative # chosen when the server I did load tests against had 512Mbytes of RAM. I think tying it to kern.ipc.nmbclusters (or directly to the machine's RAM size or both??) would be nice. Having yet another tunable few understan= d (ie. making it a sysctl) seems a less desirable fallback plan? > 3) We made significant changes to the replay cache itself. Specific > gains were drastic performance improvements and freeing of cache > entries from stale TCP connections. >=20 I haven't seen the patch and may be misinterpreting this, but the idea behind the cache is to hang onto the replies for "stale TCP connections", because those are the requests that may get retried after a client reconnec= ts via TCP. Anyhow, I'll see what you mean when I see the patches. rick > I'd like to upstream all of this, but it will take some time, and > obviously won't happen until stable9 branches. >=20 > Zack >=20 > -----Original Message----- > From: owner-freebsd-fs@freebsd.org > [mailto:owner-freebsd-fs@freebsd.org] On Behalf Of Rick Macklem > Sent: Wednesday, July 20, 2011 6:30 AM > To: Clinton Adams > Cc: freebsd-fs@freebsd.org > Subject: Re: nfsd server cache flooded, try to increase > nfsrc_floodlevel >=20 > Clinton Adams wrote: > > On Wed, Jul 20, 2011 at 1:09 AM, Rick Macklem > > wrote: > > > Please try the patch, which is at: > > > =C2=A0 http://people.freebsd.org/~rmacklem/noopen.patch > > > (This patch is against the file in -current, so patch may not like > > > it, but > > > =C2=A0it should be easy to do by hand, if patch fails.) > > > > > > Again, good luck with it and please let me know how it goes, rick > > > > > > > Thanks for your help with this, trying the patches now. Tests with > > one > > client look good so far, values for OpenOwner and CacheSize are more > > in line, we'll test with more clients later today. We were hitting > > the > > nfsrc_floodlevel with just three clients before, all using nfs4 > > mounted home and other directories. Clients are running Ubuntu > > 10.04.2 > > LTS. Usage has been general desktop usage, nothing unusual that we > > could see. 
> > > > Relevant snippet of /proc/mounts on client (rsize,wsize are being > > automatically negotiated, not specified in the automount options): > > pez.votesmart.org:/public /export/public nfs4 > > rw,relatime,vers=3D4,rsize=3D65536,wsize=3D65536,namlen=3D255,hard,prot= o=3Dtcp,t > > imeo=3D600,retrans=3D2,sec=3Dkrb5,clientaddr=3D192.168.255.112,minorver= sion=3D0, > > addr=3D192.168.255.25 > > 0 0 > > pez.votesmart.org:/home/clinton /home/clinton nfs4 > > rw,relatime,vers=3D4,rsize=3D65536,wsize=3D65536,namlen=3D255,hard,prot= o=3Dtcp,t > > imeo=3D600,retrans=3D2,sec=3Dkrb5,clientaddr=3D192.168.255.112,minorver= sion=3D0, > > addr=3D192.168.255.25 > > 0 0 > > > > nfsstat -e -s, with patches, after some stress testing: > > Server Info: > > Getattr Setattr Lookup Readlink Read Write Create Remove > > 95334 1 28004 50 297125 2 0 0 > > Rename Link Symlink Mkdir Rmdir Readdir RdirPlus Access > > 0 0 0 0 0 1242 0 1444 > > Mknod Fsstat Fsinfo PathConf Commit LookupP SetClId SetClIdCf > > 0 0 0 0 2 0 4 4 > > Open OpenAttr OpenDwnGr OpenCfrm DelePurge DeleRet GetFH Lock > > 176735 0 0 21175 0 0 49171 0 > > LockT LockU Close Verify NVerify PutFH PutPubFH PutRootFH > > 0 0 21184 0 0 549853 0 17 > > Renew RestoreFH SaveFH Secinfo RelLckOwn V4Create > > 0 21186 176735 0 0 0 > > Server: > > Retfailed Faults Clients > > 0 0 1 > > OpenOwner Opens LockOwner Locks Delegs > > 291 2 0 0 0 > > Server Cache Stats: > > Inprog Idem Non-idem Misses CacheSize TCPPeak > > 0 0 0 549969 291 2827 > > > Yes, these stats look reasonable. > (and sorry if the mail system I use munged the whitespace) >=20 > > nfsstat -e -s, prior to patches, general usage: > > > > Server Info: > > Getattr Setattr Lookup Readlink Read Write Create Remove > > 2813477 62661 382636 1419 837492 2115422 0 33976 Rename Link Symlink > > Mkdir Rmdir Readdir RdirPlus Access > > 31164 1310 0 0 0 15678 10 307236 > > Mknod Fsstat Fsinfo PathConf Commit LookupP SetClId SetClIdCf > > 0 0 2 1 144550 0 43 43 > > Open OpenAttr OpenDwnGr OpenCfrm DelePurge DeleRet GetFH Lock > > 1462595 0 595 11267 0 0 550761 280674 > > LockT LockU Close Verify NVerify PutFH PutPubFH PutRootFH > > 155 212299 286615 0 0 6651006 0 1234 > > Renew RestoreFH SaveFH Secinfo RelLckOwn V4Create > > 256784 320761 1495805 0 0 738 > > Server: > > Retfailed Faults Clients > > 0 0 3 > > OpenOwner Opens LockOwner Locks Delegs > > 6 178 8012 2 0 > > Server Cache Stats: > > Inprog Idem Non-idem Misses CacheSize TCPPeak > > 0 0 96 6876610 8084 13429 > > > Hmm. LockOwners have the same property as OpenOwners in that the > server is required to hold onto the last reply in the cache until the > Open/Lock Owner is released. Unfortunately, a server can't release a > LockOwner until either the client issues a ReleaseLockOwner operation > to tell the server that it will no longer use the LockOwner or the > open is closed. >=20 > These stats suggest that the client tried to do byte range locking > over 8000 times with different LockOwners (I don't know how the Linux > client decided to use a different LockOwner?), for file(s) that were > still open. (When I test using the Fedora15 client, I do see > ReleaseLockOwner operations, but usually just before a close. I don't > know how recently that was added to the Linux client. ReleaseLockOwner > was added just before the RFC was published to try and deal with a > situation where the client uses a lot of LockOwners that the server > must hold onto until the file is closed. 
>=20 > If this is legitimate, all that can be done is increase > NFSRVCACHE_FLOODLEVEL and hope that you can find a value large enough > that the clients don't bump into it without exhausting mbufs. (I'd > increase "kern.ipc.nmbclusters" to something larger than what you set > NFSRVCACHE_FLOODLEEVEL to.) >=20 > However, I suspect the 8084 LockOwners is a result of some other > problem. Fingers and toes crossed that it was a side effect of the > cache SMP bugs fixed by cache.patch. (noopen.patch won't help for this > case, because it appears to be lockowners and not openowners that are > holding the cached entries, but it won't do any harm, either.) >=20 > If you see very large LockOwner counts again, with the patched kernel, > all I can suggest is doing a packet capture and emailing it to me. > "tcpdump -s 0 -w xxx" run for a short enough time that "xxx" isn't > huge when run on the server might catch some issue (like the client > retrying a lock over and over and over again). A packet capture might > also show if the Ubuntu client is doing ReleaseLockOwner operations. > (Btw, you can look at the trace using wireshark, which knows about > NFSv4.) >=20 > In summary, It'll be interesting to see how this goes, rick > ps: Sorry about the long winded reply, but this is nfsv4 after all:-) >=20 > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Wed Jul 20 20:20:10 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 352AA1065674 for ; Wed, 20 Jul 2011 20:20:10 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id E6C748FC08 for ; Wed, 20 Jul 2011 20:20:09 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ap0EACE3J06DaFvO/2dsb2JhbABThEqkFLYQkQSBK4QDgQ8Ekm6IMIhJ X-IronPort-AV: E=Sophos;i="4.67,237,1309752000"; d="scan'208";a="131745726" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 20 Jul 2011 16:20:09 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 0E137B3F21; Wed, 20 Jul 2011 16:20:09 -0400 (EDT) Date: Wed, 20 Jul 2011 16:20:09 -0400 (EDT) From: Rick Macklem To: Zack Kirsch Message-ID: <1443258176.814644.1311193209047.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <1131900815.812133.1311191015011.JavaMail.root@erie.cs.uoguelph.ca> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: freebsd-fs@freebsd.org Subject: Re: nfsd server cache flooded, try to increase nfsrc_floodlevel X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Jul 2011 20:20:10 -0000 It's me again: > Zack Kirsch wrote: [good stuff snipped for brevity] > > > We've done a few things to combat this problem: > > 1) We increased the floodlevel to 65536. > > 2) We made the floodlevel configurable via sysctl. 
> I've thought that it would be nice to define this as a fraction of > what kern.ipc.nmbclusters is set to, but I haven't looked to see how > often an mbuf cluster ends up being a part of the cached reply. > > The 16K was just a very conservative # chosen when the server I did > load tests against had 512Mbytes of RAM. > > I think tying it to kern.ipc.nmbclusters (or directly to the machine's > RAM size or both??) would be nice. Having yet another tunable few > understand > (ie. making it a sysctl) seems a less desirable fallback plan? > I just did a quick test and it seems that the replies cached for these open_owners (and lock_owners too, I think) are usually just one mbuf, so cranking the flood level way up shouldn't be too bad. Can anyone suggest what would be an appropriate upper limit, given that each cached entry will use one small malloc'd data structure plus one mbuf (without a cluster)? rick From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 04:16:32 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6B853106564A for ; Thu, 21 Jul 2011 04:16:32 +0000 (UTC) (envelope-from freebsd@deman.com) Received: from plato.corp.nas.com (plato.corp.nas.com [66.114.32.138]) by mx1.freebsd.org (Postfix) with ESMTP id 28A3E8FC0C for ; Thu, 21 Jul 2011 04:16:31 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by plato.corp.nas.com (Postfix) with ESMTP id 55390E997E63 for ; Wed, 20 Jul 2011 20:59:35 -0700 (PDT) X-Virus-Scanned: amavisd-new at corp.nas.com Received: from plato.corp.nas.com ([127.0.0.1]) by localhost (plato.corp.nas.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id h-KVQWfdovw6 for ; Wed, 20 Jul 2011 20:59:34 -0700 (PDT) Received: from [192.168.1.118] (c-67-170-191-30.hsd1.wa.comcast.net [67.170.191.30]) by plato.corp.nas.com (Postfix) with ESMTPSA id D3A0DE997E58 for ; Wed, 20 Jul 2011 20:59:33 -0700 (PDT) From: Michael DeMan Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Date: Wed, 20 Jul 2011 20:59:32 -0700 Message-Id: <188D255F-B83A-4B9A-89AF-9BF58050F816@deman.com> To: freebsd-fs@freebsd.org Mime-Version: 1.0 (Apple Message framework v1084) X-Mailer: Apple Mail (2.1084) Subject: Marvell 88SX6081 timeouts, particularly when running 'zfs scrub' with regular I/O X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 04:16:32 -0000 Hi All, I've found a few posts around about this, but nothing conclusive. We have been getting hit on this with two... mvs0: port 0x9400-0x94ff mem = 0xfc400000-0xfc4fffff irq 28 at device 1.0 on pci1 mvs1: port 0x9800-0x98ff mem = 0xfc500000-0xfc5fffff irq 29 at device 3.0 on pci1 ...controllers. I went through and did a few things (an older Opteron 285 box) and = disabled super-pages and permutations on other device.hints, loader.conf = and live sysctl settings - all to no avail. I also found a few things via Google about being to patch from = 9-CURRENT, but the idea with this box was to be able to re-purpose some = older equipment for proof of concept using FreeNAS8. It is possible for me to build a version of that with the patches, etc - = but I figured it would be better to post to the list first and gather = feedback since this is pretty old/clunky hardware and newer patches may = or may not solve the problem. 
Thanks, - mike deman Jul 19 16:46:41 freenas kernel: mvsch11: Timeout on slot 0 Jul 19 16:46:41 freenas kernel: mvsch11: iec 02000000 sstat 00000123 = serr 00000000 edma_s 00001023 dma_c 00000000 dma_s 00000000 rs 00000201 = status 40 Jul 19 16:46:41 freenas kernel: mvsch11: ... waiting for slots 00000200 Jul 19 16:46:43 freenas kernel: mvsch4: Timeout on slot 4 Jul 19 16:46:43 freenas kernel: mvsch4: iec 02000000 sstat 00000123 serr = 00000000 edma_s 00001022 dma_c 00000000 dma_s 00000000 rs 00000010 = status 40 Jul 19 16:46:45 freenas kernel: mvsch1: Timeout on slot 0 Jul 19 16:46:45 freenas kernel: mvsch1: iec 02000000 sstat 00000123 serr = 00000000 edma_s 00001101 dma_c 00000000 dma_s 00000000 rs 00000001 = status 40 Jul 19 16:46:46 freenas kernel: mvsch11: Timeout on slot 9 Jul 19 16:46:46 freenas kernel: mvsch11: iec 02000000 sstat 00000123 = serr 00000000 edma_s 00001023 dma_c 00000000 dma_s 00000000 rs 00000201 = status 40 Jul 19 16:46:47 freenas root: ZFS: checksum mismatch, zpool=3Dzpmir1 = path=3D/dev/gpt/ada12 offset=3D52341838336 size=3D131072 From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 04:40:42 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0E20D106566C for ; Thu, 21 Jul 2011 04:40:42 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta01.emeryville.ca.mail.comcast.net (qmta01.emeryville.ca.mail.comcast.net [76.96.30.16]) by mx1.freebsd.org (Postfix) with ESMTP id EA1E88FC14 for ; Thu, 21 Jul 2011 04:40:41 +0000 (UTC) Received: from omta13.emeryville.ca.mail.comcast.net ([76.96.30.52]) by qmta01.emeryville.ca.mail.comcast.net with comcast id AUgb1h00517UAYkA1Ugedv; Thu, 21 Jul 2011 04:40:38 +0000 Received: from koitsu.dyndns.org ([67.180.84.87]) by omta13.emeryville.ca.mail.comcast.net with comcast id AUgf1h00X1t3BNj8ZUggCR; Thu, 21 Jul 2011 04:40:40 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 71BB1102C36; Wed, 20 Jul 2011 21:40:38 -0700 (PDT) Date: Wed, 20 Jul 2011 21:40:38 -0700 From: Jeremy Chadwick To: Michael DeMan Message-ID: <20110721044038.GA57436@icarus.home.lan> References: <188D255F-B83A-4B9A-89AF-9BF58050F816@deman.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <188D255F-B83A-4B9A-89AF-9BF58050F816@deman.com> User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org Subject: Re: Marvell 88SX6081 timeouts, particularly when running 'zfs scrub' with regular I/O X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 04:40:42 -0000 On Wed, Jul 20, 2011 at 08:59:32PM -0700, Michael DeMan wrote: > I've found a few posts around about this, but nothing conclusive. > > We have been getting hit on this with two... > mvs0: port 0x9400-0x94ff mem 0xfc400000-0xfc4fffff irq 28 at device 1.0 on pci1 > mvs1: port 0x9800-0x98ff mem 0xfc500000-0xfc5fffff irq 29 at device 3.0 on pci1 > ...controllers. > > I went through and did a few things (an older Opteron 285 box) and disabled super-pages and permutations on other device.hints, loader.conf and live sysctl settings - all to no avail. > > I also found a few things via Google about being to patch from 9-CURRENT, but the idea with this box was to be able to re-purpose some older equipment for proof of concept using FreeNAS8. 
> > It is possible for me to build a version of that with the patches, etc - but I figured it would be better to post to the list first and gather feedback since this is pretty old/clunky hardware and newer patches may or may not solve the problem. > > Thanks, > > - mike deman > > > > Jul 19 16:46:41 freenas kernel: mvsch11: Timeout on slot 0 > Jul 19 16:46:41 freenas kernel: mvsch11: iec 02000000 sstat 00000123 serr 00000000 edma_s 00001023 dma_c 00000000 dma_s 00000000 rs 00000201 status 40 > Jul 19 16:46:41 freenas kernel: mvsch11: ... waiting for slots 00000200 > Jul 19 16:46:43 freenas kernel: mvsch4: Timeout on slot 4 > Jul 19 16:46:43 freenas kernel: mvsch4: iec 02000000 sstat 00000123 serr 00000000 edma_s 00001022 dma_c 00000000 dma_s 00000000 rs 00000010 status 40 > Jul 19 16:46:45 freenas kernel: mvsch1: Timeout on slot 0 > Jul 19 16:46:45 freenas kernel: mvsch1: iec 02000000 sstat 00000123 serr 00000000 edma_s 00001101 dma_c 00000000 dma_s 00000000 rs 00000001 status 40 > Jul 19 16:46:46 freenas kernel: mvsch11: Timeout on slot 9 > Jul 19 16:46:46 freenas kernel: mvsch11: iec 02000000 sstat 00000123 serr 00000000 edma_s 00001023 dma_c 00000000 dma_s 00000000 rs 00000201 status 40 > Jul 19 16:46:47 freenas root: ZFS: checksum mismatch, zpool=zpmir1 path=/dev/gpt/ada12 offset=52341838336 size=131072 You didn't disclose what FreeBSD version you're running and what your kernel/world build date is. It matters greatly since it will then give us some idea what exact source revision you're using for the mvs(4) driver. "uname -a" output is sufficient. Thanks. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 15:46:38 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 556A71065675 for ; Thu, 21 Jul 2011 15:46:38 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from lo.gmane.org (lo.gmane.org [80.91.229.12]) by mx1.freebsd.org (Postfix) with ESMTP id DADF78FC23 for ; Thu, 21 Jul 2011 15:46:37 +0000 (UTC) Received: from list by lo.gmane.org with local (Exim 4.69) (envelope-from ) id 1QjvSK-0001Q4-Qn for freebsd-fs@freebsd.org; Thu, 21 Jul 2011 17:46:32 +0200 Received: from cpe-188-129-82-57.dynamic.amis.hr ([188.129.82.57]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Thu, 21 Jul 2011 17:46:32 +0200 Received: from ivoras by cpe-188-129-82-57.dynamic.amis.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Thu, 21 Jul 2011 17:46:32 +0200 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-fs@freebsd.org From: Ivan Voras Date: Thu, 21 Jul 2011 17:45:53 +0200 Lines: 31 Message-ID: Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@dough.gmane.org X-Gmane-NNTP-Posting-Host: cpe-188-129-82-57.dynamic.amis.hr User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-GB; rv:1.9.2.18) Gecko/20110616 Thunderbird/3.1.11 Subject: ZFS and large directories - caveat report X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 15:46:38 -0000 I'm writing this mostly for future reference / archiving and 
also if someone has an idea on how to improve the situation. A web server I maintain was hit by DoS, which has caused more than 4 million PHP session files to be created. The session files are sharded in 32 directories in a single level - which is normally more than enough for this web server as the number of users is only a couple of thousand. With the DoS, the number of files per shard directory rose to about 130,000. The problem is: ZFS has proven horribly inefficient with such large directories. I have other, more loaded servers with simlarly bad / large directories on UFS where the problem is not nearly as serious as here (probably due to the large dirhash). On this system, any operation which touches even only the parent of these 32 shards (e.g. "ls") takes seconds, and a simple "find | wc -l" on one of the shards takes > 30 minutes (I stopped it after 30 minutes). Another symptom is that SIGINT-ing such find process takes 10-15 seconds to complete (sic! this likely means the kernel operation cannot be interrupted for so long). This wouldn't be a problem by itself, but operations on such directories eat IOPS - clearly visible with the "find" test case, making the rest of the services on the server fall as collateral damage. Apparently there is a huge amount of seeking being done, even though I would think that for read operations all the data would be cached - and somehow the seeking from this operation takes priority / livelocks other operations on the same ZFS pool. This is on a fresh 8-STABLE AMD64, pool version 28 and zfs version 5. Is there an equivalent of UFS dirhash memory setting for ZFS? (i.e. the size of the metadata cache) From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 15:50:35 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E52CD1065676; Thu, 21 Jul 2011 15:50:35 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id BDC1E8FC14; Thu, 21 Jul 2011 15:50:35 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p6LFoZEg013066; Thu, 21 Jul 2011 15:50:35 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p6LFoZQE013057; Thu, 21 Jul 2011 15:50:35 GMT (envelope-from linimon) Date: Thu, 21 Jul 2011 15:50:35 GMT Message-Id: <201107211550.p6LFoZQE013057@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/159077: [zfs] Can't cd .. with latest zfs version X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 15:50:36 -0000 Old Synopsis: Can't cd .. with latest zfs version New Synopsis: [zfs] Can't cd .. with latest zfs version Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Thu Jul 21 15:50:23 UTC 2011 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=159077 From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 15:50:56 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D5943106564A for ; Thu, 21 Jul 2011 15:50:56 +0000 (UTC) (envelope-from fjwcash@gmail.com) Received: from mail-yx0-f182.google.com (mail-yx0-f182.google.com [209.85.213.182]) by mx1.freebsd.org (Postfix) with ESMTP id 82C398FC1D for ; Thu, 21 Jul 2011 15:50:56 +0000 (UTC) Received: by yxl31 with SMTP id 31so843477yxl.13 for ; Thu, 21 Jul 2011 08:50:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=I5raG5x8ED7OmpPEYjFfNM9o2ixN8Jwiffg61/ZxMgI=; b=iIldFMCNRxTHL3cjade3geHv+TuhK3Ym3nnopwN+9PzFLJg4MEZt6PTqXW90sunkIe JxqGlOMrb5tRu3XATFvCQBvlNcGdkVdXuO58JY8VyFpTIEoJh+S5jAFQlMbNZxSS3qMp SUrnqNzI7ErERx5YWHxQVxPCQ68h/iEwdi5Rk= MIME-Version: 1.0 Received: by 10.91.50.32 with SMTP id c32mr837295agk.98.1311263455676; Thu, 21 Jul 2011 08:50:55 -0700 (PDT) Received: by 10.90.209.12 with HTTP; Thu, 21 Jul 2011 08:50:55 -0700 (PDT) In-Reply-To: References: Date: Thu, 21 Jul 2011 08:50:55 -0700 Message-ID: From: Freddie Cash To: Ivan Voras Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and large directories - caveat report X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 15:50:56 -0000 On Thu, Jul 21, 2011 at 8:45 AM, Ivan Voras wrote: > Is there an equivalent of UFS dirhash memory setting for ZFS? (i.e. the > size of the metadata cache) > vfs.zfs.arc_meta_limit This sets the amount of ARC that can be used for metadata. The default is 1/8th of ARC, I believe. This setting lets you use "primarycache=all" (store metadata and file data in ARC) but then tune how much is used for each. Not sure if that will help in your case or not, but it's a sysctl you can play with. 
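For anyone who wants to experiment with it, a minimal sketch follows. The 600 MB figure is purely illustrative (roughly double the limit Ivan reports below), and on this vintage of 8-STABLE the knob is a boot-time tunable, so it normally has to go into loader.conf rather than being changed on a running system:

  # check the current metadata limit and how much is in use
  sysctl vfs.zfs.arc_meta_limit vfs.zfs.arc_meta_used

  # raise the limit at the next boot: add to /boot/loader.conf (value in bytes)
  vfs.zfs.arc_meta_limit="644825600"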
-- Freddie Cash fjwcash@gmail.com From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 17:03:54 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 838D2106566B for ; Thu, 21 Jul 2011 17:03:54 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: from mail-yw0-f54.google.com (mail-yw0-f54.google.com [209.85.213.54]) by mx1.freebsd.org (Postfix) with ESMTP id 427778FC12 for ; Thu, 21 Jul 2011 17:03:54 +0000 (UTC) Received: by ywf7 with SMTP id 7so884757ywf.13 for ; Thu, 21 Jul 2011 10:03:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type :content-transfer-encoding; bh=4B19JgmgUjK/pawyQ0hzh/7ANZ3Njt1avoZPtRu3Bao=; b=Knh72L+ygFvhthwQamjBYGSAzIB2Y+fbwJbIB97q7xQv/X/gNYGGtKp2AH5d8km8wo eYcnyGoGcSMYggliqylsn6HS2JqSORS3ytLvt13+6a1WR1BSOXu9eqbb/UA1umPFgqSI 7kHGK3jXuknYtX/t9NeqHGJbzd7k2zfDLh0hE= Received: by 10.101.180.22 with SMTP id h22mr491681anp.149.1311266323120; Thu, 21 Jul 2011 09:38:43 -0700 (PDT) MIME-Version: 1.0 Sender: ivoras@gmail.com Received: by 10.100.198.5 with HTTP; Thu, 21 Jul 2011 09:38:02 -0700 (PDT) In-Reply-To: References: From: Ivan Voras Date: Thu, 21 Jul 2011 18:38:02 +0200 X-Google-Sender-Auth: RIvcVctju_TcNfF67mhlqJX8VOs Message-ID: To: Freddie Cash Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and large directories - caveat report X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 17:03:54 -0000 On 21 July 2011 17:50, Freddie Cash wrote: > On Thu, Jul 21, 2011 at 8:45 AM, Ivan Voras wrote: >> >> Is there an equivalent of UFS dirhash memory setting for ZFS? (i.e. the >> size of the metadata cache) > > vfs.zfs.arc_meta_limit > > This sets the amount of ARC that can be used for metadata.=C2=A0 The defa= ult is > 1/8th of ARC, I believe.=C2=A0 This setting lets you use "primarycache=3D= all" > (store metadata and file data in ARC) but then tune how much is used for > each. > > Not sure if that will help in your case or not, but it's a sysctl you can > play with. I don't think that it works, or at least is not as efficient as dirhash: www:~> sysctl -a | grep meta kern.metadelay: 28 vfs.zfs.mfu_ghost_metadata_lsize: 129082368 vfs.zfs.mfu_metadata_lsize: 116224 vfs.zfs.mru_ghost_metadata_lsize: 113958912 vfs.zfs.mru_metadata_lsize: 16384 vfs.zfs.anon_metadata_lsize: 0 vfs.zfs.arc_meta_limit: 322412800 vfs.zfs.arc_meta_used: 506907792 kstat.zfs.misc.arcstats.demand_metadata_hits: 4471705 kstat.zfs.misc.arcstats.demand_metadata_misses: 2110328 kstat.zfs.misc.arcstats.prefetch_metadata_hits: 27 kstat.zfs.misc.arcstats.prefetch_metadata_misses: 51 arc_meta_used is nearly 500 MB which should be enough even in this case. With filenames of 32 characters, all the filenames alone for 130,000 files in a directory take about 4 MB - I doubt the ZFS introduces so much extra metadata it doesn't fit in 500 MB. I am now deleting the session files, and I hope it will not take days to complete... 
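(Converting the two counters above to base-2 megabytes for comparison: arc_meta_limit = 322412800 / 1048576 ~= 307 MB, while arc_meta_used = 506907792 / 1048576 ~= 483 MB, so metadata usage is already well past the configured limit; whether raising that limit would help here is an open question.)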
From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 17:08:04 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7F27A106564A; Thu, 21 Jul 2011 17:08:04 +0000 (UTC) (envelope-from lists.br@gmail.com) Received: from mail-gy0-f182.google.com (mail-gy0-f182.google.com [209.85.160.182]) by mx1.freebsd.org (Postfix) with ESMTP id 2C0E88FC12; Thu, 21 Jul 2011 17:08:03 +0000 (UTC) Received: by gyf3 with SMTP id 3so887971gyf.13 for ; Thu, 21 Jul 2011 10:08:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=subject:mime-version:content-type:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to:x-mailer; bh=r3C/AxEMfCyBD8s47doW/3AodVlSSDCv8GFyeGz5Z38=; b=nzAKuk0I1gygzi2wvb0UZs/V0ad9rlKPyk0ZtDXx1O6BfCi46nmTAxAG7Da7wuzhKB 1eBpJuJUw8/nrcpkFsfzNQPQUBHfsCHzN/dlxOt0OBY7s0Ld8sqFFgzZOCOhC4tbRFFB NOALdlqpAmN1so4NDUwYw+/7N+bNnTbnpA7gU= Received: by 10.236.76.169 with SMTP id b29mr665848yhe.474.1311266333180; Thu, 21 Jul 2011 09:38:53 -0700 (PDT) Received: from [192.168.0.53] ([187.120.139.136]) by mx.google.com with ESMTPS id v4sm1270544yhm.48.2011.07.21.09.38.51 (version=TLSv1/SSLv3 cipher=OTHER); Thu, 21 Jul 2011 09:38:52 -0700 (PDT) Mime-Version: 1.0 (Apple Message framework v1084) Content-Type: text/plain; charset=us-ascii From: Luiz Otavio O Souza In-Reply-To: Date: Thu, 21 Jul 2011 13:38:50 -0300 Content-Transfer-Encoding: quoted-printable Message-Id: <13577F3E-DE59-44F4-98F7-9587E26499B8@gmail.com> References: To: Ivan Voras X-Mailer: Apple Mail (2.1084) Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and large directories - caveat report X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 17:08:04 -0000 On Jul 21, 2011, at 12:45 PM, Ivan Voras wrote: > I'm writing this mostly for future reference / archiving and also if = someone has an idea on how to improve the situation. >=20 > A web server I maintain was hit by DoS, which has caused more than 4 = million PHP session files to be created. The session files are sharded = in 32 directories in a single level - which is normally more than enough = for this web server as the number of users is only a couple of thousand. = With the DoS, the number of files per shard directory rose to about = 130,000. >=20 > The problem is: ZFS has proven horribly inefficient with such large = directories. I have other, more loaded servers with simlarly bad / large = directories on UFS where the problem is not nearly as serious as here = (probably due to the large dirhash). On this system, any operation which = touches even only the parent of these 32 shards (e.g. "ls") takes = seconds, and a simple "find | wc -l" on one of the shards takes > 30 = minutes (I stopped it after 30 minutes). Another symptom is that = SIGINT-ing such find process takes 10-15 seconds to complete (sic! this = likely means the kernel operation cannot be interrupted for so long). >=20 > This wouldn't be a problem by itself, but operations on such = directories eat IOPS - clearly visible with the "find" test case, making = the rest of the services on the server fall as collateral damage. 
= Apparently there is a huge amount of seeking being done, even though I = would think that for read operations all the data would be cached - and = somehow the seeking from this operation takes priority / livelocks other = operations on the same ZFS pool. >=20 > This is on a fresh 8-STABLE AMD64, pool version 28 and zfs version 5. >=20 > Is there an equivalent of UFS dirhash memory setting for ZFS? (i.e. = the size of the metadata cache) Hello Ivan, I've some kind of similar problems on a client that needs to store a = large amount of files. I have 4.194.303 (0x3fffff) files created on FS (unused files are = already created with zero size - this was a precaution from the UFS = times to avoid the 'no more free inodes on FS'). And I just break the files like mybasedir/3f/ff/ff, so under no = circumstance i have a 'big amount of files' in a single directory. The general usage on this server is fine, but the periodic (daily) = scripts take almost a day to complete and the server is slow as hell = while the daily scripts are running. All i need to do is kill 'find' to get the machine back to 'normal'. I did not stopped to look at it in detail, but the little bit i checked, = looks like the stat() calls takes a long time on ZFS files. Previously, we'd this running on UFS with a database of 16.777.215 = (0xffffff) files without any kind of trouble (i've reduced the database = size to keep the daily scripts run time under control). The periodic script is simply doing its job of verifying setuid files = (and comparing the list with the previous one). So, yes, i can confirm that running 'find' on a ZFS FS with a lot of = files is very, very slow (and looks like it isn't related to how the = files are distributed on the FS). But sorry, no idea about how to improve that situation (yet). 
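One workaround worth trying, as a sketch only and assuming it is the stock periodic(8) scripts doing the nightly tree walk, is to disable the find-heavy jobs in /etc/periodic.conf:

  # /etc/periodic.conf
  # skip the nightly setuid-file scan that walks the whole filesystem
  daily_status_security_chksetuid_enable="NO"
  # the weekly locate(1) database rebuild does a similar full walk
  weekly_locate_enable="NO"

The trade-off is losing those reports, so they should be re-enabled once the underlying slowness is understood.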
Regards, Luiz From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 17:18:48 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 299141065672 for ; Thu, 21 Jul 2011 17:18:48 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: from mail-gx0-f182.google.com (mail-gx0-f182.google.com [209.85.161.182]) by mx1.freebsd.org (Postfix) with ESMTP id DE39D8FC22 for ; Thu, 21 Jul 2011 17:18:47 +0000 (UTC) Received: by gxk28 with SMTP id 28so891383gxk.13 for ; Thu, 21 Jul 2011 10:18:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type; bh=KpXJU3JUhTGiHgGaDPpw5vgK4sTDCOluqPWxqRyGeRY=; b=D/Dyehy43KR2RIwC/BmrYQG0wF+h/TSWsGvn/2zAC8a9iALXi7PDCWQmLQHul1Hht5 viZLUtRWi2aUf76goAr//dWLsynjak7hwfTb9rH9b3gUq9cpzDuvM4KW5+2hyIVb4ECc 9mHaJOLJFJQsS+1xAw4wFT7dNISCyo1qz0KjE= Received: by 10.101.196.22 with SMTP id y22mr599177anp.17.1311268727126; Thu, 21 Jul 2011 10:18:47 -0700 (PDT) MIME-Version: 1.0 Sender: ivoras@gmail.com Received: by 10.100.198.5 with HTTP; Thu, 21 Jul 2011 10:18:07 -0700 (PDT) In-Reply-To: <13577F3E-DE59-44F4-98F7-9587E26499B8@gmail.com> References: <13577F3E-DE59-44F4-98F7-9587E26499B8@gmail.com> From: Ivan Voras Date: Thu, 21 Jul 2011 19:18:07 +0200 X-Google-Sender-Auth: Fr0nNfqsxEXb_7al86kqBT7k9Ho Message-ID: To: Luiz Otavio O Souza Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and large directories - caveat report X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 17:18:48 -0000 On 21 July 2011 18:38, Luiz Otavio O Souza wrote: > The general usage on this server is fine, but the periodic (daily) scripts take almost a day to complete and the server is slow as hell while the daily scripts are running. Yes, this is how my problem was first diagnosed. > So, yes, i can confirm that running 'find' on a ZFS FS with a lot of files is very, very slow (and looks like it isn't related to how the files are distributed on the FS). Only it's not just "find" - it's any directory operations - including file creation and removal. I cannot say that is not related to how files are distributed on the file system, except the unusually long operations on the parent of the shard directories in my case. From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 17:26:10 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 99AA81065670 for ; Thu, 21 Jul 2011 17:26:10 +0000 (UTC) (envelope-from universite@ukr.net) Received: from otrada.od.ua (universite-1-pt.tunnel.tserv24.sto1.ipv6.he.net [IPv6:2001:470:27:140::2]) by mx1.freebsd.org (Postfix) with ESMTP id 08B548FC0C for ; Thu, 21 Jul 2011 17:26:09 +0000 (UTC) Received: from [IPv6:2001:470:28:140:11c1:1016:bdbf:959f] ([IPv6:2001:470:28:140:11c1:1016:bdbf:959f]) (authenticated bits=0) by otrada.od.ua (8.14.4/8.14.4) with ESMTP id p6LHQ5om089512 for ; Thu, 21 Jul 2011 20:26:05 +0300 (EEST) (envelope-from universite@ukr.net) Message-ID: <4E28611B.2080707@ukr.net> Date: Thu, 21 Jul 2011 20:25:47 +0300 From: "Vladislav V. 
Prodan" User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; ru; rv:1.9.2.18) Gecko/20110616 Thunderbird/3.1.11 MIME-Version: 1.0 To: freebsd-fs@freebsd.org References: In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-95.5 required=5.0 tests=FREEMAIL_FROM,FSL_RU_URL, RDNS_NONE, SPF_SOFTFAIL, T_TO_NO_BRKTS_FREEMAIL, USER_IN_WHITELIST autolearn=no version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mary-teresa.otrada.od.ua X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.2.7 (otrada.od.ua [IPv6:2001:470:28:140::5]); Thu, 21 Jul 2011 20:26:08 +0300 (EEST) Subject: Re: ZFS and large directories - caveat report X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 17:26:10 -0000 21.07.2011 19:38, Ivan Voras wrote: > vfs.zfs.arc_meta_limit: 322412800 > vfs.zfs.arc_meta_used: 506907792 Something values ​​are too small. You can specify the size of RAM? If simple, show: zpool status cat /etc/sysctl.conf | grep -v ^$ | grep -v ^# cat /boot/loader.conf | grep -v ^$ | grep -v ^# -- Vladislav V. Prodan VVP24-UANIC +380[67]4584408 +380[99]4060508 vlad11@jabber.ru From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 18:15:52 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5C7D1106564A for ; Thu, 21 Jul 2011 18:15:52 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-ww0-f50.google.com (mail-ww0-f50.google.com [74.125.82.50]) by mx1.freebsd.org (Postfix) with ESMTP id BC0148FC1C for ; Thu, 21 Jul 2011 18:15:51 +0000 (UTC) Received: by wwe6 with SMTP id 6so1482282wwe.31 for ; Thu, 21 Jul 2011 11:15:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=IbOC08m+y2hdxAa+gVdRGTA0r6CULEICm7Owwlfw6w0=; b=Ic9r9LyC3AwZ2BRonxU+zejhITNULaAy3RkWNurarQ0YfYvS1jxy/XjZhkMXHOdsD3 4FneMX/Y0811YUqy11VrGHt6eI28Uvvr0BnqL4ej8b5l7CgSDUFY/Z4yzXbQiphpqLN7 GjofYKRBc0uv59DQZ7EI2HBxOt8J5y4OxUUWA= MIME-Version: 1.0 Received: by 10.216.61.198 with SMTP id w48mr1057479wec.40.1311272150465; Thu, 21 Jul 2011 11:15:50 -0700 (PDT) Sender: artemb@gmail.com Received: by 10.216.46.18 with HTTP; Thu, 21 Jul 2011 11:15:50 -0700 (PDT) In-Reply-To: References: Date: Thu, 21 Jul 2011 11:15:50 -0700 X-Google-Sender-Auth: uF_yU18ClNLbaK62epFGPiz85yc Message-ID: From: Artem Belevich To: Ivan Voras Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and large directories - caveat report X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 18:15:52 -0000 On Thu, Jul 21, 2011 at 9:38 AM, Ivan Voras wrote: > On 21 July 2011 17:50, Freddie Cash wrote: >> On Thu, Jul 21, 2011 at 8:45 AM, Ivan Voras wrote: >>> >>> Is there an equivalent of UFS dirhash memory setting for ZFS? (i.e. 
the >>> size of the metadata cache) >> >> vfs.zfs.arc_meta_limit >> >> This sets the amount of ARC that can be used for metadata.=A0 The defaul= t is >> 1/8th of ARC, I believe.=A0 This setting lets you use "primarycache=3Dal= l" >> (store metadata and file data in ARC) but then tune how much is used for >> each. >> >> Not sure if that will help in your case or not, but it's a sysctl you ca= n >> play with. > > I don't think that it works, or at least is not as efficient as dirhash: > > www:~> sysctl -a | grep meta > kern.metadelay: 28 > vfs.zfs.mfu_ghost_metadata_lsize: 129082368 > vfs.zfs.mfu_metadata_lsize: 116224 > vfs.zfs.mru_ghost_metadata_lsize: 113958912 > vfs.zfs.mru_metadata_lsize: 16384 > vfs.zfs.anon_metadata_lsize: 0 > vfs.zfs.arc_meta_limit: 322412800 > vfs.zfs.arc_meta_used: 506907792 > kstat.zfs.misc.arcstats.demand_metadata_hits: 4471705 > kstat.zfs.misc.arcstats.demand_metadata_misses: 2110328 > kstat.zfs.misc.arcstats.prefetch_metadata_hits: 27 > kstat.zfs.misc.arcstats.prefetch_metadata_misses: 51 > > arc_meta_used is nearly 500 MB which should be enough even in this > case. With filenames of 32 characters, all the filenames alone for > 130,000 files in a directory take about 4 MB - I doubt the ZFS > introduces so much extra metadata it doesn't fit in 500 MB. For what it's worth, 500K files in one directory seems to work reasonably well on my box running few weeks old 8-stable (quad core 8GB RAM, ~6GB ARC), ZFSv28 pool on a 2-drive mirror + 50GB L2ARC. $ time perl -e 'use Fcntl; for $f (1..500000) {sysopen(FH,"f$f",O_CREAT); close(FH);}' perl -e >| /dev/null 2.26s user 39.17s system 96% cpu 43.156 total $ time find . |wc -l 500001 find . 0.16s user 0.33s system 99% cpu 0.494 total $ time find . -ls |wc -l 500001 find . -ls 1.93s user 12.13s system 96% cpu 14.643 total time find . |xargs -n 100 rm find . 0.22s user 0.28s system 0% cpu 2:45.12 total xargs -n 100 rm 1.25s user 58.51s system 36% cpu 2:45.61 total Deleting files resulted in a constant stream of writes to hard drives. I guess file deletion may end up up being a synchronous write committed to ZIL right away. If that's indeed the case, small slog on SSD could probably speed up file deletion a bit. --Artem > > I am now deleting the session files, and I hope it will not take days > to complete... 
> _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 18:25:25 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 6A5361065670; Thu, 21 Jul 2011 18:25:25 +0000 (UTC) (envelope-from mm@FreeBSD.org) Received: from mail.vx.sk (mail.vx.sk [IPv6:2a01:4f8:100:1043::3]) by mx1.freebsd.org (Postfix) with ESMTP id A42648FC15; Thu, 21 Jul 2011 18:25:24 +0000 (UTC) Received: from core.vx.sk (localhost [127.0.0.1]) by mail.vx.sk (Postfix) with ESMTP id 53A7115E8B5; Thu, 21 Jul 2011 20:25:23 +0200 (CEST) X-Virus-Scanned: amavisd-new at mail.vx.sk Received: from mail.vx.sk ([127.0.0.1]) by core.vx.sk (mail.vx.sk [127.0.0.1]) (amavisd-new, port 10024) with LMTP id sEjGCTkL5p+i; Thu, 21 Jul 2011 20:25:20 +0200 (CEST) Received: from [10.9.8.3] (chello085216231078.chello.sk [85.216.231.78]) by mail.vx.sk (Postfix) with ESMTPSA id E0F7F15E8A5; Thu, 21 Jul 2011 20:25:19 +0200 (CEST) Message-ID: <4E286F1F.6010502@FreeBSD.org> Date: Thu, 21 Jul 2011 20:25:35 +0200 From: Martin Matuska User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:5.0) Gecko/20110624 Thunderbird/5.0 MIME-Version: 1.0 To: Ivan Voras References: In-Reply-To: X-Enigmail-Version: 1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and large directories - caveat report X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 18:25:25 -0000 Quoting: ... The default record size ZFS utilizes is 128K, which is good for many storage servers that will harbor larger files. However, when dealing with many files that are only a matter of tens of kilobytes, or even bytes, considerable slowdown will result. ZFS can easily alter the record size of the data to be written through the use of attributes. These attributes can be set at any time through the use of the "zfs set" command. To set the record size attribute perform "zfs set recordsize=32K pool/share". This will set the recordsize to 32K on share "share" within pool "pool". This type of functionality can even be implemented on nested shares for even more flexibility. ... Read more: http://www.articlesbase.com/information-technology-articles/improving-file-system-performance-utilizing-dynamic-record-sizes-in-zfs-4565092.html#ixzz1SlWZ7BM5 Dn(a 21. 7. 2011 17:45, Ivan Voras wrote / napísal(a): > I'm writing this mostly for future reference / archiving and also if > someone has an idea on how to improve the situation. > > A web server I maintain was hit by DoS, which has caused more than 4 > million PHP session files to be created. The session files are sharded > in 32 directories in a single level - which is normally more than > enough for this web server as the number of users is only a couple of > thousand. With the DoS, the number of files per shard directory rose > to about 130,000. > > The problem is: ZFS has proven horribly inefficient with such large > directories. 
I have other, more loaded servers with simlarly bad / > large directories on UFS where the problem is not nearly as serious as > here (probably due to the large dirhash). On this system, any > operation which touches even only the parent of these 32 shards (e.g. > "ls") takes seconds, and a simple "find | wc -l" on one of the shards > takes > 30 minutes (I stopped it after 30 minutes). Another symptom is > that SIGINT-ing such find process takes 10-15 seconds to complete > (sic! this likely means the kernel operation cannot be interrupted for > so long). > > This wouldn't be a problem by itself, but operations on such > directories eat IOPS - clearly visible with the "find" test case, > making the rest of the services on the server fall as collateral > damage. Apparently there is a huge amount of seeking being done, even > though I would think that for read operations all the data would be > cached - and somehow the seeking from this operation takes priority / > livelocks other operations on the same ZFS pool. > > This is on a fresh 8-STABLE AMD64, pool version 28 and zfs version 5. > > Is there an equivalent of UFS dirhash memory setting for ZFS? (i.e. > the size of the metadata cache) > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" -- Martin Matuska FreeBSD committer http://blog.vx.sk From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 18:32:24 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B9B071065676; Thu, 21 Jul 2011 18:32:24 +0000 (UTC) (envelope-from fjwcash@gmail.com) Received: from mail-gx0-f182.google.com (mail-gx0-f182.google.com [209.85.161.182]) by mx1.freebsd.org (Postfix) with ESMTP id 65D2E8FC14; Thu, 21 Jul 2011 18:32:24 +0000 (UTC) Received: by gxk28 with SMTP id 28so929957gxk.13 for ; Thu, 21 Jul 2011 11:32:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=8f7SvxYEU30L/Y8mJ2s3mTr0Zm7KK0+RN9M5kB//888=; b=xk3tZWpUpmuCEvIDC3JSEOmhSb1CEtyQ9g7GY99dVu6gNXzx7v5cbZK0dH2ApUV3vU RP8aV3obM4Mtmek0DmJRMVrzCE72mQUAfCHcAC6RbK7wVgXJP+nMnyFQqshElWPrVM4j jC6bvj3OqCYXWbtkcVjtdNRvznUd1h1ozbzQQ= MIME-Version: 1.0 Received: by 10.91.10.8 with SMTP id n8mr1062153agi.17.1311273143326; Thu, 21 Jul 2011 11:32:23 -0700 (PDT) Received: by 10.90.209.12 with HTTP; Thu, 21 Jul 2011 11:32:23 -0700 (PDT) In-Reply-To: <4E286F1F.6010502@FreeBSD.org> References: <4E286F1F.6010502@FreeBSD.org> Date: Thu, 21 Jul 2011 11:32:23 -0700 Message-ID: From: Freddie Cash To: Martin Matuska Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org, Ivan Voras Subject: Re: ZFS and large directories - caveat report X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 18:32:24 -0000 On Thu, Jul 21, 2011 at 11:25 AM, Martin Matuska wrote: > Quoting: > ... The default record size ZFS utilizes is 128K, which is good for many > storage servers that will harbor larger files. 
However, when dealing > with many files that are only a matter of tens of kilobytes, or even > bytes, considerable slowdown will result. ZFS can easily alter the > record size of the data to be written through the use of attributes. > These attributes can be set at any time through the use of the "zfs set" > command. To set the record size attribute perform "zfs set > recordsize=32K pool/share". This will set the recordsize to 32K on share > "share" within pool "pool". This type of functionality can even be > implemented on nested shares for even more flexibility. ... > > The recordsize property in ZFS is the "max" block size used. It is not the only block size used for a dataset. ZFS will use any block size from 0.5 KB to $recordsize KB, as determined by the size of the file to be written (it tries to the find the recordsize that most closely matches the file size to use the least number of blocks per write). It's only on ZVols that the recordsize==the block size, and all writes are fixed in size. Have a look through "zdb -dd poolname" to see the spread of block sizes in the pool. -- Freddie Cash fjwcash@gmail.com From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 19:21:25 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4F735106564A; Thu, 21 Jul 2011 19:21:25 +0000 (UTC) (envelope-from dpd@bitgravity.com) Received: from mail1.sjc1.bitgravity.com (mail1.sjc1.bitgravity.com [209.131.97.19]) by mx1.freebsd.org (Postfix) with ESMTP id 322C18FC08; Thu, 21 Jul 2011 19:21:25 +0000 (UTC) Received: from mail-pz0-f52.google.com ([209.85.210.52]) by mail1.sjc1.bitgravity.com with esmtps (TLSv1:RC4-SHA:128) (Exim 4.69 (FreeBSD)) (envelope-from ) id 1QjyIG-000HLX-SY; Thu, 21 Jul 2011 11:48:20 -0700 Received: by pzd13 with SMTP id 13so2062740pzd.25 for ; Thu, 21 Jul 2011 11:48:15 -0700 (PDT) Received: by 10.68.12.133 with SMTP id y5mr808908pbb.104.1311274095244; Thu, 21 Jul 2011 11:48:15 -0700 (PDT) Received: from netops-173.sfo1.bitgravity.com (netops-173.sfo1.bitgravity.com [209.131.110.173]) by mx.google.com with ESMTPS id d3sm1061911pbh.85.2011.07.21.11.48.13 (version=TLSv1/SSLv3 cipher=OTHER); Thu, 21 Jul 2011 11:48:14 -0700 (PDT) Mime-Version: 1.0 (Apple Message framework v1084) Content-Type: text/plain; charset=us-ascii From: David P Discher In-Reply-To: Date: Thu, 21 Jul 2011 11:48:12 -0700 Content-Transfer-Encoding: quoted-printable Message-Id: References: <13577F3E-DE59-44F4-98F7-9587E26499B8@gmail.com> To: Ivan Voras X-Mailer: Apple Mail (2.1084) Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and large directories - caveat report X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 19:21:25 -0000 Ivan - What's your uptime ? Are you using l2 arc ? what is the value of 'sysctl kstat.zfs.misc.arcstats.evict_skip' ? is this increasing quickly ? How much cpu are the 'arc_reclaim_thread' and 'l2arc_feed_thread' taking = up ? top -SHb 500 | grep arc --- David P. Discher dpd@bitgravity.com * AIM: bgDavidDPD BITGRAVITY * http://www.bitgravity.com On Jul 21, 2011, at 10:18 AM, Ivan Voras wrote: > On 21 July 2011 18:38, Luiz Otavio O Souza wrote: >=20 >> The general usage on this server is fine, but the periodic (daily) = scripts take almost a day to complete and the server is slow as hell = while the daily scripts are running. 
>=20 > Yes, this is how my problem was first diagnosed. >=20 >> So, yes, i can confirm that running 'find' on a ZFS FS with a lot of = files is very, very slow (and looks like it isn't related to how the = files are distributed on the FS). >=20 > Only it's not just "find" - it's any directory operations - including > file creation and removal. I cannot say that is not related to how > files are distributed on the file system, except the unusually long > operations on the parent of the shard directories in my case. > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 19:29:49 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5A791106566C; Thu, 21 Jul 2011 19:29:49 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: from mail-gy0-f182.google.com (mail-gy0-f182.google.com [209.85.160.182]) by mx1.freebsd.org (Postfix) with ESMTP id 03E7E8FC13; Thu, 21 Jul 2011 19:29:48 +0000 (UTC) Received: by gyf3 with SMTP id 3so963204gyf.13 for ; Thu, 21 Jul 2011 12:29:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type :content-transfer-encoding; bh=BI6C32Gpfx8Up/Tp5DkW5eInHWp1lU9+ZobRBCHWxho=; b=aoalgSvmk5Uru06UJl2QS0j/Xd3rPfhvyGwZvYRYfk/0mBXgt2fh3k/kyXVJTvJGaP pluKoy/40au0cFgwYkCTrbGUn3HztUBjENoWdR7HM9LqI7/Elf0Sw0Zo5UY2KuZENyyN LaUW+NSdfSBy9PlzltsB2E5uY5s5W8zdK7dl4= Received: by 10.101.18.6 with SMTP id v6mr732356ani.39.1311276588174; Thu, 21 Jul 2011 12:29:48 -0700 (PDT) MIME-Version: 1.0 Sender: ivoras@gmail.com Received: by 10.100.198.5 with HTTP; Thu, 21 Jul 2011 12:29:07 -0700 (PDT) In-Reply-To: References: From: Ivan Voras Date: Thu, 21 Jul 2011 21:29:07 +0200 X-Google-Sender-Auth: BuLkf_WYk6p9AzB33n_bWy3pdgE Message-ID: To: Artem Belevich Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and large directories - caveat report X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 19:29:49 -0000 On 21 July 2011 20:15, Artem Belevich wrote: > On Thu, Jul 21, 2011 at 9:38 AM, Ivan Voras wrote: >> On 21 July 2011 17:50, Freddie Cash wrote: >>> On Thu, Jul 21, 2011 at 8:45 AM, Ivan Voras wrote: >>>> >>>> Is there an equivalent of UFS dirhash memory setting for ZFS? (i.e. th= e >>>> size of the metadata cache) >>> >>> vfs.zfs.arc_meta_limit >>> >>> This sets the amount of ARC that can be used for metadata.=C2=A0 The de= fault is >>> 1/8th of ARC, I believe.=C2=A0 This setting lets you use "primarycache= =3Dall" >>> (store metadata and file data in ARC) but then tune how much is used fo= r >>> each. >>> >>> Not sure if that will help in your case or not, but it's a sysctl you c= an >>> play with. 
>> >> I don't think that it works, or at least is not as efficient as dirhash: >> >> www:~> sysctl -a | grep meta >> kern.metadelay: 28 >> vfs.zfs.mfu_ghost_metadata_lsize: 129082368 >> vfs.zfs.mfu_metadata_lsize: 116224 >> vfs.zfs.mru_ghost_metadata_lsize: 113958912 >> vfs.zfs.mru_metadata_lsize: 16384 >> vfs.zfs.anon_metadata_lsize: 0 >> vfs.zfs.arc_meta_limit: 322412800 >> vfs.zfs.arc_meta_used: 506907792 >> kstat.zfs.misc.arcstats.demand_metadata_hits: 4471705 >> kstat.zfs.misc.arcstats.demand_metadata_misses: 2110328 >> kstat.zfs.misc.arcstats.prefetch_metadata_hits: 27 >> kstat.zfs.misc.arcstats.prefetch_metadata_misses: 51 >> >> arc_meta_used is nearly 500 MB which should be enough even in this >> case. With filenames of 32 characters, all the filenames alone for >> 130,000 files in a directory take about 4 MB - I doubt the ZFS >> introduces so much extra metadata it doesn't fit in 500 MB. > > For what it's worth, 500K files in one directory seems to work > reasonably well on my box running few weeks old 8-stable (quad core > 8GB RAM, ~6GB ARC), ZFSv28 pool on a 2-drive mirror + 50GB L2ARC. > > $ time perl -e 'use Fcntl; for $f =C2=A0(1..500000) > {sysopen(FH,"f$f",O_CREAT); close(FH);}' > perl -e =C2=A0>| /dev/null =C2=A02.26s user 39.17s system 96% cpu 43.156 = total > > $ time find . |wc -l > =C2=A0500001 > find . =C2=A00.16s user 0.33s system 99% cpu 0.494 total > > $ time find . -ls |wc -l > =C2=A0500001 > find . -ls =C2=A01.93s user 12.13s system 96% cpu 14.643 total > > time find . |xargs -n 100 rm > find . =C2=A00.22s user 0.28s system 0% cpu 2:45.12 total > xargs -n 100 rm =C2=A01.25s user 58.51s system 36% cpu 2:45.61 total > > Deleting files resulted in a constant stream of writes to hard drives. > I guess file deletion may end up up being a synchronous write > committed to ZIL right away. If that's indeed the case, small slog on > SSD could probably speed up file deletion a bit. That's a very interesting find. Or maybe the issue is fragmentation: could you modify the script slightly to create files in about 50 directories in parallel (i.e. create in dir1, create in dir2, create in dir3... create in dir 50, then again create in dir1, create in dir2...)? Could you for the sake of curiosity upgrate this system to the latest 8-stable and retry it? 
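For reference, the modification being asked for above might look like the sketch below. It is only an illustration of the requested access pattern (round-robin file creation across 50 directories, mirroring the earlier 500,000-file test); the dir0..dir49 names are invented for the example and it has not been run here.

# create 50 shard directories and scatter 500,000 empty files across them
perl -e 'use Fcntl;
         for $d (0..49) { mkdir "dir$d"; }
         for $f (1..500000) {
             $d = $f % 50;
             sysopen(FH, "dir$d/f$f", O_CREAT);
             close(FH);
         }'

# then repeat the earlier timings against a single shard
time find dir0 | wc -l
time find dir0 -ls | wc -l
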
From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 19:35:14 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2C556106566C for ; Thu, 21 Jul 2011 19:35:14 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: from mail-gy0-f182.google.com (mail-gy0-f182.google.com [209.85.160.182]) by mx1.freebsd.org (Postfix) with ESMTP id DCF758FC15 for ; Thu, 21 Jul 2011 19:35:13 +0000 (UTC) Received: by gyf3 with SMTP id 3so965979gyf.13 for ; Thu, 21 Jul 2011 12:35:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type :content-transfer-encoding; bh=jnL5Id2o71AST+GS5CEmDp9/rDZGRez1dr+w1sMmGAs=; b=GFuE0e6PcSsfI/7iWaS2uYwsP7V2o06HRQnBCjJr3SLI6rQT5XN3TG4in24FrQumlO MXXJeon6kaGeVxq0PNbt1ajsneN1hKck9d/7PjdehPTQbb5nTrI+lBP0beaiE8IVbctA UYgQYphT/mtKJlW+thwQshCX9uH27H4DSE0Ls= Received: by 10.100.52.3 with SMTP id z3mr706405anz.127.1311276913131; Thu, 21 Jul 2011 12:35:13 -0700 (PDT) MIME-Version: 1.0 Sender: ivoras@gmail.com Received: by 10.100.198.5 with HTTP; Thu, 21 Jul 2011 12:34:33 -0700 (PDT) In-Reply-To: References: <13577F3E-DE59-44F4-98F7-9587E26499B8@gmail.com> From: Ivan Voras Date: Thu, 21 Jul 2011 21:34:33 +0200 X-Google-Sender-Auth: Upz8yv_0HHkV7vN6nuFATF6Itec Message-ID: To: David P Discher Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and large directories - caveat report X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 19:35:14 -0000 On 21 July 2011 20:48, David P Discher wrote: > > Ivan - > > What's your uptime ? Currently about a day since other people rebooted the server in panic. > Are you using l2 arc ? As a separate device? No. > what is the value of 'sysctl kstat.zfs.misc.arcstats.evict_skip' ? > =C2=A0is this increasing quickly ? www:~> sysctl kstat.zfs.misc.arcstats.evict_skip kstat.zfs.misc.arcstats.evict_skip: 4442021 www:~> sysctl kstat.zfs.misc.arcstats.evict_skip kstat.zfs.misc.arcstats.evict_skip: 4442410 It increases (like the two values I've given) about every 5 seconds, which would mean approx. 70 per second. > How much cpu are the 'arc_reclaim_thread' and 'l2arc_feed_thread' taking = up ? > =C2=A0top -SHb 500 | grep arc Almost none. 
36 root -8 - 0K 80K arc_re 0 0:41 0.00% {arc_reclaim_thre} 36 root -8 - 0K 80K l2arc_ 0 0:01 0.00% {l2arc_feed_threa} From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 19:36:54 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EBBF8106566C; Thu, 21 Jul 2011 19:36:54 +0000 (UTC) (envelope-from lists.br@gmail.com) Received: from mail-gx0-f182.google.com (mail-gx0-f182.google.com [209.85.161.182]) by mx1.freebsd.org (Postfix) with ESMTP id 8FC428FC16; Thu, 21 Jul 2011 19:36:54 +0000 (UTC) Received: by gxk28 with SMTP id 28so963673gxk.13 for ; Thu, 21 Jul 2011 12:36:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=subject:mime-version:content-type:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to:x-mailer; bh=bEcGay190T+wbUOE7hPnnKsVjvRQ7hYgE5nD65OP+M4=; b=ierofPsGdG8T93o3+OPuicRTh0yJtD+QrluWTn70qjLHy3ZpHijTimEkr7LIa6xBnC NE9op0PMhQlm4rXx/8OL1FffOb/iT60U7tlw1mAXJ8D0LVyGwI9/ndGky2EVylewPKXX EttnLNSp90lzSpSo8mSMEXVkDG9GZcktiBrpk= Received: by 10.150.201.12 with SMTP id y12mr1105421ybf.53.1311277013679; Thu, 21 Jul 2011 12:36:53 -0700 (PDT) Received: from [192.168.0.53] ([187.120.139.136]) by mx.google.com with ESMTPS id h6sm1295330ybi.20.2011.07.21.12.36.51 (version=TLSv1/SSLv3 cipher=OTHER); Thu, 21 Jul 2011 12:36:52 -0700 (PDT) Mime-Version: 1.0 (Apple Message framework v1084) Content-Type: text/plain; charset=us-ascii From: Luiz Otavio O Souza In-Reply-To: Date: Thu, 21 Jul 2011 16:36:49 -0300 Content-Transfer-Encoding: quoted-printable Message-Id: <5542D910-0C5C-4B2B-885F-CC92901367F0@gmail.com> References: <13577F3E-DE59-44F4-98F7-9587E26499B8@gmail.com> To: David P Discher X-Mailer: Apple Mail (2.1084) Cc: freebsd-fs@freebsd.org, Ivan Voras Subject: Re: ZFS and large directories - caveat report X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 19:36:55 -0000 On Jul 21, 2011, at 3:48 PM, David P Discher wrote: >=20 > Ivan - >=20 > What's your uptime ? > Are you using l2 arc ? > what is the value of 'sysctl kstat.zfs.misc.arcstats.evict_skip' ? > is this increasing quickly ? 
>=20 Don't know about Ivan's case, but mine is definitively increasing it = quickly (and i'm not using l2arc): sysctl kstat.zfs.misc.arcstats.evict_skip = = =20 kstat.zfs.misc.arcstats.evict_skip: 129995601 And just a few minutes later (while running find on my 4 million files = FS): sysctl kstat.zfs.misc.arcstats.evict_skip kstat.zfs.misc.arcstats.evict_skip: 130589384 But i guess i need to increase the arc_meta_limit as well: vfs.zfs.arc_meta_limit: 536870912 vfs.zfs.arc_meta_used: 579461312 kstat.zfs.misc.arcstats.demand_data_hits: 4400985059 kstat.zfs.misc.arcstats.demand_data_misses: 699262 kstat.zfs.misc.arcstats.demand_metadata_hits: 1057208432 kstat.zfs.misc.arcstats.demand_metadata_misses: 32782389 kstat.zfs.misc.arcstats.prefetch_data_hits: 3302738888 kstat.zfs.misc.arcstats.prefetch_data_misses: 225108 kstat.zfs.misc.arcstats.prefetch_metadata_hits: 418744564 kstat.zfs.misc.arcstats.prefetch_metadata_misses: 147815306 kstat.zfs.misc.arcstats.evict_skip: 130781386 kstat.zfs.misc.arcstats.evict_l2_cached: 0 kstat.zfs.misc.arcstats.evict_l2_eligible: 2514187700736 kstat.zfs.misc.arcstats.evict_l2_ineligible: 176966735360 Unfortunately i need to wait a little bit until i can reboot this server = with the new sysctl values. Thanks everyone for the hints so far. Luiz= From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 19:40:59 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7D794106566B for ; Thu, 21 Jul 2011 19:40:59 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: from mail-gw0-f54.google.com (mail-gw0-f54.google.com [74.125.83.54]) by mx1.freebsd.org (Postfix) with ESMTP id 07FAB8FC12 for ; Thu, 21 Jul 2011 19:40:58 +0000 (UTC) Received: by gwb15 with SMTP id 15so1402663gwb.13 for ; Thu, 21 Jul 2011 12:40:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type; bh=NEu1niXbfyQ21WzCI1eEhgJINGl3v9MLnvDWwTjFwaM=; b=pR6kaXHVU9TWymTFJKp95kd35E6SS+NIS+LyumbooOFXGxknFb98C/lbnsts2l2LSI A9Y/a1vwzhgvWhy0r0TOFwhEOb6YZ5uKqp3yiV/4zRPtWWQazfqXwZa99ZGpssu39fvI nxoKMvmi6BZZKMOJKC41ha06WhxHCQQMe8yDM= Received: by 10.100.52.3 with SMTP id z3mr712284anz.127.1311277258245; Thu, 21 Jul 2011 12:40:58 -0700 (PDT) MIME-Version: 1.0 Sender: ivoras@gmail.com Received: by 10.100.198.5 with HTTP; Thu, 21 Jul 2011 12:40:18 -0700 (PDT) In-Reply-To: <4E286F1F.6010502@FreeBSD.org> References: <4E286F1F.6010502@FreeBSD.org> From: Ivan Voras Date: Thu, 21 Jul 2011 21:40:18 +0200 X-Google-Sender-Auth: zcfgMpLQ0P2_CWWFm3SZWRbe2nA Message-ID: To: Martin Matuska Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and large directories - caveat report X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 19:40:59 -0000 On 21 July 2011 20:25, Martin Matuska wrote: > Quoting: > ... The default record size ZFS utilizes is 128K, which is good for many > storage servers that will harbor larger files. However, when dealing with > many files that are only a matter of tens of kilobytes, or even bytes, > considerable slowdown will result. ZFS can easily alter the record size of > the data to be written through the use of attributes. 
These attributes can > be set at any time through the use of the "zfs set" command. To set the > record size attribute perform "zfs set recordsize=32K pool/share". This will > set the recordsize to 32K on share "share" within pool "pool". This type of > functionality can even be implemented on nested shares for even more > flexibility. ... > > Read more: > http://www.articlesbase.com/information-technology-articles/improving-file-system-performance-utilizing-dynamic-record-sizes-in-zfs-4565092.html#ixzz1SlWZ7BM5 Thank you very much - now if only you took as much effort to explain the possible connection between your quote and my post as it took you to find the quote :) As others explained, ZFS definitely does not use fixed block sizes, which I can easily confirm for you by running iostat: www:~> iostat 1 tty ad4 ad6 ad8 cpu tin tout KB/t tps MB/s KB/t tps MB/s KB/t tps MB/s us ni sy in id 0 32 5.87 123 0.70 5.88 122 0.70 5.99 121 0.71 4 0 8 0 88 0 233 1.54 171 0.26 1.54 166 0.25 1.53 172 0.26 10 0 17 1 72 0 78 7.34 181 1.30 7.02 191 1.31 7.46 177 1.29 0 0 8 0 92 0 78 5.50 234 1.25 5.59 232 1.26 5.27 247 1.27 0 0 10 1 89 ^C KB/t varies and is small. Now I'm working under the hypothesis that the directory pseudo-file itself is hugely fragmented and ZFS fragments it even more every time it adds or removes from it. Any ideas how to verify this? From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 19:45:42 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5A786106566B for ; Thu, 21 Jul 2011 19:45:42 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: from mail-gw0-f54.google.com (mail-gw0-f54.google.com [74.125.83.54]) by mx1.freebsd.org (Postfix) with ESMTP id 19A678FC08 for ; Thu, 21 Jul 2011 19:45:41 +0000 (UTC) Received: by gwb15 with SMTP id 15so1404857gwb.13 for ; Thu, 21 Jul 2011 12:45:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type; bh=oy+9rvbauLPEYSVpN5xK0pVeWnGN8f4PWZXqIxuTwu8=; b=V2QQw6OQOA+sVBGfW40Jv2J0/7cdEFm2DKgTjUL62HUQhAX96fs4x1yedFieaiAhJg nVmxwsege+wzh9kRlS4laT7/yDyj/MS424Ia9Q15AN3cNyzWb31z98bPwnfN+cppSTLs umxWG1jSAu3G4d/1yVDxZkBOGX1mue887M1xU= Received: by 10.100.233.21 with SMTP id f21mr710190anh.83.1311277541341; Thu, 21 Jul 2011 12:45:41 -0700 (PDT) MIME-Version: 1.0 Sender: ivoras@gmail.com Received: by 10.100.198.5 with HTTP; Thu, 21 Jul 2011 12:45:01 -0700 (PDT) In-Reply-To: <5542D910-0C5C-4B2B-885F-CC92901367F0@gmail.com> References: <13577F3E-DE59-44F4-98F7-9587E26499B8@gmail.com> <5542D910-0C5C-4B2B-885F-CC92901367F0@gmail.com> From: Ivan Voras Date: Thu, 21 Jul 2011 21:45:01 +0200 X-Google-Sender-Auth: Mmr_1scnRdik9vl9eBCgIj2JBSQ Message-ID: To: Luiz Otavio O Souza Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and large directories - caveat report X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 19:45:42 -0000 On 21 July 2011 21:36, Luiz Otavio O Souza wrote: > But i guess i need to increase the arc_meta_limit as well: > > vfs.zfs.arc_meta_limit: 536870912 > vfs.zfs.arc_meta_used: 579461312 You also have arc_meta_used larger than arc_meta_limit ... but not nearly as big a difference as on my system. 
Can anyone speculate if raising vfs.zfs.arc_meta_limit would help? From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 20:04:47 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5F7E7106564A for ; Thu, 21 Jul 2011 20:04:47 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: from mail-yx0-f182.google.com (mail-yx0-f182.google.com [209.85.213.182]) by mx1.freebsd.org (Postfix) with ESMTP id 1C9A28FC0C for ; Thu, 21 Jul 2011 20:04:46 +0000 (UTC) Received: by yxl31 with SMTP id 31so979168yxl.13 for ; Thu, 21 Jul 2011 13:04:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type; bh=4Gwf1YkwcRcUXptqAgWUlmp7T2KcK0ZKrmLLlrPsUYI=; b=Y6BJ3+9VbOR5Ic9CeMG55ZZkxBhTQnFBlrdYvWQbeZhhl/8gdRqGjdE3YNS8ZwF6x9 Sg89udkec7zt2WYAKzx0XA6Jj5JGouzCBoZN8AjYK4OEv5SYhfsDJRrgKLSlaEEmvdrM i6SnmLPUFDof/zP1M/RfLzFCoYvO94OhroQm0= Received: by 10.101.180.22 with SMTP id h22mr726829anp.149.1311278686273; Thu, 21 Jul 2011 13:04:46 -0700 (PDT) MIME-Version: 1.0 Sender: ivoras@gmail.com Received: by 10.100.198.5 with HTTP; Thu, 21 Jul 2011 13:04:06 -0700 (PDT) In-Reply-To: References: <13577F3E-DE59-44F4-98F7-9587E26499B8@gmail.com> <5542D910-0C5C-4B2B-885F-CC92901367F0@gmail.com> From: Ivan Voras Date: Thu, 21 Jul 2011 22:04:06 +0200 X-Google-Sender-Auth: pQmXl6Ac2EOQXWFtkBRuDGr8xAg Message-ID: To: Luiz Otavio O Souza Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and large directories - caveat report X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 20:04:47 -0000 On 21 July 2011 21:45, Ivan Voras wrote: > On 21 July 2011 21:36, Luiz Otavio O Souza wrote: > >> But i guess i need to increase the arc_meta_limit as well: >> >> vfs.zfs.arc_meta_limit: 536870912 >> vfs.zfs.arc_meta_used: 579461312 > > You also have arc_meta_used larger than arc_meta_limit ... but not > nearly as big a difference as on my system. > > Can anyone speculate if raising vfs.zfs.arc_meta_limit would help? Well, it didn't help me - I raised it above what used to be arc_meta_used and after the reboot arc_meta_used simply rose again over arc_meta_limit. 
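For anyone repeating that experiment: on 8-STABLE of this vintage vfs.zfs.arc_meta_limit appears to be a boot-time tunable rather than a writable sysctl (hence the reboots mentioned in this thread), so the procedure is roughly the sketch below. The 1 GB figure is purely illustrative, not a recommendation.

# /boot/loader.conf -- takes effect on the next reboot
vfs.zfs.arc_meta_limit="1073741824"

# after rebooting, watch whether metadata usage still climbs past the limit
while :; do
    sysctl vfs.zfs.arc_meta_limit vfs.zfs.arc_meta_used
    sleep 5
done
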
Here's another "symptom": while "find" is running, I do a ls of my home directory on the same zpool and get delays, always at the same place (after the postgresql source file): www:~> ll total 233 drwxrwxr-x 2 ivoras ivoras 8 Jun 1 2009 backup/ -rw-rw-r-- 1 ivoras ivoras 593 Nov 7 2007 c1.php -rw-rw-r-- 1 ivoras ivoras 37682863 Apr 30 2009 cms.tgz drwxrwxr-x 4 ivoras ivoras 4 Feb 12 2008 devel/ -rw-r--r-- 1 ivoras ivoras 44372 May 24 2007 etcdirs.tgz -rw-r--r-- 1 root ivoras 215397 Nov 22 2007 lock_profile.txt -rw-r--r-- 1 ivoras ivoras 18336 Nov 21 2007 lockmgr.diff -rw-rw-r-- 1 ivoras ivoras 32590585 Oct 31 2007 melc.sql -rw-r----- 1 ivoras ivoras 1712 Oct 15 2008 newreq.pem -rw-rw-r-- 1 root ivoras 3330572 Apr 30 2009 postgresql-server-8.3.1.tbz load: 0.38 cmd: ls 1786 [tx->tx_quiesce_done_cv)] 3.25r 0.00u 0.01s 0% 2140k load: 0.67 cmd: ls 1786 [tx->tx_quiesce_done_cv)] 5.00r 0.00u 0.01s 0% 2140k load: 0.67 cmd: ls 1786 [tx->tx_quiesce_done_cv)] 5.47r 0.00u 0.01s 0% 2140k load: 0.67 cmd: ls 1786 [tx->tx_quiesce_done_cv)] 6.34r 0.00u 0.01s 0% 2140k load: 0.67 cmd: ls 1786 [tx->tx_quiesce_done_cv)] 8.30r 0.00u 0.01s 0% 2140k load: 0.67 cmd: ls 1786 [tx->tx_quiesce_done_cv)] 8.70r 0.00u 0.01s 0% 2140k load: 0.67 cmd: ls 1786 [tx->tx_quiesce_done_cv)] 9.17r 0.00u 0.01s 0% 2140k load: 0.70 cmd: ls 1786 [tx->tx_quiesce_done_cv)] 11.08r 0.00u 0.01s 0% 2140k load: 0.70 cmd: ls 1786 [tx->tx_quiesce_done_cv)] 11.54r 0.00u 0.01s 0% 2140k load: 0.70 cmd: ls 1786 [tx->tx_quiesce_done_cv)] 13.00r 0.00u 0.01s 0% 2140k load: 0.70 cmd: ls 1786 [tx->tx_quiesce_done_cv)] 13.49r 0.00u 0.01s 0% 2140k load: 0.70 cmd: ls 1786 [tx->tx_quiesce_done_cv)] 14.10r 0.00u 0.01s 0% 2140k load: 0.64 cmd: ls 1786 [tx->tx_quiesce_done_cv)] 14.62r 0.00u 0.01s 0% 2140k load: 0.64 cmd: ls 1786 [tx->tx_quiesce_done_cv)] 17.41r 0.00u 0.01s 0% 2140k load: 0.64 cmd: ls 1786 [tx->tx_quiesce_done_cv)] 19.74r 0.00u 0.01s 0% 2140k load: 0.75 cmd: ls 1786 [tx->tx_quiesce_done_cv)] 20.53r 0.00u 0.01s 0% 2140k load: 0.75 cmd: ls 1786 [tx->tx_quiesce_done_cv)] 21.30r 0.00u 0.01s 0% 2140k lrwxrwxr-x 1 ivoras ivoras 9 Apr 23 2009 services@ -> /services -rw-rw-r-- 1 ivoras ivoras 160060550 Oct 31 2007 services.tgz -rw-rw-r-- 1 ivoras ivoras 0 Nov 1 2007 stress.txt drwxrwxr-x 5 ivoras ivoras 8 Mar 10 2010 temp/ -rw-rw-r-- 1 ivoras ivoras 257002 Oct 31 2007 ule.tgz -rw-r--r-- 1 ivoras ivoras 8965978 May 24 2007 wwwdirs.tgz The "load.." lines in between are me hitting Ctrl-T for process info. Observe almost 20 seconds delay in the middle of "ls"! 
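A note for readers trying to reproduce this: the wait channel shown above (tx->tx_quiesce_done_cv) suggests the ls is waiting for the current ZFS transaction group to quiesce and sync. A low-effort way to see whether the stalls coincide with bursts of pool I/O is to watch the pool while provoking the hang; "tank" stands in for the pool name here.

# per-vdev I/O, refreshed once a second, while the slow ls/find runs
zpool iostat -v tank 1

# per-disk queue length (L(q)), ops/s and latency
gstat -a
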
From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 20:07:21 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5A832106566B; Thu, 21 Jul 2011 20:07:21 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-wy0-f182.google.com (mail-wy0-f182.google.com [74.125.82.182]) by mx1.freebsd.org (Postfix) with ESMTP id B94058FC1A; Thu, 21 Jul 2011 20:07:20 +0000 (UTC) Received: by wyg24 with SMTP id 24so1436271wyg.13 for ; Thu, 21 Jul 2011 13:07:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=93WVlszkRQgAUn5Em87RF0XcAtLrSuhlenWfL8FjdMw=; b=PkD9pN0ks6mPG/XNVNrs/DmWcLPpHLlG37eHHuknBlf48IAZfdFV4W/zPklOG1CtL3 V74ABD/t5DciBsYb1kI8pKyLDAs0ot7riqIcaHo9deCRHSnm6tl4d1qkdBOH5nCYFEQw KyWa0fwRlK1WVdA1SAgC/HcFIOP9hHJeG6K8U= MIME-Version: 1.0 Received: by 10.217.6.79 with SMTP id x57mr1147262wes.10.1311278839758; Thu, 21 Jul 2011 13:07:19 -0700 (PDT) Sender: artemb@gmail.com Received: by 10.216.46.18 with HTTP; Thu, 21 Jul 2011 13:07:19 -0700 (PDT) In-Reply-To: References: Date: Thu, 21 Jul 2011 13:07:19 -0700 X-Google-Sender-Auth: Q2JPUJ3gMS3Nwm78ttZpT7tNqHA Message-ID: From: Artem Belevich To: Ivan Voras Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and large directories - caveat report X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 20:07:21 -0000 On Thu, Jul 21, 2011 at 12:29 PM, Ivan Voras wrote: > On 21 July 2011 20:15, Artem Belevich wrote: >> On Thu, Jul 21, 2011 at 9:38 AM, Ivan Voras wrote: >>> On 21 July 2011 17:50, Freddie Cash wrote: >>>> On Thu, Jul 21, 2011 at 8:45 AM, Ivan Voras wrote= : >>>>> >>>>> Is there an equivalent of UFS dirhash memory setting for ZFS? (i.e. t= he >>>>> size of the metadata cache) >>>> >>>> vfs.zfs.arc_meta_limit >>>> >>>> This sets the amount of ARC that can be used for metadata.=A0 The defa= ult is >>>> 1/8th of ARC, I believe.=A0 This setting lets you use "primarycache=3D= all" >>>> (store metadata and file data in ARC) but then tune how much is used f= or >>>> each. >>>> >>>> Not sure if that will help in your case or not, but it's a sysctl you = can >>>> play with. >>> >>> I don't think that it works, or at least is not as efficient as dirhash= : >>> >>> www:~> sysctl -a | grep meta >>> kern.metadelay: 28 >>> vfs.zfs.mfu_ghost_metadata_lsize: 129082368 >>> vfs.zfs.mfu_metadata_lsize: 116224 >>> vfs.zfs.mru_ghost_metadata_lsize: 113958912 >>> vfs.zfs.mru_metadata_lsize: 16384 >>> vfs.zfs.anon_metadata_lsize: 0 >>> vfs.zfs.arc_meta_limit: 322412800 >>> vfs.zfs.arc_meta_used: 506907792 >>> kstat.zfs.misc.arcstats.demand_metadata_hits: 4471705 >>> kstat.zfs.misc.arcstats.demand_metadata_misses: 2110328 >>> kstat.zfs.misc.arcstats.prefetch_metadata_hits: 27 >>> kstat.zfs.misc.arcstats.prefetch_metadata_misses: 51 >>> >>> arc_meta_used is nearly 500 MB which should be enough even in this >>> case. With filenames of 32 characters, all the filenames alone for >>> 130,000 files in a directory take about 4 MB - I doubt the ZFS >>> introduces so much extra metadata it doesn't fit in 500 MB. 
>> >> For what it's worth, 500K files in one directory seems to work >> reasonably well on my box running few weeks old 8-stable (quad core >> 8GB RAM, ~6GB ARC), ZFSv28 pool on a 2-drive mirror + 50GB L2ARC. >> >> $ time perl -e 'use Fcntl; for $f =A0(1..500000) >> {sysopen(FH,"f$f",O_CREAT); close(FH);}' >> perl -e =A0>| /dev/null =A02.26s user 39.17s system 96% cpu 43.156 total >> >> $ time find . |wc -l >> =A0500001 >> find . =A00.16s user 0.33s system 99% cpu 0.494 total >> >> $ time find . -ls |wc -l >> =A0500001 >> find . -ls =A01.93s user 12.13s system 96% cpu 14.643 total >> >> time find . |xargs -n 100 rm >> find . =A00.22s user 0.28s system 0% cpu 2:45.12 total >> xargs -n 100 rm =A01.25s user 58.51s system 36% cpu 2:45.61 total >> >> Deleting files resulted in a constant stream of writes to hard drives. >> I guess file deletion may end up up being a synchronous write >> committed to ZIL right away. If that's indeed the case, small slog on >> SSD could probably speed up file deletion a bit. > > That's a very interesting find. > > Or maybe the issue is fragmentation: could you modify the script > slightly to create files in about 50 directories in parallel (i.e. > create in dir1, create in dir2, create in dir3... create in dir 50, > then again create in dir1, create in dir2...)? Scattering across 50 directories works about as fast: $ time perl -e 'use Fcntl; $dir =3D 0; for $f (1..500000) {sysopen(FH,"$dir/f$f",O_CREAT); close(FH); $dir=3D($dir+1) % 50}' >|/dev/null perl -e >| /dev/null 2.77s user 38.31s system 85% cpu 47.829 total $ time find . |wc -l 500051 find . 0.16s user 0.36s system 29% cpu 1.787 total $ time find . -ls |wc -l 500051 find . -ls 1.75s user 11.33s system 92% cpu 14.196 total $ time find . -name f\* | xargs -n 100 rm find . -name f\* 0.17s user 0.35s system 0% cpu 3:23.44 total xargs -n 100 rm 1.35s user 52.82s system 26% cpu 3:23.75 total > > Could you for the sake of curiosity upgrate this system to the latest > 8-stable and retry it? I'm currently running 8.2-STABLE r223055. The log does not show anything particularly interesting committed to ZFS code since then. There was LBOLT overflow fix, but it should not be relevant in this case. I do plan to upgrade the box, though it's not going to happen for another week or so. If the issue is still relevant then, I'll be happy to re-run the test. 
--Artem From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 20:31:48 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 88F0B106566C; Thu, 21 Jul 2011 20:31:48 +0000 (UTC) (envelope-from lists.br@gmail.com) Received: from mail-yw0-f54.google.com (mail-yw0-f54.google.com [209.85.213.54]) by mx1.freebsd.org (Postfix) with ESMTP id 303458FC13; Thu, 21 Jul 2011 20:31:48 +0000 (UTC) Received: by ywf7 with SMTP id 7so992260ywf.13 for ; Thu, 21 Jul 2011 13:31:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=subject:mime-version:content-type:from:in-reply-to:date:cc :content-transfer-encoding:message-id:references:to:x-mailer; bh=uiMAsqhbRreOTx4MSCwuHcM+ow655AuBIFp27FSIdCU=; b=pkVp3+65Ys83O/IrgbIkvdQAoSIBOqS/lZMo7XGSg6MKqbuqBm0VvSWIgH1P440Hjq C0h5Nf5BwVYpHOK4fx29c3zNLN+jOz/+3hSs1L6GyVbLe33Y0YXjcEDWCBkJz+sqbz6r H3s8UrOEjE6nDpaAMvR3mj+SbaPcPT8ipNxWc= Received: by 10.236.187.65 with SMTP id x41mr1084696yhm.449.1311280307655; Thu, 21 Jul 2011 13:31:47 -0700 (PDT) Received: from [192.168.0.53] ([187.120.139.136]) by mx.google.com with ESMTPS id c69sm1468028yhm.43.2011.07.21.13.31.45 (version=TLSv1/SSLv3 cipher=OTHER); Thu, 21 Jul 2011 13:31:47 -0700 (PDT) Mime-Version: 1.0 (Apple Message framework v1084) Content-Type: text/plain; charset=us-ascii From: Luiz Otavio O Souza In-Reply-To: Date: Thu, 21 Jul 2011 17:31:42 -0300 Content-Transfer-Encoding: quoted-printable Message-Id: <0DBC88EC-7907-4DEE-8086-E3071590F800@gmail.com> References: <13577F3E-DE59-44F4-98F7-9587E26499B8@gmail.com> <5542D910-0C5C-4B2B-885F-CC92901367F0@gmail.com> To: Ivan Voras X-Mailer: Apple Mail (2.1084) Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and large directories - caveat report X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 20:31:48 -0000 On Jul 21, 2011, at 5:04 PM, Ivan Voras wrote: > The "load.." lines in between are me hitting Ctrl-T for process info. > Observe almost 20 seconds delay in the middle of "ls"! What do you have for the the first column on gstat (L(q) values) ? The server in question is a DELL R710 that comes as a poor choice on = bought (nobody notice the SATA disks until the server was installed - = then it was too late for a exchange). It was a lot worse at beginning and adding the following lines (to = /boot/loader.conf) did help a little bit (it is usable now, even when it = is fully loaded by find requests - usable but not normal -). vfs.zfs.vdev.min_pending=3D1 vfs.zfs.vdev.max_pending=3D1 Not sure if this will help with your hardware, but you can give it a try = if you're seeing high values for the L(q) column. 
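For context, L(q) is the queue-length column in gstat output, and the two vdev pending settings quoted above are loader tunables, so they only take effect after a reboot. A minimal sketch of the check being suggested (device names and observed values will obviously differ per system):

# watch queue depth (L(q)), ops/s and ms/op per disk while the workload runs
gstat -a

# /boot/loader.conf -- the lines mentioned above, applied at next boot
vfs.zfs.vdev.min_pending=1
vfs.zfs.vdev.max_pending=1
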
Luiz= From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 20:34:52 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 632201065670 for ; Thu, 21 Jul 2011 20:34:52 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: from mail-gw0-f54.google.com (mail-gw0-f54.google.com [74.125.83.54]) by mx1.freebsd.org (Postfix) with ESMTP id 1C8488FC13 for ; Thu, 21 Jul 2011 20:34:51 +0000 (UTC) Received: by gwb15 with SMTP id 15so1427579gwb.13 for ; Thu, 21 Jul 2011 13:34:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type :content-transfer-encoding; bh=rRyj/ypCkYP4vi3XSBKky1J6um8JSeNuHhU1PaXvldY=; b=LqToevqQhR9MxffqUpXHf56A1u4K94Oo5M9XloT87WM9OY6lATm18WpL1jfoQo8eke xnA8qI21fDIzNOAnWsUfoSpQlxTPk+nMBA02YcM6vx1y57v4r3ShfU7DexVG3GxyxAuU +rgatr9NSrz3CJ3/BM4G3Pf2vxRDJZ15wFUJU= Received: by 10.101.62.3 with SMTP id p3mr771673ank.29.1311280491316; Thu, 21 Jul 2011 13:34:51 -0700 (PDT) MIME-Version: 1.0 Sender: ivoras@gmail.com Received: by 10.100.198.5 with HTTP; Thu, 21 Jul 2011 13:34:11 -0700 (PDT) In-Reply-To: <0DBC88EC-7907-4DEE-8086-E3071590F800@gmail.com> References: <13577F3E-DE59-44F4-98F7-9587E26499B8@gmail.com> <5542D910-0C5C-4B2B-885F-CC92901367F0@gmail.com> <0DBC88EC-7907-4DEE-8086-E3071590F800@gmail.com> From: Ivan Voras Date: Thu, 21 Jul 2011 22:34:11 +0200 X-Google-Sender-Auth: IuDbtlLFcolfX1Hatlbn0oJBDcI Message-ID: To: Luiz Otavio O Souza Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and large directories - caveat report X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 20:34:52 -0000 On 21 July 2011 22:31, Luiz Otavio O Souza wrote: > On Jul 21, 2011, at 5:04 PM, Ivan Voras wrote: > >> The "load.." lines in between are =C2=A0me hitting Ctrl-T for process in= fo. >> Observe almost 20 seconds delay in the middle of "ls"! > > What do you have for the the first column on gstat (L(q) values) ? > > The server in question is a DELL R710 that comes as a poor choice on boug= ht (nobody notice the SATA disks until the server was installed - then it w= as too late for a exchange). > > It was a lot worse at beginning and adding the following lines (to /boot/= loader.conf) did help a little bit (it is usable now, even when it is fully= loaded by find requests - usable but not normal -). > > vfs.zfs.vdev.min_pending=3D1 > vfs.zfs.vdev.max_pending=3D1 > > Not sure if this will help with your hardware, but you can give it a try = if you're seeing high values for the L(q) column. Thanks for the tip but no, device queues are very short (1-2). 
From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 20:56:24 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4DAD11065680; Thu, 21 Jul 2011 20:56:24 +0000 (UTC) (envelope-from mm@FreeBSD.org) Received: from mail.vx.sk (mail.vx.sk [IPv6:2a01:4f8:100:1043::3]) by mx1.freebsd.org (Postfix) with ESMTP id 0E5DC8FC17; Thu, 21 Jul 2011 20:56:24 +0000 (UTC) Received: from core.vx.sk (localhost [127.0.0.1]) by mail.vx.sk (Postfix) with ESMTP id 08E0215E020; Thu, 21 Jul 2011 22:56:23 +0200 (CEST) X-Virus-Scanned: amavisd-new at mail.vx.sk Received: from mail.vx.sk ([127.0.0.1]) by core.vx.sk (mail.vx.sk [127.0.0.1]) (amavisd-new, port 10024) with LMTP id FaRNsgMZ72U2; Thu, 21 Jul 2011 22:56:20 +0200 (CEST) Received: from [10.9.8.3] (chello085216231078.chello.sk [85.216.231.78]) by mail.vx.sk (Postfix) with ESMTPSA id B6DDA15E012; Thu, 21 Jul 2011 22:56:20 +0200 (CEST) Message-ID: <4E289284.5020800@FreeBSD.org> Date: Thu, 21 Jul 2011 22:56:36 +0200 From: Martin Matuska User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:5.0) Gecko/20110624 Thunderbird/5.0 MIME-Version: 1.0 To: Ivan Voras References: <4E286F1F.6010502@FreeBSD.org> In-Reply-To: X-Enigmail-Version: 1.2 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and large directories - caveat report X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 20:56:24 -0000 On 21. 7. 2011 21:40, Ivan Voras wrote: > Thank you very much - now if only you took as much effort to explain > the possible connection between your quote and my post as it took you > to find the quote :) > > As others explained, ZFS definitely does not use fixed block sizes I agree to that. I tried some more digging and stomped on this opensolaris mailing list thread: It starts here: http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg35150.html With an interesting user report here (nice summary): http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg35189.html Most times they blame the way the client utilities work with directories (sorting etc.). Now to some more relevant options: It is also possible to do metadata-only caching for a dataset: zfs set primarycache=metadata L2 can be modified as well: zfs set secondarycache=metadata If I find some time I can run some simulations on this to see how it performs compared to primarycache=all. The vdev read-ahead cache might also have a negative impact here (lots of wasted IOPS), mostly if the blocks are spread around the vdev. We have followed what Illumos did and vdev cache is now disabled by default. I have updated the zfs-stats tool (ports: sysutils/zfs-stats) with latest Jason J. 
Hellenthal's arc_summary.pl, it gives a good overview of ZFS sysctl's: https://github.com/mmatuska/zfs-stats -- Martin Matuska FreeBSD committer http://blog.vx.sk From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 21:36:50 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BD1271065693 for ; Thu, 21 Jul 2011 21:36:50 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta07.westchester.pa.mail.comcast.net (qmta07.westchester.pa.mail.comcast.net [76.96.62.64]) by mx1.freebsd.org (Postfix) with ESMTP id 7DF998FC21 for ; Thu, 21 Jul 2011 21:36:50 +0000 (UTC) Received: from omta24.westchester.pa.mail.comcast.net ([76.96.62.76]) by qmta07.westchester.pa.mail.comcast.net with comcast id AlYH1h0011ei1Bg57lcqMN; Thu, 21 Jul 2011 21:36:50 +0000 Received: from koitsu.dyndns.org ([67.180.84.87]) by omta24.westchester.pa.mail.comcast.net with comcast id Alcn1h00N1t3BNj3klcob3; Thu, 21 Jul 2011 21:36:50 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id BBC91102C36; Thu, 21 Jul 2011 14:36:45 -0700 (PDT) Date: Thu, 21 Jul 2011 14:36:45 -0700 From: Jeremy Chadwick To: Ivan Voras Message-ID: <20110721213645.GA74462@icarus.home.lan> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and large directories - caveat report X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 21:36:50 -0000 On Thu, Jul 21, 2011 at 05:45:53PM +0200, Ivan Voras wrote: > I'm writing this mostly for future reference / archiving and also if > someone has an idea on how to improve the situation. > > A web server I maintain was hit by DoS, which has caused more than 4 > million PHP session files to be created. The session files are > sharded in 32 directories in a single level - which is normally more > than enough for this web server as the number of users is only a > couple of thousand. With the DoS, the number of files per shard > directory rose to about 130,000. > > The problem is: ZFS has proven horribly inefficient with such large > directories. I have other, more loaded servers with simlarly bad / > large directories on UFS where the problem is not nearly as serious > as here (probably due to the large dirhash). On this system, any > operation which touches even only the parent of these 32 shards > (e.g. "ls") takes seconds, and a simple "find | wc -l" on one of the > shards takes > 30 minutes (I stopped it after 30 minutes). Another > symptom is that SIGINT-ing such find process takes 10-15 seconds to > complete (sic! this likely means the kernel operation cannot be > interrupted for so long). > > This wouldn't be a problem by itself, but operations on such > directories eat IOPS - clearly visible with the "find" test case, > making the rest of the services on the server fall as collateral > damage. Apparently there is a huge amount of seeking being done, > even though I would think that for read operations all the data > would be cached - and somehow the seeking from this operation takes > priority / livelocks other operations on the same ZFS pool. > > This is on a fresh 8-STABLE AMD64, pool version 28 and zfs version 5. > > Is there an equivalent of UFS dirhash memory setting for ZFS? 
(i.e. > the size of the metadata cache) Ivan, This is in no way an attempt to divert attention from the real issue (bad performance with ZFS and lots of files), but PHP has configuration settings that can help auto-reap sessions sooner than letting them get up to 130,000. Taken from our configuration file: ; ; 25% of the time (prob/divisor) we'll try to clean up leftover ; cruft in save_path. Seems a lot of users enjoy leaving crusty ; session files laying around... ; [session] session.save_path = "/var/tmp/php_sessions" session.gc_maxlifetime = 900 session.gc_probability = 25 session.gc_divisor = 100 With the above settings, roughly 1 out of every 4 times (25%) the PHP interpreter is executed it will reap old files in save_path. So in your case you'd want to adjust gc_probability and gc_maxlifetime (the idea being to make PHP reap sessions more aggressively). Again: this doesn't solve the overall issue pertaining to ZFS, as there's a multitude of other ways to create hundreds of thousands of files on a system via a DoS. -- | Jeremy Chadwick jdc at parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, US | | Making life hard for others since 1977. PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 21:46:21 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 197501065670 for ; Thu, 21 Jul 2011 21:46:21 +0000 (UTC) (envelope-from bfriesen@simple.dallas.tx.us) Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74]) by mx1.freebsd.org (Postfix) with ESMTP id D5CAA8FC12 for ; Thu, 21 Jul 2011 21:46:20 +0000 (UTC) Received: from freddy.simplesystems.org (freddy.simplesystems.org [65.66.246.65]) by blade.simplesystems.org (8.14.4+Sun/8.14.4) with ESMTP id p6LLkIZB014182; Thu, 21 Jul 2011 16:46:18 -0500 (CDT) Date: Thu, 21 Jul 2011 16:46:18 -0500 (CDT) From: Bob Friesenhahn X-X-Sender: bfriesen@freddy.simplesystems.org To: Martin Matuska In-Reply-To: <4E286F1F.6010502@FreeBSD.org> Message-ID: References: <4E286F1F.6010502@FreeBSD.org> User-Agent: Alpine 2.01 (GSO 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (blade.simplesystems.org [65.66.246.90]); Thu, 21 Jul 2011 16:46:19 -0500 (CDT) Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and large directories - caveat report X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 21:46:21 -0000 On Thu, 21 Jul 2011, Martin Matuska wrote: > Quoting: > ... The default record size ZFS utilizes is 128K, which is good for many > storage servers that will harbor larger files. However, when dealing > with many files that are only a matter of tens of kilobytes, or even > bytes, considerable slowdown will result. ZFS can easily alter the I don't see how this can be. Short files are written as short blocks so the maximum block size should not matter. 
Bob -- Bob Friesenhahn bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 21:49:39 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0F2F2106564A for ; Thu, 21 Jul 2011 21:49:39 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: from mail-gw0-f54.google.com (mail-gw0-f54.google.com [74.125.83.54]) by mx1.freebsd.org (Postfix) with ESMTP id C340F8FC13 for ; Thu, 21 Jul 2011 21:49:38 +0000 (UTC) Received: by gwb15 with SMTP id 15so1458344gwb.13 for ; Thu, 21 Jul 2011 14:49:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type :content-transfer-encoding; bh=sSH75PAVFV3t3h2fQhSQocaSE/jtcszhNXVTLZ24X1k=; b=qIAHGVpAGuKK9gT0QAdnf2OVCvLCCS7Jzc2e4kaPAMFES4IGkcQLH2pNY+1tqUcFJq z46c4wgvIDsOJkfPEQ7hrtSNCEiyiGswCal9v0qjKq5D7nlTttUJx7xBjO353/ocMm3U N20dljLoolXp+TRAe8JOrI/tFPOxjmLGdZIEw= Received: by 10.101.18.6 with SMTP id v6mr850850ani.39.1311284978128; Thu, 21 Jul 2011 14:49:38 -0700 (PDT) MIME-Version: 1.0 Sender: ivoras@gmail.com Received: by 10.100.198.5 with HTTP; Thu, 21 Jul 2011 14:48:58 -0700 (PDT) In-Reply-To: <20110721213645.GA74462@icarus.home.lan> References: <20110721213645.GA74462@icarus.home.lan> From: Ivan Voras Date: Thu, 21 Jul 2011 23:48:58 +0200 X-Google-Sender-Auth: uqDgWxRAHHACoAwmF5IlEgUmF7g Message-ID: To: Jeremy Chadwick Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and large directories - caveat report X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 21:49:39 -0000 On 21 July 2011 23:36, Jeremy Chadwick wrote: > [session] > session.save_path =3D "/var/tmp/php_sessions" > session.gc_maxlifetime =3D 900 > session.gc_probability =3D 25 > session.gc_divisor =3D 100 > > With the above settings, roughly 1 out of every 4 times (25%) the PHP > interpreter is executed it will reap old files in save_path. =C2=A0So in = your > case you'd want to adjust gc_probability and gc_maxlifetime (the idea > being to make PHP reap sessions more aggressively). Yes, but no. Simply suppose my 4 million files are the result of 25% reaping frequency. 
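One supplementary measure (not from this thread, just a common workaround) is to reap stale session files from cron as well, so that a burst of sessions cannot outrun PHP's probabilistic GC. A sketch, assuming the save_path quoted above, PHP's default sess_* file naming, and a cutoff matching gc_maxlifetime = 900 seconds, using find's -mmin/-delete primaries:

# /etc/crontab: every 10 minutes, remove session files untouched for over 15 minutes
*/10  *  *  *  *  root  find /var/tmp/php_sessions -type f -name 'sess_*' -mmin +15 -delete
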
From owner-freebsd-fs@FreeBSD.ORG Thu Jul 21 22:03:15 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 46839106566C for ; Thu, 21 Jul 2011 22:03:15 +0000 (UTC) (envelope-from bfriesen@simple.dallas.tx.us) Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74]) by mx1.freebsd.org (Postfix) with ESMTP id E45E18FC15 for ; Thu, 21 Jul 2011 22:03:14 +0000 (UTC) Received: from freddy.simplesystems.org (freddy.simplesystems.org [65.66.246.65]) by blade.simplesystems.org (8.14.4+Sun/8.14.4) with ESMTP id p6LM3EpP014255; Thu, 21 Jul 2011 17:03:14 -0500 (CDT) Date: Thu, 21 Jul 2011 17:03:14 -0500 (CDT) From: Bob Friesenhahn X-X-Sender: bfriesen@freddy.simplesystems.org To: Freddie Cash In-Reply-To: Message-ID: References: <4E286F1F.6010502@FreeBSD.org> User-Agent: Alpine 2.01 (GSO 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (blade.simplesystems.org [65.66.246.90]); Thu, 21 Jul 2011 17:03:14 -0500 (CDT) Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and large directories - caveat report X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Jul 2011 22:03:15 -0000 On Thu, 21 Jul 2011, Freddie Cash wrote: >> > The recordsize property in ZFS is the "max" block size used. It is not the > only block size used for a dataset. ZFS will use any block size from 0.5 KB > to $recordsize KB, as determined by the size of the file to be written (it > tries to the find the recordsize that most closely matches the file size to > use the least number of blocks per write). Except for tail blocks (last block in a file), the uncompressed data block size will always be the "max" block size. When compression is enabled, that "max" block size is likely to be reduced to something smaller (due to the compression), and zfs will use a smaller block size on disk. This approach minimizes the performance impact from fragmentation, copy on write (COW), and block metadata. It would not make sense for zfs to behave as you describe since files are written starting from scratch and so zfs has no knowledge of the final file size until it is completely written (and even then, more data could be written, or the file might be truncated). Zfs could have knowledge of a file size if the application did a seek to the ultimate length and wrote something, or used ftruncate to set the size, but the file size can still be arbitrarily changed. When raidzN is used, the data block is split into smaller chunks which are distributed among the disks. When mirroring is used, full blocks are written to each disk. It is important to realize that the zfs block checksum is for the uncompressed/unsplit original data block and not for some bit of data which eventually ended up on a disk. For example, when raidz is used, there is no independent checksum for the data chunks distributed across the disks. The zfs approach assures end-to-end validation and avoids having to recompute all data checksums (perhaps incorrectly) when doing 'zfs send'. Zfs metadata sizes are not related to the zfs block size. 
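The effect of compression on what actually ends up on disk is also easy to see with a scratch dataset (the dataset name below is just an example):

  # zfs create -o compression=on tank/ctest
  # cp /usr/share/dict/words /tank/ctest/words
  # sync
  # ls -l /tank/ctest/words        (logical size)
  # du -h /tank/ctest/words        (allocated size, noticeably smaller)
  # zfs get compressratio tank/ctest

The file is still made up of 128K logical blocks, but each block occupies less space on disk once compressed.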
Bob -- Bob Friesenhahn bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/ From owner-freebsd-fs@FreeBSD.ORG Fri Jul 22 00:38:24 2011 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 45BB0106564A for ; Fri, 22 Jul 2011 00:38:24 +0000 (UTC) (envelope-from peterjeremy@acm.org) Received: from fallbackmx07.syd.optusnet.com.au (fallbackmx07.syd.optusnet.com.au [211.29.132.9]) by mx1.freebsd.org (Postfix) with ESMTP id 544D98FC16 for ; Fri, 22 Jul 2011 00:38:22 +0000 (UTC) Received: from mail11.syd.optusnet.com.au (mail11.syd.optusnet.com.au [211.29.132.192]) by fallbackmx07.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id p6LKp684018852 for ; Fri, 22 Jul 2011 06:51:06 +1000 Received: from server.vk2pj.dyndns.org (c220-239-116-103.belrs4.nsw.optusnet.com.au [220.239.116.103]) by mail11.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id p6LKp2IF009956 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 22 Jul 2011 06:51:04 +1000 X-Bogosity: Ham, spamicity=0.000000 Received: from server.vk2pj.dyndns.org (localhost.vk2pj.dyndns.org [127.0.0.1]) by server.vk2pj.dyndns.org (8.14.4/8.14.4) with ESMTP id p6LKp1uL072483; Fri, 22 Jul 2011 06:51:01 +1000 (EST) (envelope-from peter@server.vk2pj.dyndns.org) Received: (from peter@localhost) by server.vk2pj.dyndns.org (8.14.4/8.14.4/Submit) id p6LKp0fY072482; Fri, 22 Jul 2011 06:51:00 +1000 (EST) (envelope-from peter) Date: Fri, 22 Jul 2011 06:51:00 +1000 From: Peter Jeremy To: "Vladislav V. Prodan" Message-ID: <20110721205100.GA72403@server.vk2pj.dyndns.org> References: <4E2412C2.5000202@ukr.net> <4E249FAF.4050500@ukr.net> <4E25F1C0.6060404@ukr.net> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="5vNYLRcllDrimb99" Content-Disposition: inline In-Reply-To: <4E25F1C0.6060404@ukr.net> X-PGP-Key: http://members.optusnet.com.au/peterjeremy/pubkey.asc User-Agent: Mutt/1.5.21 (2010-09-15) Cc: fs@freebsd.org Subject: Re: [ZFS] Prompt, which is lost space in the ZFS pool? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Jul 2011 00:38:24 -0000 --5vNYLRcllDrimb99 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On 2011-Jul-20 00:06:08 +0300, "Vladislav V. Prodan" w= rote: >Now rewriting the hierarchy /backup with the option dedup=3Don to ZFSv28 >May be can save some space. I strongly suggest you do some reading up on the impact of dedup and perform extensive testing before enabling dedup. ZFS dedup places very high demands on RAM and can cause significant degradation of write and delete operations. At this point in time, it's probably only of use in very specific cases. 
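If you want a rough idea of whether dedup would even pay off on the existing data, the v28 zdb can simulate it without changing anything on the pool (pool name below is just an example):

  # zdb -S tank

It walks the pool and prints a simulated dedup table histogram plus an overall dedup ratio. The simulation itself takes a long time and a fair amount of RAM on a pool with a lot of data, so run it when the machine is otherwise quiet.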
--=20 Peter Jeremy --5vNYLRcllDrimb99 Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (FreeBSD) iEYEARECAAYFAk4okTQACgkQ/opHv/APuIcfygCfboQ2OOdQw/4LTEgNhhWEkEp6 lUQAnioh6e9hkdl2BZtnbXfKDg5WUCqf =ArFt -----END PGP SIGNATURE----- --5vNYLRcllDrimb99-- From owner-freebsd-fs@FreeBSD.ORG Fri Jul 22 04:10:12 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3EC02106564A for ; Fri, 22 Jul 2011 04:10:12 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 158678FC08 for ; Fri, 22 Jul 2011 04:10:12 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p6M4ABpv096013 for ; Fri, 22 Jul 2011 04:10:11 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p6M4ABtb096003; Fri, 22 Jul 2011 04:10:11 GMT (envelope-from gnats) Date: Fri, 22 Jul 2011 04:10:11 GMT Message-Id: <201107220410.p6M4ABtb096003@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Michael Haro Cc: Subject: Re: kern/159077: Can't cd .. with latest zfs version X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Michael Haro List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Jul 2011 04:10:12 -0000 The following reply was made to PR kern/159077; it has been noted by GNATS. From: Michael Haro To: Glen Barber Cc: FreeBSD-gnats-submit@freebsd.org Subject: Re: kern/159077: Can't cd .. with latest zfs version Date: Thu, 21 Jul 2011 20:43:28 -0700 --90e6ba53acd8ebd2a304a8a045f3 Content-Type: text/plain; charset=ISO-8859-1 On Thu, Jul 21, 2011 at 6:49 AM, Glen Barber wrote: > Hi, > > On 7/21/11 2:37 AM, Michael Haro wrote: > > $ zfs get version zroot/home > > NAME PROPERTY VALUE SOURCE > > zroot/home version 3 - > > $ zfs get version zroot/home/mharo > > NAME PROPERTY VALUE SOURCE > > zroot/home/mharo version 3 - > > > > Are your kernel and userland in sync? I believe with ZFS v28 the ZFS > version should be 4, not 3. May not be the problem though. > Kernel and userland were both build and installed on the same day; I just haven't run zfs upgrade yet. Is that required for cd .. to work? > > Regards, > > -- > Glen Barber > --90e6ba53acd8ebd2a304a8a045f3 Content-Type: text/html; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable


--90e6ba53acd8ebd2a304a8a045f3-- From owner-freebsd-fs@FreeBSD.ORG Fri Jul 22 04:39:16 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8322A1065670 for ; Fri, 22 Jul 2011 04:39:16 +0000 (UTC) (envelope-from mainland@gmail.com) Received: from mail-yw0-f54.google.com (mail-yw0-f54.google.com [209.85.213.54]) by mx1.freebsd.org (Postfix) with ESMTP id 475F48FC14 for ; Fri, 22 Jul 2011 04:39:16 +0000 (UTC) Received: by ywf7 with SMTP id 7so1182133ywf.13 for ; Thu, 21 Jul 2011 21:39:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:from:date:x-google-sender-auth:message-id :subject:to:content-type; bh=lLOO/cMabn8GI0ewypv3x/Oc2PiJbgg5IwMAHOZXx3o=; b=CHDX+SxGDsTLiVs2BDGrHFGKTPN4swkjw6ZWvvzH+8Olzw2y8p5Uv0Px8iWIlgS36i BkUGWOSZNQk78zo+y6yx8iIIUzTk5iPwlINmSKvZFigKHVH7VXb3L7m45wWKldqK8q/W tsZAod7t7IOclQDDwiWttbDSHp6jkvh1zuDVM= Received: by 10.236.75.200 with SMTP id z48mr1432257yhd.518.1311307960347; Thu, 21 Jul 2011 21:12:40 -0700 (PDT) MIME-Version: 1.0 Sender: mainland@gmail.com Received: by 10.231.33.68 with HTTP; Thu, 21 Jul 2011 21:12:20 -0700 (PDT) From: Geoffrey Mainland Date: Fri, 22 Jul 2011 00:12:20 -0400 X-Google-Sender-Auth: VeTa7RRZ5lyAHar8sXf-9O_NqC8 Message-ID: To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Subject: Can't get 8.2-STABLE to boot from ZFS v28 with ashift=12 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Jul 2011 04:39:16 -0000 I'm trying to get a system up and running with a couple new WD EARS drives on 8.2-STABLE (built July 20, so it has ZFS v28 support). I'd like to be able to boot from a ZFS pool where I've used the gnop trick when creating the pool to make sure ZFS accesses my drive in 4k chunks. Booting from ZFS without using gnop (so ashift=9) works beautifully, but when I use the gnop trick to create the pool (ashfit=12), ZFS booting doesn't work at all. I have a script to build my pool with and without using gnop. Without gnop, I create the pool like this: zpool create $TANK gpt/${DISK0} gpt/${DISK1} With gnop, the pool is created like this: gnop create -S 4096 gpt/${DISK0} gnop create -S 4096 gpt/${DISK1} zpool create $TANK gpt/${DISK0}.nop gpt/${DISK1}.nop zpool export $TANK gnop destroy gpt/${DISK0}.nop gnop destroy gpt/${DISK1}.nop zpool import $TANK Other than this, the pools are set up identically by the script. Without gnop I'm golden. With gnop, on boot I get the spinner and then the computer reboots (it looks like boot0 fails). Any ideas what might be going on? 
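For reference, the setting can be confirmed on the imported pool with something like:

  # zdb $TANK | grep ashift

which should report ashift: 12 for the gnop-created pool and ashift: 9 for the plain one.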
Thanks, Geoff From owner-freebsd-fs@FreeBSD.ORG Fri Jul 22 05:35:33 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4D2681065670 for ; Fri, 22 Jul 2011 05:35:33 +0000 (UTC) (envelope-from artemb@gmail.com) Received: from mail-wy0-f182.google.com (mail-wy0-f182.google.com [74.125.82.182]) by mx1.freebsd.org (Postfix) with ESMTP id D87398FC08 for ; Fri, 22 Jul 2011 05:35:32 +0000 (UTC) Received: by wyg24 with SMTP id 24so1679816wyg.13 for ; Thu, 21 Jul 2011 22:35:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type; bh=OXN/Jn+BATndHwFVCvdNfEia4U9a4WUA1R0RL6loQx8=; b=T7Uiqj0Kn5dBkRySFYYQdXUsRXDmmNsAu/bZKc+YKjC+MlNcqYsdGRS0mL/GLzGOe7 h2JE3/TsDQ8OC6ZLI/1RgMq8Vu1PIP7QlVqotcqTQ+KLnv2kwU243rcjnGm5Iwb3hgDA OT01xAhWE8TkPkClhoMBK7b2p8sLtiPVp8yQ0= MIME-Version: 1.0 Received: by 10.216.137.4 with SMTP id x4mr1460924wei.53.1311312931608; Thu, 21 Jul 2011 22:35:31 -0700 (PDT) Sender: artemb@gmail.com Received: by 10.216.9.8 with HTTP; Thu, 21 Jul 2011 22:35:31 -0700 (PDT) In-Reply-To: References: Date: Thu, 21 Jul 2011 22:35:31 -0700 X-Google-Sender-Auth: uH0tlK0pVWnHnCxTzRKlUJHEZqk Message-ID: From: Artem Belevich To: Geoffrey Mainland Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs@freebsd.org Subject: Re: Can't get 8.2-STABLE to boot from ZFS v28 with ashift=12 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Jul 2011 05:35:33 -0000 On Thu, Jul 21, 2011 at 9:12 PM, Geoffrey Mainland wrote: > I'm trying to get a system up and running with a couple new WD EARS > drives on 8.2-STABLE (built July 20, so it has ZFS v28 support). I'd > like to be able to boot from a ZFS pool where I've used the gnop trick > when creating the pool to make sure ZFS accesses my drive in 4k chunks. > Booting from ZFS without using gnop (so ashift=9) works beautifully, but > when I use the gnop trick to create the pool (ashfit=12), ZFS booting > doesn't work at all. I have a script to build my pool with and without > using gnop. Without gnop, I create the pool like this: > > zpool create $TANK gpt/${DISK0} gpt/${DISK1} > > With gnop, the pool is created like this: > > gnop create -S 4096 gpt/${DISK0} > gnop create -S 4096 gpt/${DISK1} > zpool create $TANK gpt/${DISK0}.nop gpt/${DISK1}.nop > > zpool export $TANK > > gnop destroy gpt/${DISK0}.nop > gnop destroy gpt/${DISK1}.nop > > zpool import $TANK > > Other than this, the pools are set up identically by the script. > > Without gnop I'm golden. With gnop, on boot I get the spinner and then > the computer reboots (it looks like boot0 fails). > > Any ideas what might be going on? Did you install updated bootloader, too? If not, you probably will have trouble booting. 
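With a GPT layout that means rewriting the boot code on each disk after installing the new world, something along these lines (the freebsd-boot partition index and the disk names are only examples, adjust them to your layout):

  # gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada0
  # gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ada1

A gptzfsboot built before the v28 import doesn't understand the new pool format, and a stale boot block on a freshly created pool can produce the reboot-at-the-spinner behaviour you describe.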
--Artem > > Thanks, > Geoff > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@FreeBSD.ORG Fri Jul 22 05:55:36 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 28FAB106564A; Fri, 22 Jul 2011 05:55:36 +0000 (UTC) (envelope-from mainland@gmail.com) Received: from mail-iy0-f182.google.com (mail-iy0-f182.google.com [209.85.210.182]) by mx1.freebsd.org (Postfix) with ESMTP id DCEFB8FC16; Fri, 22 Jul 2011 05:55:35 +0000 (UTC) Received: by iyb11 with SMTP id 11so2149266iyb.13 for ; Thu, 21 Jul 2011 22:55:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type; bh=kMgN2hWahDhl3TwfdDSBqlm1NmVhOuIY34WJUQK6BNU=; b=jAygS5uIua7EMNW0UI6MwDk9VbxFhhLPNSa/HjHvnewCXrQ3uS2Mmi0EVejTyNlBEE 55dkXkrvc4lvapd2knXVcwZSQ69b4oPb2oTvykW7AH7KEXq8FG8ZK0QZevsxKsODFnxb q/U+jFAFWHwotdGEwiO5WxL1OIeR0SEBLOh/c= Received: by 10.231.116.67 with SMTP id l3mr972938ibq.175.1311314135276; Thu, 21 Jul 2011 22:55:35 -0700 (PDT) MIME-Version: 1.0 Sender: mainland@gmail.com Received: by 10.231.33.68 with HTTP; Thu, 21 Jul 2011 22:55:15 -0700 (PDT) In-Reply-To: References: From: Geoffrey Mainland Date: Fri, 22 Jul 2011 01:55:15 -0400 X-Google-Sender-Auth: 9EDDUQ1JPKVWqQssFCU28VIoHKU Message-ID: To: Artem Belevich Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs@freebsd.org Subject: Re: Can't get 8.2-STABLE to boot from ZFS v28 with ashift=12 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Jul 2011 05:55:36 -0000 On Fri, Jul 22, 2011 at 1:35 AM, Artem Belevich wrote: > On Thu, Jul 21, 2011 at 9:12 PM, Geoffrey Mainland wrote: >> I'm trying to get a system up and running with a couple new WD EARS >> drives on 8.2-STABLE (built July 20, so it has ZFS v28 support). I'd >> like to be able to boot from a ZFS pool where I've used the gnop trick >> when creating the pool to make sure ZFS accesses my drive in 4k chunks. >> Booting from ZFS without using gnop (so ashift=9) works beautifully, but >> when I use the gnop trick to create the pool (ashfit=12), ZFS booting >> doesn't work at all. I have a script to build my pool with and without >> using gnop. Without gnop, I create the pool like this: >> >> zpool create $TANK gpt/${DISK0} gpt/${DISK1} >> >> With gnop, the pool is created like this: >> >> gnop create -S 4096 gpt/${DISK0} >> gnop create -S 4096 gpt/${DISK1} >> zpool create $TANK gpt/${DISK0}.nop gpt/${DISK1}.nop >> >> zpool export $TANK >> >> gnop destroy gpt/${DISK0}.nop >> gnop destroy gpt/${DISK1}.nop >> >> zpool import $TANK >> >> Other than this, the pools are set up identically by the script. >> >> Without gnop I'm golden. With gnop, on boot I get the spinner and then >> the computer reboots (it looks like boot0 fails). >> >> Any ideas what might be going on? > > Did you install updated bootloader, too? If not, you probably will > have trouble booting. Yes, I absolutely did use gpart to install an up-to-date bootloader. 
Geoff From owner-freebsd-fs@FreeBSD.ORG Fri Jul 22 13:40:12 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 73D48106564A for ; Fri, 22 Jul 2011 13:40:12 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 63BF08FC16 for ; Fri, 22 Jul 2011 13:40:12 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p6MDeCBn060228 for ; Fri, 22 Jul 2011 13:40:12 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p6MDeCWM060227; Fri, 22 Jul 2011 13:40:12 GMT (envelope-from gnats) Date: Fri, 22 Jul 2011 13:40:12 GMT Message-Id: <201107221340.p6MDeCWM060227@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Glen Barber Cc: Subject: Re: kern/159077: Can't cd .. with latest zfs version X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Glen Barber List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Jul 2011 13:40:12 -0000 The following reply was made to PR kern/159077; it has been noted by GNATS. From: Glen Barber To: Michael Haro Cc: freebsd-gnats-submit@FreeBSD.org Subject: Re: kern/159077: Can't cd .. with latest zfs version Date: Fri, 22 Jul 2011 09:39:12 -0400 On 7/21/11 11:43 PM, Michael Haro wrote: >> Are your kernel and userland in sync? I believe with ZFS v28 the ZFS >> version should be 4, not 3. May not be the problem though. >> > > Kernel and userland were both build and installed on the same day; I just > haven't run zfs upgrade yet. Is that required for cd .. to work? > I wouldn't expect it to be, but perhaps that is the cause of the problem you are seeing. 
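For what it's worth, you can see what the installed tools support without upgrading anything:

  # zfs upgrade -v
  # zpool upgrade -v

Both only print the list of supported filesystem/pool versions, so they are safe to run and should at least confirm whether the new world supports what you expect.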
-- Glen Barber From owner-freebsd-fs@FreeBSD.ORG Fri Jul 22 17:54:42 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9D1A01065674 for ; Fri, 22 Jul 2011 17:54:42 +0000 (UTC) (envelope-from freebsd@deman.com) Received: from plato.corp.nas.com (plato.corp.nas.com [66.114.32.138]) by mx1.freebsd.org (Postfix) with ESMTP id 6073C8FC1B for ; Fri, 22 Jul 2011 17:54:42 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by plato.corp.nas.com (Postfix) with ESMTP id B06BBE9C5F19; Fri, 22 Jul 2011 10:54:41 -0700 (PDT) X-Virus-Scanned: amavisd-new at corp.nas.com Received: from plato.corp.nas.com ([127.0.0.1]) by localhost (plato.corp.nas.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id K95OJQxHTja6; Fri, 22 Jul 2011 10:54:41 -0700 (PDT) Received: from [192.168.196.147] (unknown [192.168.196.147]) by plato.corp.nas.com (Postfix) with ESMTPSA id 0F31CE9C5F0D; Fri, 22 Jul 2011 10:54:41 -0700 (PDT) Mime-Version: 1.0 (Apple Message framework v1084) Content-Type: text/plain; charset=us-ascii From: Michael DeMan In-Reply-To: <20110721044038.GA57436@icarus.home.lan> Date: Fri, 22 Jul 2011 10:54:39 -0700 Content-Transfer-Encoding: quoted-printable Message-Id: <0542453F-C96F-45D8-8697-4D181DFED03D@deman.com> References: <188D255F-B83A-4B9A-89AF-9BF58050F816@deman.com> <20110721044038.GA57436@icarus.home.lan> To: Jeremy Chadwick X-Mailer: Apple Mail (2.1084) Cc: freebsd-fs@freebsd.org Subject: Re: Marvell 88SX6081 timeouts, particularly when running 'zfs scrub' with regular I/O X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Jul 2011 17:54:42 -0000 Hi Jeremy, Info is: FreeBSD freenas.n.bli.openaccess.org 8.2-RELEASE-p2 FreeBSD = 8.2-RELEASE-p2 #0: Tue Jul 12 12:11:37 PDT 2011 = jpaetzel@servant.iXsystems.com:/b/home/jpaetzel/sf_freenas_build/obj.amd64= /b/home/jpaetzel/sf_freenas_build/FreeBSD/src/sys/FREENAS.amd64 amd64 Also, the problem I am seeing with the mvs driver seems suspiciously = close to a problem with ahci driver that is patched in 9-CURRENT = apparently by disabling NCQ? http://forums.freebsd.org/showthread.php?t=3D20412 The problem only occurs under high I/O - basically I can force it to = occur by kicking off a zfs scrub while there is also about 300 IOPS of = NFS traffic at the same time. On Jul 20, 2011, at 9:40 PM, Jeremy Chadwick wrote: > On Wed, Jul 20, 2011 at 08:59:32PM -0700, Michael DeMan wrote: >> I've found a few posts around about this, but nothing conclusive. >>=20 >> We have been getting hit on this with two... >> mvs0: port 0x9400-0x94ff mem = 0xfc400000-0xfc4fffff irq 28 at device 1.0 on pci1 >> mvs1: port 0x9800-0x98ff mem = 0xfc500000-0xfc5fffff irq 29 at device 3.0 on pci1 >> ...controllers. >>=20 >> I went through and did a few things (an older Opteron 285 box) and = disabled super-pages and permutations on other device.hints, loader.conf = and live sysctl settings - all to no avail. >>=20 >> I also found a few things via Google about being to patch from = 9-CURRENT, but the idea with this box was to be able to re-purpose some = older equipment for proof of concept using FreeNAS8. 
>>=20 >> It is possible for me to build a version of that with the patches, = etc - but I figured it would be better to post to the list first and = gather feedback since this is pretty old/clunky hardware and newer = patches may or may not solve the problem. >>=20 >> Thanks, >>=20 >> - mike deman >>=20 >>=20 >>=20 >> Jul 19 16:46:41 freenas kernel: mvsch11: Timeout on slot 0 >> Jul 19 16:46:41 freenas kernel: mvsch11: iec 02000000 sstat 00000123 = serr 00000000 edma_s 00001023 dma_c 00000000 dma_s 00000000 rs 00000201 = status 40 >> Jul 19 16:46:41 freenas kernel: mvsch11: ... waiting for slots = 00000200 >> Jul 19 16:46:43 freenas kernel: mvsch4: Timeout on slot 4 >> Jul 19 16:46:43 freenas kernel: mvsch4: iec 02000000 sstat 00000123 = serr 00000000 edma_s 00001022 dma_c 00000000 dma_s 00000000 rs 00000010 = status 40 >> Jul 19 16:46:45 freenas kernel: mvsch1: Timeout on slot 0 >> Jul 19 16:46:45 freenas kernel: mvsch1: iec 02000000 sstat 00000123 = serr 00000000 edma_s 00001101 dma_c 00000000 dma_s 00000000 rs 00000001 = status 40 >> Jul 19 16:46:46 freenas kernel: mvsch11: Timeout on slot 9 >> Jul 19 16:46:46 freenas kernel: mvsch11: iec 02000000 sstat 00000123 = serr 00000000 edma_s 00001023 dma_c 00000000 dma_s 00000000 rs 00000201 = status 40 >> Jul 19 16:46:47 freenas root: ZFS: checksum mismatch, zpool=3Dzpmir1 = path=3D/dev/gpt/ada12 offset=3D52341838336 size=3D131072 >=20 > You didn't disclose what FreeBSD version you're running and what your > kernel/world build date is. It matters greatly since it will then = give > us some idea what exact source revision you're using for the mvs(4) > driver. "uname -a" output is sufficient. >=20 > Thanks. >=20 > --=20 > | Jeremy Chadwick jdc at parodius.com | > | Parodius Networking http://www.parodius.com/ | > | UNIX Systems Administrator Mountain View, CA, US | > | Making life hard for others since 1977. 
PGP 4BD6C0CB | >=20 From owner-freebsd-fs@FreeBSD.ORG Fri Jul 22 22:16:51 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 032F3106564A for ; Fri, 22 Jul 2011 22:16:51 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id B19BD8FC14 for ; Fri, 22 Jul 2011 22:16:50 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: Ap4EAFP2KU6DaFvO/2dsb2JhbABTG4Qxo3eJAKpjkHqFMIEPBJJuiDGISg X-IronPort-AV: E=Sophos;i="4.67,249,1309752000"; d="c'?scan'208";a="132004980" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 22 Jul 2011 18:16:49 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id B9430B3F66; Fri, 22 Jul 2011 18:16:49 -0400 (EDT) Date: Fri, 22 Jul 2011 18:16:49 -0400 (EDT) From: Rick Macklem To: Clinton Adams Message-ID: <1730895125.912894.1311373009726.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_Part_912893_1681470114.1311373009724" X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: FreeBSD FS Subject: Re: nfsd server cache flooded, try to increase nfsrc_floodlevel X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Jul 2011 22:16:51 -0000 ------=_Part_912893_1681470114.1311373009724 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Clinton Adams wrote: [stuff snipped for brevity] > > Running four clients now and the LockOwners are steadily climbing, > nfsstat consistently reported it as 0 prior to users logging into the > nfsv4 test systems - my testing via ssh didn't show anything like > this. Attached tcpdump file is from when I first noticed the jump in > LockOwners from 0 to ~600. I tried wireshark on this and didn't see > any releaselockowner operations. > [stuff snipped for brevity] > OpenOwner Opens LockOwner Locks Delegs > 6 242 2481 22 0 > Server Cache Stats: > Inprog Idem Non-idem Misses CacheSize TCPPeak > 0 0 2 2518251 2502 4772 > I've written a small test program: http://people.freebsd.org/~rmacklem/childlock.c (also attached) where a parent process opens a file and then forks children that do lock ops and then exit. (I'm guessing that this is what some process in your clients are doing, that result in the LockOwner count growing.) When I run this program on Fedora15, it generates ReleaseLockOwner Ops and the LockOwner count doesn't increase as it runs. You can run this program by giving it an argument that can be any file on the nfsv4 mount for which you have read/write access, then watch the server via "nfsstat -e -s" to see if the LockOwner count increases. If the LockOwner count does increase, then it appears that a newer Linux kernel will avoid the problem. If you are interested in what the packet trace looks like when running the program on Fedora15, it's at: http://people.freebsd.org/~rmacklem/childlock.pcap rick ps: The FreeBSD NFSv4 client doesn't currently generate the ReleaseLockOwner Ops for this case either. 
I need to come up with a patch that does that. ------=_Part_912893_1681470114.1311373009724-- From owner-freebsd-fs@FreeBSD.ORG Fri Jul 22 21:48:29 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D71E9106566B for ; Fri, 22 Jul 2011 21:48:29 +0000 (UTC) (envelope-from grarpamp@gmail.com) Received: from mail-gw0-f54.google.com (mail-gw0-f54.google.com [74.125.83.54]) by mx1.freebsd.org (Postfix) with ESMTP id 9E7228FC15 for ; Fri, 22 Jul 2011 21:48:29 +0000 (UTC) Received: by gwb15 with SMTP id 15so2282920gwb.13 for ; Fri, 22 Jul 2011 14:48:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; bh=B2R+azWvDZegosolU8HC16eZtc9Q9NUGADWhEuYpbH0=; b=jHm6oQVHehDc45YKxGJhQ3Bx4bhrrRcYCfoFdkDtaP4HrTm9TJGlxAWJss1hgX7Xeq C4/q2XGhv0P/AN2L6Mpr74q2F6BtRLFT8MMoWvcCrgksx+aGnEGaEVHhfXYkjjd49a9e c1h/DRhFMREznbSaHjZFHt3GiQTBK4Eg4jD10= MIME-Version: 1.0 Received: by 10.142.120.1 with SMTP id s1mr1078816wfc.252.1311369762553; Fri, 22 Jul 2011 14:22:42 -0700 (PDT) Received: by 10.142.201.1 with HTTP; Fri, 22 Jul 2011 14:22:42 -0700 (PDT) Date: Fri, 22 Jul 2011 17:22:42 -0400 Message-ID: From: grarpamp To: freebsd-hardware@freebsd.org Content-Type: text/plain; charset=UTF-8 X-Mailman-Approved-At: Fri, 22 Jul 2011 23:12:37 +0000 Cc: Subject: Silicon Image programming docs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Jul 2011 21:48:29 -0000 Found some datasheets (programming docs) and board schematics for Silicon Image storage controllers. Since they don't seem to be publicly available, perhaps some of these docs will be useful. Bcc'd relevant fs and hackers lists. Reply to hardware I guess. 
# Overview http://www.siliconimage.com/products/family.aspx?id=3 http://www.siliconimage.com/docs/Site_PDFs/Product_Guide_3Q_2011.pdf # SiI0680 http://www.siliconimage.com/docs/SiI-DS-0069-C.pdf http://www.siliconimage.com/docs/SiI-SC-0094-A.PDF # SiI3114 http://www.siliconimage.com/docs/SiI-DS-0103-D.pdf http://www.siliconimage.com/docs/SiI-SC-0057-B.PDF # SiI3124 http://www.siliconimage.com/docs/SiI-DS-0160-C.pdf http://www.siliconimage.com/docs/312404_B00_092704.PDF # SiI3132 http://www.siliconimage.com/docs/SiI-DS-0136-B.pdf http://www.siliconimage.com/docs/SiI-DS-0138-E.pdf http://www.siliconimage.com/docs/SiI-SC-0097_doc.pdf # SiI3512 http://www.siliconimage.com/docs/SiI-DS-0102-D.pdf http://www.siliconimage.com/docs/SiI-DS-0107-C.pdf http://www.siliconimage.com/docs/SiI-SC-0056-B.PDF # SiI3531 http://www.siliconimage.com/docs/SiI-DS-0208-C.pdf http://www.siliconimage.com/docs/SiI-SC-215.PDF # SiI3726 http://www.siliconimage.com/docs/FINAL%20SiI3726%20Product%20Brief_02-16-2011.pdf http://www.siliconimage.com/docs/SiI-DS-0121-C1.pdf # SiI3811 http://www.siliconimage.com/docs/SiI3811_PB58_FINAL_8-17-06.pdf http://www.siliconimage.com/docs/SiI-SC-0231-A.PDF From owner-freebsd-fs@FreeBSD.ORG Sat Jul 23 09:30:15 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8AF921065673 for ; Sat, 23 Jul 2011 09:30:15 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 617DC8FC0A for ; Sat, 23 Jul 2011 09:30:15 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p6N9UFfE080191 for ; Sat, 23 Jul 2011 09:30:15 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p6N9UFR6080188; Sat, 23 Jul 2011 09:30:15 GMT (envelope-from gnats) Date: Sat, 23 Jul 2011 09:30:15 GMT Message-Id: <201107230930.p6N9UFR6080188@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Martin Matuska Cc: Subject: Re: kern/159077: [zfs] Can't cd .. with latest zfs version X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Martin Matuska List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Jul 2011 09:30:15 -0000 The following reply was made to PR kern/159077; it has been noted by GNATS. From: Martin Matuska To: bug-followup@FreeBSD.org, mharo@freebsd.org Cc: Subject: Re: kern/159077: [zfs] Can't cd .. with latest zfs version Date: Sat, 23 Jul 2011 11:28:55 +0200 It should work with zpl version 3 (no need to do zfs upgrade) and I have never seen this before. Is there any way to reproduce this or take a closer look? 
-- Martin Matuska FreeBSD committer http://blog.vx.sk From owner-freebsd-fs@FreeBSD.ORG Sat Jul 23 12:10:14 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C00CD106566C for ; Sat, 23 Jul 2011 12:10:14 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 966178FC1F for ; Sat, 23 Jul 2011 12:10:14 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p6NCAE5p030282 for ; Sat, 23 Jul 2011 12:10:14 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p6NCAEI7030281; Sat, 23 Jul 2011 12:10:14 GMT (envelope-from gnats) Date: Sat, 23 Jul 2011 12:10:14 GMT Message-Id: <201107231210.p6NCAEI7030281@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Fabian Keil Cc: Subject: Re: kern/159077: [zfs] Can't cd .. with latest zfs version X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Fabian Keil List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Jul 2011 12:10:14 -0000 The following reply was made to PR kern/159077; it has been noted by GNATS. From: Fabian Keil To: bug-followup@FreeBSD.org, mharo@freebsd.org Cc: Subject: Re: kern/159077: [zfs] Can't cd .. with latest zfs version Date: Sat, 23 Jul 2011 13:43:26 +0200 --Sig_/Dci3z9B3kZCsY_I=wEyT+/a Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: quoted-printable Do you have the problem in every directory or only in some? Can you "cd .." as root? Does the '..' link exist and is it accessible? I ask because the problem looks familiar to me: fk@r500 ~/git/privoxy $/usr/bin/cd .. cd: ..: No such file or directory fk@r500 ~/git/privoxy $stat -x .. stat: ..: stat: Permission denied fk@r500 ~/git/privoxy $sudo stat -x .. File: ".." Size: 33 FileType: Directory Mode: (0755/drwxr-xr-x) Uid: ( 1001/ fk) Gid: ( 1001/ = fk) Device: 113,472580146 Inode: 3 Links: 33 Access: Fri Mar 12 23:05:31 2010 Modify: Sat Jul 23 13:26:27 2011 Change: Sat Jul 23 13:26:27 2011 fk@r500 ~/git/privoxy $stat -x ~/git/ File: "/home/fk/git/" Size: 33 FileType: Directory Mode: (0755/drwxr-xr-x) Uid: ( 1001/ fk) Gid: ( 1001/ = fk) Device: 113,472580146 Inode: 3 Links: 33 Access: Fri Mar 12 23:05:31 2010 Modify: Sat Jul 23 13:26:27 2011 Change: Sat Jul 23 13:26:27 2011 fk@r500 ~/git/privoxy $stat -x ~/git/curl/.. File: "/home/fk/git/curl/.." Size: 33 FileType: Directory Mode: (0755/drwxr-xr-x) Uid: ( 1001/ fk) Gid: ( 1001/ = fk) Device: 113,472580146 Inode: 3 Links: 33 Access: Fri Mar 12 23:05:31 2010 Modify: Sat Jul 23 13:26:27 2011 Change: Sat Jul 23 13:26:27 2011 fk@r500 ~/git/privoxy $stat -x ~/git/privoxy/.. stat: /home/fk/git/privoxy/..: stat: Permission denied So somehow as a user I'm allowed to access "~/git" but not "~/git/privoxy/.= .". bash's builtin cd doesn't seem to use '..', so it continues to work. ~/git, ~/git/privoxy and ~/git/curl are different datasets. Scrubbing the pool doesn't show any issues. If I send/receive a snapshot of ~/git/privoxy, the copy doesn't have the problem. As far as I know, I only have the problem with "~/git/privoxy/..". 
The problem survived several ZFS updates so far, and at least in my case it's neither a regression in ZFSv28 nor serious. Fabian --Sig_/Dci3z9B3kZCsY_I=wEyT+/a Content-Type: application/pgp-signature; name=signature.asc Content-Disposition: attachment; filename=signature.asc -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.17 (FreeBSD) iEYEARECAAYFAk4qs+QACgkQBYqIVf93VJ0UvgCgrKRZDFGC+2/ieE7wBasQXg96 QIAAn0uj2tj8BF+X9QpLBWSL8llfKd23 =4CRh -----END PGP SIGNATURE----- --Sig_/Dci3z9B3kZCsY_I=wEyT+/a--