From owner-freebsd-fs@FreeBSD.ORG Sun Jun 20 13:36:46 2010 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 91C12106566B; Sun, 20 Jun 2010 13:36:46 +0000 (UTC) (envelope-from kib@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 6893B8FC08; Sun, 20 Jun 2010 13:36:46 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id o5KDak6R059714; Sun, 20 Jun 2010 13:36:46 GMT (envelope-from kib@freefall.freebsd.org) Received: (from kib@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id o5KDakDY059710; Sun, 20 Jun 2010 13:36:46 GMT (envelope-from kib) Date: Sun, 20 Jun 2010 13:36:46 GMT Message-Id: <201006201336.o5KDakDY059710@freefall.freebsd.org> To: lynx.ripe@gmail.com, kib@FreeBSD.org, freebsd-fs@FreeBSD.org, kib@FreeBSD.org From: kib@FreeBSD.org Cc: Subject: Re: kern/147890: [ufs] [regression] ufs-related lock problem in RELENG_8 (18.04.2010 -> 20.04.2010) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 20 Jun 2010 13:36:46 -0000 Synopsis: [ufs] [regression] ufs-related lock problem in RELENG_8 (18.04.2010 -> 20.04.2010) State-Changed-From-To: open->patched State-Changed-By: kib State-Changed-When: Sun Jun 20 13:36:07 UTC 2010 State-Changed-Why: No reason not to grab the bug, since I already committed the patch. Responsible-Changed-From-To: freebsd-fs->kib Responsible-Changed-By: kib Responsible-Changed-When: Sun Jun 20 13:36:07 UTC 2010 Responsible-Changed-Why: No reason not to grab the bug, since I already committed the patch. 
http://www.freebsd.org/cgi/query-pr.cgi?pr=147890 From owner-freebsd-fs@FreeBSD.ORG Mon Jun 21 03:30:34 2010 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9EF78106566B; Mon, 21 Jun 2010 03:30:34 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 764E88FC20; Mon, 21 Jun 2010 03:30:34 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id o5L3UYxA071633; Mon, 21 Jun 2010 03:30:34 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id o5L3UYZn071623; Mon, 21 Jun 2010 03:30:34 GMT (envelope-from linimon) Date: Mon, 21 Jun 2010 03:30:34 GMT Message-Id: <201006210330.o5L3UYZn071623@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/147903: [zfs] [panic] Kernel panics on faulty zfs device X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jun 2010 03:30:34 -0000 Old Synopsis: Kernel panics on faulty zfs device New Synopsis: [zfs] [panic] Kernel panics on faulty zfs device Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Mon Jun 21 03:30:16 UTC 2010 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=147903 From owner-freebsd-fs@FreeBSD.ORG Mon Jun 21 11:06:54 2010 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 769F41065675 for ; Mon, 21 Jun 2010 11:06:54 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 6450A8FC08 for ; Mon, 21 Jun 2010 11:06:54 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id o5LB6s1D098233 for ; Mon, 21 Jun 2010 11:06:54 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id o5LB6rSH098231 for freebsd-fs@FreeBSD.org; Mon, 21 Jun 2010 11:06:53 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 21 Jun 2010 11:06:53 GMT Message-Id: <201006211106.o5LB6rSH098231@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-fs@FreeBSD.org Cc: Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jun 2010 11:06:54 -0000

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/147903  fs         [zfs] [panic] Kernel panics on faulty zfs device
o kern/147790  fs         [zfs] zfs set acl(mode|inherit) fails on existing zfs
o kern/147420  fs         [ufs] [panic] ufs_dirbad, nullfs, jail panic (corrupt
o kern/147292  fs         [nfs] [patch] readahead missing in nfs client options
o kern/146708  fs         [ufs] [panic] Kernel panic in softdep_disk_write_compl
o kern/146528  fs         [zfs] Severe memory leak in ZFS on i386
o kern/146502  fs         [nfs] FreeBSD 8 NFS Client Connection to Server
o kern/146375  fs         [nfs] [patch] Typos in macro variables names in sys/fs
o kern/145778  fs         [zfs] [panic] panic in zfs_fuid_map_id (known issue fi
s kern/145712  fs         [zfs] cannot offline two drives in a raidz2 configurat
s kern/145424  fs         [zfs] [patch] move source closer to v15
o kern/145411  fs         [xfs] [panic] Kernel panics shortly after mounting an
o kern/145309  fs         [disklabel]: Editing disk label invalidates the whole
o kern/145272  fs         [zfs] [panic] Panic during boot when accessing zfs on
o kern/145246  fs         [ufs] dirhash in 7.3 gratuitously frees hashes when it
o kern/145238  fs         [zfs] [panic] kernel panic on zpool clear tank
o kern/145229  fs         [zfs] Vast differences in ZFS ARC behavior between 8.0
o kern/145189  fs         [nfs] nfsd performs abysmally under load
o kern/144929  fs         [ufs] [lor] vfs_bio.c + ufs_dirhash.c
o kern/144458  fs         [nfs] [patch] nfsd fails as a kld
p kern/144447  fs         [zfs] sharenfs fsunshare() & fsshare_main() non functi
o kern/144416  fs         [panic] Kernel panic on online filesystem optimization
s kern/144415  fs         [zfs] [panic] kernel panics on boot after zfs crash
o kern/144234  fs         [zfs] Cannot boot machine with recent gptzfsboot code
o kern/143825  fs         [nfs] [panic] Kernel panic on NFS client
o kern/143345  fs         [ext2fs] [patch] extfs minor header cleanups to better
o kern/143212  fs         [nfs] NFSv4 client strange work ...
o kern/143184  fs         [zfs] [lor] zfs/bufwait LOR
o kern/142924  fs         [ext2fs] [patch] Small cleanup for the inode struct in
o kern/142914  fs         [zfs] ZFS performance degradation over time
o kern/142878  fs         [zfs] [vfs] lock order reversal
o kern/142597  fs         [ext2fs] ext2fs does not work on filesystems with real
o kern/142489  fs         [zfs] [lor] allproc/zfs LOR
o kern/142466  fs         Update 7.2 -> 8.0 on Raid 1 ends with screwed raid [re
o kern/142401  fs         [ntfs] [patch] Minor updates to NTFS from NetBSD
o kern/142306  fs         [zfs] [panic] ZFS drive (from OSX Leopard) causes two
o kern/142068  fs         [ufs] BSD labels are got deleted spontaneously
o kern/141897  fs         [msdosfs] [panic] Kernel panic. msdofs: file name leng
o kern/141463  fs         [nfs] [panic] Frequent kernel panics after upgrade fro
o kern/141305  fs         [zfs] FreeBSD ZFS+sendfile severe performance issues (
o kern/141091  fs         [patch] [nullfs] fix panics with DIAGNOSTIC enabled
o kern/141086  fs         [nfs] [panic] panic("nfs: bioread, not dir") on FreeBS
o kern/141010  fs         [zfs] "zfs scrub" fails when backed by files in UFS2
o kern/140888  fs         [zfs] boot fail from zfs root while the pool resilveri
o kern/140661  fs         [zfs] [patch] /boot/loader fails to work on a GPT/ZFS-
o kern/140640  fs         [zfs] snapshot crash
o kern/140134  fs         [msdosfs] write and fsck destroy filesystem integrity
o kern/140068  fs         [smbfs] [patch] smbfs does not allow semicolon in file
o kern/139725  fs         [zfs] zdb(1) dumps core on i386 when examining zpool c
o kern/139715  fs         [zfs] vfs.numvnodes leak on busy zfs
o bin/139651   fs         [nfs] mount(8): read-only remount of NFS volume does n
o kern/139597  fs         [patch] [tmpfs] tmpfs initializes va_gen but doesn't u
o kern/139564  fs         [zfs] [panic] 8.0-RC1 - Fatal trap 12 at end of shutdo
o kern/139407  fs         [smbfs] [panic] smb mount causes system crash if remot
o kern/139363  fs         [nfs] diskless root nfs mount from non FreeBSD server
o kern/138790  fs         [zfs] ZFS ceases caching when mem demand is high
o kern/138421  fs         [ufs] [patch] remove UFS label limitations
o kern/138202  fs         mount_msdosfs(1) see only 2Gb
f kern/137037  fs         [zfs] [hang] zfs rollback on root causes FreeBSD to fr
o kern/136968  fs         [ufs] [lor] ufs/bufwait/ufs (open)
o kern/136945  fs         [ufs] [lor] filedesc structure/ufs (poll)
o kern/136944  fs         [ffs] [lor] bufwait/snaplk (fsync)
o kern/136873  fs         [ntfs] Missing directories/files on NTFS volume
o kern/136865  fs         [nfs] [patch] NFS exports atomic and on-the-fly atomic
o kern/136470  fs         [nfs] Cannot mount / in read-only, over NFS
o kern/135546  fs         [zfs] zfs.ko module doesn't ignore zpool.cache filenam
o kern/135469  fs         [ufs] [panic] kernel crash on md operation in ufs_dirb
o kern/135050  fs         [zfs] ZFS clears/hides disk errors on reboot
o kern/134491  fs         [zfs] Hot spares are rather cold...
o kern/133676  fs         [smbfs] [panic] umount -f'ing a vnode-based memory dis
o kern/133614  fs         [panic] panic: ffs_truncate: read-only filesystem
o kern/133174  fs         [msdosfs] [patch] msdosfs must support utf-encoded int
f kern/133150  fs         [zfs] Page fault with ZFS on 7.1-RELEASE/amd64 while w
o kern/132960  fs         [ufs] [panic] panic:ffs_blkfree: freeing free frag
o kern/132397  fs         reboot causes filesystem corruption (failure to sync b
o kern/132331  fs         [ufs] [lor] LOR ufs and syncer
o kern/132237  fs         [msdosfs] msdosfs has problems to read MSDOS Floppy
o kern/132145  fs         [panic] File System Hard Crashes
o kern/131441  fs         [unionfs] [nullfs] unionfs and/or nullfs not combineab
o kern/131360  fs         [nfs] poor scaling behavior of the NFS server under lo
o kern/131342  fs         [nfs] mounting/unmounting of disks causes NFS to fail
o bin/131341   fs         makefs: error "Bad file descriptor" on the mount poin
o kern/130920  fs         [msdosfs] cp(1) takes 100% CPU time while copying file
o kern/130229  fs         [iconv] usermount fails on fs that need iconv
o kern/130210  fs         [nullfs] Error by check nullfs
o kern/129760  fs         [nfs] after 'umount -f' of a stale NFS share FreeBSD l
o kern/129488  fs         [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c:
o kern/129231  fs         [ufs] [patch] New UFS mount (norandom) option - mostly
o kern/129152  fs         [panic] non-userfriendly panic when trying to mount(8)
o kern/129059  fs         [zfs] [patch] ZFS bootloader whitelistable via WITHOUT
f kern/128829  fs         smbd(8) causes periodic panic on 7-RELEASE
o kern/127420  fs         [gjournal] [panic] Journal overflow on gmirrored gjour
o kern/127029  fs         [panic] mount(8): trying to mount a write protected zi
o kern/126287  fs         [ufs] [panic] Kernel panics while mounting an UFS file
o kern/125895  fs         [ffs] [panic] kernel: panic: ffs_blkfree: freeing free
s kern/125738  fs         [zfs] [request] SHA256 acceleration in ZFS
p kern/124621  fs         [ext3] [patch] Cannot mount ext2fs partition
f bin/124424   fs         [zfs] zfs(8): zfs list -r shows strange snapshots' siz
o kern/123939  fs         [msdosfs] corrupts new files
o kern/122380  fs         [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash
o bin/122172   fs         [fs]: amd(8) automount daemon dies on 6.3-STABLE i386,
o bin/121898   fs         [nullfs] pwd(1)/getcwd(2) fails with Permission denied
o bin/121779   fs         [ufs] snapinfo(8) (and related tools?) only work for t
o bin/121366   fs         [zfs] [patch] Automatic disk scrubbing from periodic(8
o bin/121072   fs         [smbfs] mount_smbfs(8) cannot normally convert the cha
f kern/120991  fs         [panic] [fs] [snapshot] System crashes when manipulati
o kern/120483  fs         [ntfs] [patch] NTFS filesystem locking changes
o kern/120482  fs         [ntfs] [patch] Sync style changes between NetBSD and F
f kern/119735  fs         [zfs] geli + ZFS + samba starting on boot panics 7.0-B
o kern/118912  fs         [2tb] disk sizing/geometry problem with large array
o kern/118713  fs         [minidump] [patch] Display media size required for a k
o bin/118249   fs         mv(1): moving a directory changes its mtime
o kern/118107  fs         [ntfs] [panic] Kernel panic when accessing a file at N
o bin/117315   fs         [smbfs] mount_smbfs(8) and related options can't mount
o kern/117314  fs         [ntfs] Long-filename only NTFS fs'es cause kernel pani
o kern/117158  fs         [zfs] zpool scrub causes panic if geli vdevs detach on
o bin/116980   fs         [msdosfs] [patch] mount_msdosfs(8) resets some flags f
o conf/116931  fs         lack of fsck_cd9660 prevents mounting iso images with
o kern/116913  fs         [ffs] [panic] ffs_blkfree: freeing free block
p kern/116608  fs         [msdosfs] [patch] msdosfs fails to check mount options
o kern/116583  fs         [ffs] [hang] System freezes for short time when using
o kern/116170  fs         [panic] Kernel panic when mounting /tmp
o kern/115645  fs         [snapshots] [panic] lockmgr: thread 0xc4c00d80, not ex
o bin/115361   fs         [zfs] mount(8) gets into a state where it won't set/un
o kern/114955  fs         [cd9660] [patch] [request] support for mask,dirmask,ui
o kern/114847  fs         [ntfs] [patch] [request] dirmask support for NTFS ala
o kern/114676  fs         [ufs] snapshot creation panics: snapacct_ufs2: bad blo
o bin/114468   fs         [patch] [request] add -d option to umount(8) to detach
o kern/113852  fs         [smbfs] smbfs does not properly implement DFS referral
o bin/113838   fs         [patch] [request] mount(8): add support for relative p
o bin/113049   fs         [patch] [request] make quot(8) use getopt(3) and show
o kern/112658  fs         [smbfs] [patch] smbfs and caching problems (resolves b
o kern/111843  fs         [msdosfs] Long Names of files are incorrectly created
o kern/111782  fs         [ufs] dump(8) fails horribly for large filesystems
s bin/111146   fs         [2tb] fsck(8) fails on 6T filesystem
o kern/109024  fs         [msdosfs] mount_msdosfs: msdosfs_iconv: Operation not
o kern/109010  fs         [msdosfs] can't mv directory within fat32 file system
o bin/107829   fs         [2TB] fdisk(8): invalid boundary checking in fdisk / w
o kern/106107  fs         [ufs] left-over fsck_snapshot after unfinished backgro
o kern/106030  fs         [ufs] [panic] panic in ufs from geom when a dead disk
o kern/104406  fs         [ufs] Processes get stuck in "ufs" state under persist
o kern/104133  fs         [ext2fs] EXT2FS module corrupts EXT2/3 filesystems
o kern/103035  fs         [ntfs] Directories in NTFS mounted disc images appear
o kern/101324  fs         [smbfs] smbfs sometimes not case sensitive when it's s
o kern/99290   fs         [ntfs] mount_ntfs ignorant of cluster sizes
o kern/97377   fs         [ntfs] [patch] syntax cleanup for ntfs_ihash.c
o kern/95222   fs         [iso9660] File sections on ISO9660 level 3 CDs ignored
o kern/94849   fs         [ufs] rename on UFS filesystem is not atomic
o kern/94769   fs         [ufs] Multiple file deletions on multi-snapshotted fil
o kern/94733   fs         [smbfs] smbfs may cause double unlock
o kern/93942   fs         [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D
o kern/92272   fs         [ffs] [hang] Filling a filesystem while creating a sna
f kern/91568   fs         [ufs] [panic] writing to UFS/softupdates DVD media in
o kern/91134   fs         [smbfs] [patch] Preserve access and modification time
a kern/90815   fs         [smbfs] [patch] SMBFS with character conversions somet
o kern/88657   fs         [smbfs] windows client hang when browsing a samba shar
o kern/88266   fs         [smbfs] smbfs does not implement UIO_NOCOPY and sendfi
o kern/87859   fs         [smbfs] System reboot while umount smbfs.
o kern/86587   fs         [msdosfs] rm -r /PATH fails with lots of small files
o kern/85326   fs         [smbfs] [panic] saving a file via samba to an overquot
o kern/84589   fs         [2TB] 5.4-STABLE unresponsive during background fsck 2
o kern/80088   fs         [smbfs] Incorrect file time setting on NTFS mounted vi
o kern/73484   fs         [ntfs] Kernel panic when doing `ls` from the client si
o bin/73019    fs         [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino
o kern/71774   fs         [ntfs] NTFS cannot "see" files on a WinXP filesystem
o kern/68978   fs         [panic] [ufs] crashes with failing hard disk, loose po
o kern/65920   fs         [nwfs] Mounted Netware filesystem behaves strange
o kern/65901   fs         [smbfs] [patch] smbfs fails fsx write/truncate-down/tr
o kern/61503   fs         [smbfs] mount_smbfs does not work as non-root
o kern/55617   fs         [smbfs] Accessing an nsmb-mounted drive via a smb expo
o kern/53137   fs         [ffs] [panic] background fscking causing ffs_valloc pa
o kern/51685   fs         [hang] Unbounded inode allocation causes kernel to loc
o kern/51583   fs         [nullfs] [patch] allow to work with devices and socket
o kern/36566   fs         [smbfs] System reboot with dead smb mount and umount
o kern/33464   fs         [ufs] soft update inconsistencies after system crash
o kern/18874   fs         [2TB] 32bit NFS servers export wrong negative values t

176 problems total.

From owner-freebsd-fs@FreeBSD.ORG Mon Jun 21 13:26:52 2010 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F36831065676 for ; Mon, 21 Jun 2010 13:26:51 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id 25F618FC1E for ; Mon, 21 Jun 2010 13:26:50 +0000 (UTC) Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id o5LCwPqo033551 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 21 Jun 2010 15:58:25 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4) with ESMTP id o5LCwPjx078808; Mon, 21 Jun 2010 15:58:25 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4/Submit) id o5LCwPXx078807; Mon, 21 Jun 2010 15:58:25 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Mon, 21 Jun 2010 15:58:25 +0300 From: Kostik Belousov To: fs@freebsd.org Message-ID: <20100621125825.GG13238@deviant.kiev.zoral.com.ua> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="zd5GkkQQtETumrwc" Content-Disposition: inline User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-2.3 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_50, DNS_FROM_OPENWHOIS autolearn=no version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc:
alc@freebsd.org, pho@freebsd.org Subject: Tmpfs elimination of double-copy X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jun 2010 13:26:52 -0000 --zd5GkkQQtETumrwc Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

Hi,

Below is a patch that eliminates the second copy of the data that tmpfs
keeps when a file is mapped. It also removes potential deadlocks caused by
tmpfs doing copyin/copyout while a page is busy. The patch may also fix the
known issue with sendfile(2) on a tmpfs file, but I have not verified this.

The patch essentially consists of three parts:
- moving vnp_size out of the type-discriminated union in struct vm_object
  and into the vm_object proper;
- making the VM not choke when the VM object held in the v_object field of
  struct vnode is a default or swap object instead of a vnode object;
- using the swap object that holds the data of a tmpfs VREG file as the
  vnode's v_object as well.

Peter Holm helped me with the patch; it apparently survives fsx and stress2.
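As a rough illustration of the first part only, here is a minimal user-space sketch of what it means to move a size field out of a type-discriminated union. It is not code from the patch: every `sk_`-prefixed type, field, and function is hypothetical and greatly simplified compared to the kernel's `struct vm_object`.

```c
/*
 * Hypothetical, simplified user-space model of the layout change.
 * None of these types are the kernel's real definitions.
 */
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>	/* off_t */

enum sk_objtype { SK_OBJT_DEFAULT, SK_OBJT_SWAP, SK_OBJT_VNODE };

/* Before: the size is reachable only through the vnode-pager union arm. */
struct sk_object_old {
	enum sk_objtype type;
	union {
		struct { off_t vnp_size; } vnp;	/* meaningful for vnode objects */
		struct { int swp_blocks; } swp;	/* meaningful for swap objects */
	} un_pager;
};

/* After: the size lives in the object proper and is valid for any type. */
struct sk_object_new {
	enum sk_objtype type;
	union {
		struct { int swp_blocks; } swp;
	} un_pager;
	off_t vnp_size;
};

/* Old layout: only a vnode object can be asked for its file size. */
static int
sk_old_size(const struct sk_object_old *obj, off_t *sizep)
{
	if (obj->type != SK_OBJT_VNODE)
		return (-1);	/* the union currently holds other data */
	*sizep = obj->un_pager.vnp.vnp_size;
	return (0);
}

/* New layout: the caller no longer has to inspect the object type. */
static off_t
sk_new_size(const struct sk_object_new *obj)
{
	return (obj->vnp_size);
}
```

With the old layout a swap-backed object has nowhere to record a file size, which is why code reading the size has to check the object type first. With the new layout a swap object used as a vnode's v_object can carry the size directly, as the tmpfs_reg_resize() hunk of the patch does when it sets uobj->vnp_size.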
diff --git a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
index adeabfb..0cfe0d9 100644
--- a/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
+++ b/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
@@ -339,7 +339,7 @@ again:
 
 			if (vm_page_sleep_if_busy(m, FALSE, "zfsmwb"))
 				goto again;
-			fsize = obj->un_pager.vnp.vnp_size;
+			fsize = obj->vnp_size;
 			vm_page_busy(m);
 			vm_page_lock_queues();
 			vm_page_undirty(m);
diff --git a/sys/fs/tmpfs/tmpfs_subr.c b/sys/fs/tmpfs/tmpfs_subr.c
index b6c5cfe..7297f5a 100644
--- a/sys/fs/tmpfs/tmpfs_subr.c
+++ b/sys/fs/tmpfs/tmpfs_subr.c
@@ -379,13 +379,17 @@ loop:
 		/* FALLTHROUGH */
 	case VLNK:
 		/* FALLTHROUGH */
-	case VREG:
-		/* FALLTHROUGH */
 	case VSOCK:
 		break;
 	case VFIFO:
 		vp->v_op = &tmpfs_fifoop_entries;
 		break;
+	case VREG:
+		VI_LOCK(vp);
+		KASSERT(vp->v_object == NULL, ("Not NULL v_object in tmpfs"));
+		vp->v_object = node->tn_reg.tn_aobj;
+		VI_UNLOCK(vp);
+		break;
 	case VDIR:
 		MPASS(node->tn_dir.tn_parent != NULL);
 		if (node->tn_dir.tn_parent == node)
@@ -396,7 +400,6 @@ loop:
 		panic("tmpfs_alloc_vp: type %p %d", node, (int)node->tn_type);
 	}
 
-	vnode_pager_setsize(vp, node->tn_size);
 	error = insmntque(vp, mp);
 	if (error)
 		vp = NULL;
@@ -849,11 +852,13 @@ tmpfs_dir_getdents(struct tmpfs_node *node, struct uio *uio, off_t *cntp)
 int
 tmpfs_reg_resize(struct vnode *vp, off_t newsize)
 {
-	int error;
-	size_t newpages, oldpages;
 	struct tmpfs_mount *tmp;
 	struct tmpfs_node *node;
+	vm_object_t uobj;
+	vm_page_t m;
 	off_t oldsize;
+	size_t newpages, oldpages, zerolen;
+	int error;
 
 	MPASS(vp->v_type == VREG);
 	MPASS(newsize >= 0);
@@ -883,41 +888,38 @@ tmpfs_reg_resize(struct vnode *vp, off_t newsize)
 	TMPFS_UNLOCK(tmp);
 
 	node->tn_size = newsize;
-	vnode_pager_setsize(vp, newsize);
+	uobj = node->tn_reg.tn_aobj;
+	VM_OBJECT_LOCK(uobj);
 	if (newsize < oldsize) {
-		size_t zerolen = round_page(newsize) - newsize;
-		vm_object_t uobj = node->tn_reg.tn_aobj;
-		vm_page_t m;
-
 		/*
 		 * free "backing store"
 		 */
-		VM_OBJECT_LOCK(uobj);
 		if (newpages < oldpages) {
-			swap_pager_freespace(uobj,
-			    newpages, oldpages - newpages);
-			vm_object_page_remove(uobj,
-			    OFF_TO_IDX(newsize + PAGE_MASK), 0, FALSE);
+			swap_pager_freespace(uobj, newpages, oldpages -
+			    newpages);
+			vm_object_page_remove(uobj, OFF_TO_IDX(newsize +
+			    PAGE_MASK), 0, FALSE);
 		}
 
 		/*
 		 * zero out the truncated part of the last page.
 		 */
-
+		zerolen = round_page(newsize) - newsize;
 		if (zerolen > 0) {
 			m = vm_page_grab(uobj, OFF_TO_IDX(newsize),
 			    VM_ALLOC_NOBUSY | VM_ALLOC_NORMAL | VM_ALLOC_RETRY);
 			pmap_zero_page_area(m, PAGE_SIZE - zerolen, zerolen);
 		}
-		VM_OBJECT_UNLOCK(uobj);
-
 	}
+	uobj->size = newpages;
+	uobj->vnp_size = newsize;
+	VM_OBJECT_UNLOCK(uobj);
 
 	error = 0;
 
 out:
-	return error;
+	return (error);
 }
 
 /* --------------------------------------------------------------------- */
diff --git a/sys/fs/tmpfs/tmpfs_vnops.c b/sys/fs/tmpfs/tmpfs_vnops.c
index 88e0939..97d3cc7 100644
--- a/sys/fs/tmpfs/tmpfs_vnops.c
+++ b/sys/fs/tmpfs/tmpfs_vnops.c
@@ -433,7 +433,6 @@ tmpfs_setattr(struct vop_setattr_args *v)
 	return error;
 }
 
-/* --------------------------------------------------------------------- */
 static int
 tmpfs_nocacheread(vm_object_t tobj, vm_pindex_t idx,
     vm_offset_t offset, size_t tlen, struct uio *uio)
@@ -449,12 +448,14 @@ tmpfs_nocacheread(vm_object_t tobj, vm_pindex_t idx,
 		if (vm_pager_has_page(tobj, idx, NULL, NULL)) {
 			error = vm_pager_get_pages(tobj, &m, 1, 0);
 			if (error != 0) {
+				vm_page_wakeup(m);
 				printf("tmpfs get pages from pager error [read]\n");
 				goto out;
 			}
 		} else
 			vm_page_zero_invalid(m, TRUE);
 	}
+	vm_page_wakeup(m);
 	VM_OBJECT_UNLOCK(tobj);
 	error = uiomove_fromphys(&m, offset, tlen, uio);
 	VM_OBJECT_LOCK(tobj);
@@ -462,124 +463,26 @@ out:
 	vm_page_lock(m);
 	vm_page_unwire(m, TRUE);
 	vm_page_unlock(m);
-	vm_page_wakeup(m);
 	vm_object_pip_subtract(tobj, 1);
 	VM_OBJECT_UNLOCK(tobj);
 
 	return (error);
 }
 
-static __inline int
-tmpfs_nocacheread_buf(vm_object_t tobj, vm_pindex_t idx,
-    vm_offset_t offset, size_t tlen, void *buf)
-{
-	struct uio uio;
-	struct iovec iov;
-
-	uio.uio_iovcnt = 1;
-	uio.uio_iov = &iov;
-	iov.iov_base = buf;
-	iov.iov_len = tlen;
-
-	uio.uio_offset = 0;
-	uio.uio_resid = tlen;
-	uio.uio_rw = UIO_READ;
-	uio.uio_segflg = UIO_SYSSPACE;
-	uio.uio_td = curthread;
-
-	return (tmpfs_nocacheread(tobj, idx, offset, tlen, &uio));
-}
-
-static int
-tmpfs_mappedread(vm_object_t vobj, vm_object_t tobj, size_t len, struct uio *uio)
-{
-	struct sf_buf *sf;
-	vm_pindex_t idx;
-	vm_page_t m;
-	vm_offset_t offset;
-	off_t addr;
-	size_t tlen;
-	char *ma;
-	int error;
-
-	addr = uio->uio_offset;
-	idx = OFF_TO_IDX(addr);
-	offset = addr & PAGE_MASK;
-	tlen = MIN(PAGE_SIZE - offset, len);
-
-	if ((vobj == NULL) ||
-	    (vobj->resident_page_count == 0 && vobj->cache == NULL))
-		goto nocache;
-
-	VM_OBJECT_LOCK(vobj);
-lookupvpg:
-	if (((m = vm_page_lookup(vobj, idx)) != NULL) &&
-	    vm_page_is_valid(m, offset, tlen)) {
-		if ((m->oflags & VPO_BUSY) != 0) {
-			/*
-			 * Reference the page before unlocking and sleeping so
-			 * that the page daemon is less likely to reclaim it.
-			 */
-			vm_page_lock_queues();
-			vm_page_flag_set(m, PG_REFERENCED);
-			vm_page_sleep(m, "tmfsmr");
-			goto lookupvpg;
-		}
-		vm_page_busy(m);
-		VM_OBJECT_UNLOCK(vobj);
-		error = uiomove_fromphys(&m, offset, tlen, uio);
-		VM_OBJECT_LOCK(vobj);
-		vm_page_wakeup(m);
-		VM_OBJECT_UNLOCK(vobj);
-		return (error);
-	} else if (m != NULL && uio->uio_segflg == UIO_NOCOPY) {
-		if ((m->oflags & VPO_BUSY) != 0) {
-			/*
-			 * Reference the page before unlocking and sleeping so
-			 * that the page daemon is less likely to reclaim it.
-			 */
-			vm_page_lock_queues();
-			vm_page_flag_set(m, PG_REFERENCED);
-			vm_page_sleep(m, "tmfsmr");
-			goto lookupvpg;
-		}
-		vm_page_busy(m);
-		VM_OBJECT_UNLOCK(vobj);
-		sched_pin();
-		sf = sf_buf_alloc(m, SFB_CPUPRIVATE);
-		ma = (char *)sf_buf_kva(sf);
-		error = tmpfs_nocacheread_buf(tobj, idx, offset, tlen,
-		    ma + offset);
-		if (error == 0) {
-			uio->uio_offset += tlen;
-			uio->uio_resid -= tlen;
-		}
-		sf_buf_free(sf);
-		sched_unpin();
-		VM_OBJECT_LOCK(vobj);
-		vm_page_wakeup(m);
-		VM_OBJECT_UNLOCK(vobj);
-		return (error);
-	}
-	VM_OBJECT_UNLOCK(vobj);
-nocache:
-	error = tmpfs_nocacheread(tobj, idx, offset, tlen, uio);
-
-	return (error);
-}
-
 static int
 tmpfs_read(struct vop_read_args *v)
 {
 	struct vnode *vp = v->a_vp;
 	struct uio *uio = v->a_uio;
-
 	struct tmpfs_node *node;
 	vm_object_t uobj;
 	size_t len;
 	int resid;
-
 	int error = 0;
+	vm_pindex_t idx;
+	vm_offset_t offset;
+	off_t addr;
+	size_t tlen;
 
 	node = VP_TO_TMPFS_NODE(vp);
 
@@ -603,7 +506,11 @@ tmpfs_read(struct vop_read_args *v)
 		len = MIN(node->tn_size - uio->uio_offset, resid);
 		if (len == 0)
 			break;
-		error = tmpfs_mappedread(vp->v_object, uobj, len, uio);
+		addr = uio->uio_offset;
+		idx = OFF_TO_IDX(addr);
+		offset = addr & PAGE_MASK;
+		tlen = MIN(PAGE_SIZE - offset, len);
+		error = tmpfs_nocacheread(uobj, idx, offset, tlen, uio);
 		if ((error != 0) || (resid == uio->uio_resid))
 			break;
 	}
@@ -616,10 +523,10 @@ out:
 /* --------------------------------------------------------------------- */
 
 static int
-tmpfs_mappedwrite(vm_object_t vobj, vm_object_t tobj, size_t len, struct uio *uio)
+tmpfs_mappedwrite(vm_object_t tobj, size_t len, struct uio *uio)
 {
 	vm_pindex_t idx;
-	vm_page_t vpg, tpg;
+	vm_page_t tpg;
 	vm_offset_t offset;
 	off_t addr;
 	size_t tlen;
@@ -632,37 +539,6 @@ tmpfs_mappedwrite(vm_object_t vobj, vm_object_t tobj, size_t len, struct uio *ui
 	offset = addr & PAGE_MASK;
 	tlen = MIN(PAGE_SIZE - offset, len);
 
-	if ((vobj == NULL) ||
-	    (vobj->resident_page_count == 0 && vobj->cache == NULL)) {
-		vpg = NULL;
-		goto nocache;
-	}
-
-	VM_OBJECT_LOCK(vobj);
-lookupvpg:
-	if (((vpg = vm_page_lookup(vobj, idx)) != NULL) &&
-	    vm_page_is_valid(vpg, offset, tlen)) {
-		if ((vpg->oflags & VPO_BUSY) != 0) {
-			/*
-			 * Reference the page before unlocking and sleeping so
-			 * that the page daemon is less likely to reclaim it.
-			 */
-			vm_page_lock_queues();
-			vm_page_flag_set(vpg, PG_REFERENCED);
-			vm_page_sleep(vpg, "tmfsmw");
-			goto lookupvpg;
-		}
-		vm_page_busy(vpg);
-		vm_page_undirty(vpg);
-		VM_OBJECT_UNLOCK(vobj);
-		error = uiomove_fromphys(&vpg, offset, tlen, uio);
-	} else {
-		if (__predict_false(vobj->cache != NULL))
-			vm_page_cache_free(vobj, idx, idx + 1);
-		VM_OBJECT_UNLOCK(vobj);
-		vpg = NULL;
-	}
-nocache:
 	VM_OBJECT_LOCK(tobj);
 	vm_object_pip_add(tobj, 1);
 	tpg = vm_page_grab(tobj, idx, VM_ALLOC_WIRED |
@@ -671,23 +547,18 @@ nocache:
 		if (vm_pager_has_page(tobj, idx, NULL, NULL)) {
 			error = vm_pager_get_pages(tobj, &tpg, 1, 0);
 			if (error != 0) {
+				vm_page_wakeup(tpg);
 				printf("tmpfs get pages from pager error [write]\n");
 				goto out;
 			}
 		} else
 			vm_page_zero_invalid(tpg, TRUE);
 	}
+	vm_page_wakeup(tpg);
 	VM_OBJECT_UNLOCK(tobj);
-	if (vpg == NULL)
-		error = uiomove_fromphys(&tpg, offset, tlen, uio);
-	else {
-		KASSERT(vpg->valid == VM_PAGE_BITS_ALL, ("parts of vpg invalid"));
-		pmap_copy_page(vpg, tpg);
-	}
+	error = uiomove_fromphys(&tpg, offset, tlen, uio);
 	VM_OBJECT_LOCK(tobj);
 out:
-	if (vobj != NULL)
-		VM_OBJECT_LOCK(vobj);
 	if (error == 0) {
 		KASSERT(tpg->valid == VM_PAGE_BITS_ALL,
 		    ("parts of tpg invalid"));
@@ -696,11 +567,6 @@ out:
 	vm_page_lock(tpg);
 	vm_page_unwire(tpg, TRUE);
 	vm_page_unlock(tpg);
-	vm_page_wakeup(tpg);
-	if (vpg != NULL)
-		vm_page_wakeup(vpg);
-	if (vobj != NULL)
-		VM_OBJECT_UNLOCK(vobj);
 	vm_object_pip_subtract(tobj, 1);
 	VM_OBJECT_UNLOCK(tobj);
 
@@ -759,7 +625,7 @@ tmpfs_write(struct vop_write_args *v)
 		len = MIN(node->tn_size - uio->uio_offset, resid);
 		if (len == 0)
 			break;
-		error = tmpfs_mappedwrite(vp->v_object, uobj, len, uio);
+		error = tmpfs_mappedwrite(uobj, len, uio);
 		if ((error != 0) || (resid == uio->uio_resid))
 			break;
 	}
@@ -1425,7 +1291,7 @@ tmpfs_reclaim(struct vop_reclaim_args *v)
 	node = VP_TO_TMPFS_NODE(vp);
 	tmp = VFS_TO_TMPFS(vp->v_mount);
 
-	vnode_destroy_vobject(vp);
+	vp->v_object = NULL;
 	cache_purge(vp);
 
 	TMPFS_NODE_LOCK(node);
diff --git a/sys/kern/imgact_elf.c b/sys/kern/imgact_elf.c
index c48e0f5..754092f 100644
--- a/sys/kern/imgact_elf.c
+++ b/sys/kern/imgact_elf.c
@@ -447,7 +447,7 @@ __elfN(load_section)(struct vmspace *vmspace,
 	 * While I'm here, might as well check for something else that
 	 * is invalid: filsz cannot be greater than memsz.
 	 */
-	if ((off_t)filsz + offset > object->un_pager.vnp.vnp_size ||
+	if ((off_t)filsz + offset > object->vnp_size ||
 	    filsz > memsz) {
 		uprintf("elf_load_section: truncated ELF file\n");
 		return (ENOEXEC);
diff --git a/sys/kern/uipc_syscalls.c b/sys/kern/uipc_syscalls.c
index adcb852..ee80b3e 100644
--- a/sys/kern/uipc_syscalls.c
+++ b/sys/kern/uipc_syscalls.c
@@ -2033,12 +2033,12 @@ retry_space:
 			 */
 			pgoff = (vm_offset_t)(off & PAGE_MASK);
 			xfsize = omin(PAGE_SIZE - pgoff,
-			    obj->un_pager.vnp.vnp_size - uap->offset -
+			    obj->vnp_size - uap->offset -
 			    fsbytes - loopbytes);
 			if (uap->nbytes)
 				rem = (uap->nbytes - fsbytes - loopbytes);
 			else
-				rem = obj->un_pager.vnp.vnp_size -
+				rem = obj->vnp_size -
 				    uap->offset - fsbytes - loopbytes;
 			xfsize = omin(rem, xfsize);
 			xfsize = omin(space - loopbytes, xfsize);
diff --git a/sys/vm/vm_mmap.c b/sys/vm/vm_mmap.c
index 3d72123..ff06892 100644
--- a/sys/vm/vm_mmap.c
+++ b/sys/vm/vm_mmap.c
@@ -1222,7 +1222,7 @@ vm_mmap_vnode(struct thread *td, vm_size_t objsize,
 			error = EINVAL;
 			goto done;
 		}
-		if (obj->handle != vp) {
+		if (obj->type == OBJT_VNODE && obj->handle != vp) {
 			vput(vp);
 			vp = (struct vnode*)obj->handle;
 			vget(vp, LK_SHARED, td);
@@ -1261,7 +1261,14 @@ vm_mmap_vnode(struct thread *td, vm_size_t objsize,
 	objsize = round_page(va.va_size);
 	if (va.va_nlink == 0)
 		flags |= MAP_NOSYNC;
-	obj = vm_pager_allocate(OBJT_VNODE, vp, objsize, prot, foff, td->td_ucred);
+	if (obj->type == OBJT_VNODE)
+		obj = vm_pager_allocate(OBJT_VNODE, vp, objsize, prot, foff,
+		    td->td_ucred);
+	else {
+		KASSERT(obj->type == OBJT_DEFAULT || obj->type == OBJT_SWAP,
+		    ("wrong object type"));
+		vm_object_reference(obj);
+	}
 	if (obj == NULL) {
 		error = ENOMEM;
 		goto done;
diff --git a/sys/vm/vm_object.h b/sys/vm/vm_object.h
index 6a9f129..0120d32 100644
--- a/sys/vm/vm_object.h
+++ b/sys/vm/vm_object.h
@@ -106,15 +106,6 @@ struct vm_object {
 	void *handle;
 	union {
 		/*
-		 * VNode pager
-		 *
-		 *	vnp_size - current size of file
-		 */
-		struct {
-			off_t vnp_size;
-		} vnp;
-
-		/*
 		 * Device pager
 		 *
 		 *	devp_pglist - list of allocated pages
@@ -145,6 +136,7 @@ struct vm_object {
 	} un_pager;
 	struct uidinfo *uip;
 	vm_ooffset_t charge;
+	off_t vnp_size;		/* current size of file for vnode pager */
 };
 
 /*
diff --git a/sys/vm/vnode_pager.c b/sys/vm/vnode_pager.c
index f497d41..a1cfc01 100644
--- a/sys/vm/vnode_pager.c
+++ b/sys/vm/vnode_pager.c
@@ -212,8 +212,7 @@ retry:
 		msleep(object, VM_OBJECT_MTX(object), PDROP | PVM, "vadead", 0);
 	}
 
-	if (vp->v_usecount == 0)
-		panic("vnode_pager_alloc: no vnode reference");
+	KASSERT(vp->v_usecount != 0, ("vnode_pager_alloc: no vnode reference"));
 
 	if (object == NULL) {
 		/*
@@ -221,7 +220,7 @@ retry:
 		 */
 		object = vm_object_allocate(OBJT_VNODE, OFF_TO_IDX(round_page(size)));
 
-		object->un_pager.vnp.vnp_size = size;
+		object->vnp_size = size;
 
 		object->handle = handle;
 		VI_LOCK(vp);
@@ -301,7 +300,7 @@ vnode_pager_haspage(object, pindex, before, after)
 	 * If the offset is beyond end of file we do
 	 * not have the page.
 	 */
-	if (IDX_TO_OFF(pindex) >= object->un_pager.vnp.vnp_size)
+	if (IDX_TO_OFF(pindex) >= object->vnp_size)
 		return FALSE;
 
 	bsize = vp->v_mount->mnt_stat.f_iosize;
@@ -333,9 +332,8 @@ vnode_pager_haspage(object, pindex, before, after)
 			*after *= pagesperblock;
 			numafter = pagesperblock - (poff + 1);
 			if (IDX_TO_OFF(pindex + numafter) >
-			    object->un_pager.vnp.vnp_size) {
-				numafter =
-				    OFF_TO_IDX(object->un_pager.vnp.vnp_size) -
+			    object->vnp_size) {
+				numafter = OFF_TO_IDX(object->vnp_size) -
 				    pindex;
 			}
 			*after += numafter;
@@ -369,11 +367,11 @@ vnode_pager_setsize(vp, nsize)
 	vm_page_t m;
 	vm_pindex_t nobjsize;
 
-	if ((object = vp->v_object) == NULL)
+	if ((object = vp->v_object) == NULL || object->type != OBJT_VNODE)
 		return;
 /* 	ASSERT_VOP_ELOCKED(vp, "vnode_pager_setsize and not locked vnode"); */
 	VM_OBJECT_LOCK(object);
-	if (nsize == object->un_pager.vnp.vnp_size) {
+	if (nsize == object->vnp_size) {
 		/*
 		 * Hasn't changed size
 		 */
@@ -381,7 +379,7 @@ vnode_pager_setsize(vp, nsize)
 		return;
 	}
 	nobjsize = OFF_TO_IDX(nsize + PAGE_MASK);
-	if (nsize < object->un_pager.vnp.vnp_size) {
+	if (nsize < object->vnp_size) {
 		/*
 		 * File has shrunk. Toss any cached pages beyond the new EOF.
*/ @@ -436,7 +434,7 @@ vnode_pager_setsize(vp, nsize) nobjsize); } } - object->un_pager.vnp.vnp_size =3D nsize; + object->vnp_size =3D nsize; object->size =3D nobjsize; VM_OBJECT_UNLOCK(object); } @@ -513,7 +511,7 @@ vnode_pager_input_smlfs(object, m) continue; =20 address =3D IDX_TO_OFF(m->pindex) + i * bsize; - if (address >=3D object->un_pager.vnp.vnp_size) { + if (address >=3D object->vnp_size) { fileaddr =3D -1; } else { error =3D vnode_pager_addr(vp, address, &fileaddr, NULL); @@ -590,12 +588,12 @@ vnode_pager_input_old(object, m) /* * Return failure if beyond current EOF */ - if (IDX_TO_OFF(m->pindex) >=3D object->un_pager.vnp.vnp_size) { + if (IDX_TO_OFF(m->pindex) >=3D object->vnp_size) { return VM_PAGER_BAD; } else { size =3D PAGE_SIZE; - if (IDX_TO_OFF(m->pindex) + size > object->un_pager.vnp.vnp_size) - size =3D object->un_pager.vnp.vnp_size - IDX_TO_OFF(m->pindex); + if (IDX_TO_OFF(m->pindex) + size > object->vnp_size) + size =3D object->vnp_size - IDX_TO_OFF(m->pindex); vp =3D object->handle; VM_OBJECT_UNLOCK(object); =20 @@ -815,13 +813,13 @@ vnode_pager_generic_getpages(vp, m, bytecount, reqpag= e) } if (firstaddr =3D=3D -1) { VM_OBJECT_LOCK(object); - if (i =3D=3D reqpage && foff < object->un_pager.vnp.vnp_size) { + if (i =3D=3D reqpage && foff < object->vnp_size) { panic("vnode_pager_getpages: unexpected missing page: firstaddr: %jd, = foff: 0x%jx%08jx, vnp_size: 0x%jx%08jx", (intmax_t)firstaddr, (uintmax_t)(foff >> 32), (uintmax_t)foff, (uintmax_t) - (object->un_pager.vnp.vnp_size >> 32), - (uintmax_t)object->un_pager.vnp.vnp_size); + (object->vnp_size >> 32), + (uintmax_t)object->vnp_size); } vm_page_lock(m[i]); vm_page_free(m[i]); @@ -876,8 +874,8 @@ vnode_pager_generic_getpages(vp, m, bytecount, reqpage) */ size =3D count * PAGE_SIZE; KASSERT(count > 0, ("zero count")); - if ((foff + size) > object->un_pager.vnp.vnp_size) - size =3D object->un_pager.vnp.vnp_size - foff; + if ((foff + size) > object->vnp_size) + size =3D object->vnp_size - 
foff; KASSERT(size > 0, ("zero size")); =20 /* @@ -944,7 +942,7 @@ vnode_pager_generic_getpages(vp, m, bytecount, reqpage) nextoff =3D tfoff + PAGE_SIZE; mt =3D m[i]; =20 - if (nextoff <=3D object->un_pager.vnp.vnp_size) { + if (nextoff <=3D object->vnp_size) { /* * Read filled up entire page. */ @@ -964,9 +962,9 @@ vnode_pager_generic_getpages(vp, m, bytecount, reqpage) * read. */ vm_page_set_valid(mt, 0, - object->un_pager.vnp.vnp_size - tfoff); + object->vnp_size - tfoff); KASSERT((mt->dirty & vm_page_bits(0, - object->un_pager.vnp.vnp_size - tfoff)) =3D=3D 0, + object->vnp_size - tfoff)) =3D=3D 0, ("vnode_pager_generic_getpages: page %p is dirty", mt)); } @@ -1116,11 +1114,11 @@ vnode_pager_generic_putpages(struct vnode *vp, vm_p= age_t *ma, int bytecount, * this will screw up bogus page replacement. */ VM_OBJECT_LOCK(object); - if (maxsize + poffset > object->un_pager.vnp.vnp_size) { - if (object->un_pager.vnp.vnp_size > poffset) { + if (maxsize + poffset > object->vnp_size) { + if (object->vnp_size > poffset) { int pgoff; =20 - maxsize =3D object->un_pager.vnp.vnp_size - poffset; + maxsize =3D object->vnp_size - poffset; ncount =3D btoc(maxsize); if ((pgoff =3D (int)maxsize & PAGE_MASK) !=3D 0) { /* --zd5GkkQQtETumrwc Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (FreeBSD) iEYEARECAAYFAkwfYfEACgkQC3+MBN1Mb4hOPACg3aznl4eTBeO3QOCKEFZsRsSO 5kwAniuYlwXBbxmU8wHXsLsweO8LTGwV =8WDX -----END PGP SIGNATURE----- --zd5GkkQQtETumrwc-- From owner-freebsd-fs@FreeBSD.ORG Mon Jun 21 15:43:46 2010 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 38C881065670; Mon, 21 Jun 2010 15:43:46 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id 0B22B8FC20; Mon, 21 Jun 2010 15:43:46 +0000 (UTC) Received: 
from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id B2E5946B4C; Mon, 21 Jun 2010 11:43:45 -0400 (EDT) Received: from jhbbsd.localnet (smtp.hudson-trading.com [209.249.190.9]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id A4C4F8A04E; Mon, 21 Jun 2010 11:43:44 -0400 (EDT) From: John Baldwin To: freebsd-fs@freebsd.org Date: Mon, 21 Jun 2010 10:30:55 -0400 User-Agent: KMail/1.12.1 (FreeBSD/7.3-CBSD-20100217; KDE/4.3.1; amd64; ; ) References: <20100621125825.GG13238@deviant.kiev.zoral.com.ua> In-Reply-To: <20100621125825.GG13238@deviant.kiev.zoral.com.ua> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit Message-Id: <201006211030.55327.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1 (bigwig.baldwin.cx); Mon, 21 Jun 2010 11:43:44 -0400 (EDT) X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-2.5 required=4.2 tests=AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx Cc: alc@freebsd.org, fs@freebsd.org, pho@freebsd.org Subject: Re: Tmpfs elimination of double-copy X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jun 2010 15:43:46 -0000 On Monday 21 June 2010 8:58:25 am Kostik Belousov wrote: > Hi, > Below is the patch that eliminates second copy of the data kept by tmpfs > in case a file is mapped. Also, it removes potential deadlocks due to > tmpfs doing copyin/out while page is busy. It is possible that patch > also fixes known issue with sendfile(2) of tmpfs file, but I did not > verified this. 
> > Patch essentially consists of three parts: > - move of vm_object' vnp_size from the type-discriminated union to the > vm_object proper; > - making vm not choke when vm object held in the struct vnode' v_object > is default or swap object instead of vnode object; > - use of the swap object that keeps data for tmpfs VREG file, also as > v_object. > > Peter Holm helped me with the patch, apparently we survive fsx and stress2. Why did you have to move vnp_size out of the union? Is tmpfs using a non- OBJT_VNODE object to hold file data?
-- John Baldwin From owner-freebsd-fs@FreeBSD.ORG Mon Jun 21 17:47:32 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5FDA1106564A for ; Mon, 21 Jun 2010 17:47:32 +0000 (UTC) (envelope-from bruce@cran.org.uk) Received: from muon.cran.org.uk (muon.cran.org.uk [204.109.60.94]) by mx1.freebsd.org (Postfix) with ESMTP id CD7A98FC13 for ; Mon, 21 Jun 2010 17:47:31 +0000 (UTC) Received: from core.draftnet (87-194-158-129.bethere.co.uk [87.194.158.129]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by muon.cran.org.uk (Postfix) with ESMTPSA id 45F7C5D19 for ; Mon, 21 Jun 2010 17:47:30 +0000 (UTC) From: Bruce Cran To: freebsd-fs@freebsd.org Date: Mon, 21 Jun 2010 18:47:27 +0100 User-Agent: KMail/1.13.3 (FreeBSD/9.0-CURRENT; KDE/4.4.4; amd64; ; ) MIME-Version: 1.0 Content-Type: Multipart/Mixed; boundary="Boundary-00=_vW6HMz2ItXB/wwO" Message-Id: <201006211847.27788.bruce@cran.org.uk> Subject: Patch to fix reported mountpoint when mounting dirty filesystem X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jun 2010 17:47:32 -0000 --Boundary-00=_vW6HMz2ItXB/wwO Content-Type: Text/Plain; charset="us-ascii" Content-Transfer-Encoding: 7bit I've been investigating PR bin/19683 which is about the fact that when a R/W mount is denied due to the filesystem being dirty, the reported mountpoint is that which has been recorded in the superblock, not the directory that it's currently being mounted on. I've attached a potential patch but I'm not sure if f_mntonname is valid at the first print statement. 
Also, I tried to replicate the problem with ext2fs, but despite the code that prints the mountpoint being present, that code was never hit, so I'm not sure whether it needs updating too.

-- 
Bruce Cran

--Boundary-00=_vW6HMz2ItXB/wwO
Content-Type: text/x-patch; charset="us-ascii"; name="ffs_vfsopc.c.diff"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: attachment; filename="ffs_vfsopc.c.diff"

Index: sys/ufs/ffs/ffs_vfsops.c
===================================================================
--- sys/ufs/ffs/ffs_vfsops.c	(revision 209306)
+++ sys/ufs/ffs/ffs_vfsops.c	(working copy)
@@ -315,7 +315,7 @@
 			} else {
 				printf(
 "WARNING: R/W mount of %s denied. Filesystem is not clean - run fsck\n",
-				    fs->fs_fsmnt);
+				    mp->mnt_stat.f_mntonname);
 				if (fs->fs_flags & FS_SUJ)
 					printf(
 "WARNING: Forced mount will invalidated journal contents\n");
@@ -726,7 +726,7 @@
 		} else {
 			printf(
 "WARNING: R/W mount of %s denied. Filesystem is not clean - run fsck\n",
-			    fs->fs_fsmnt);
+			    mp->mnt_stat.f_mntonname);
 			if (fs->fs_flags & FS_SUJ)
 				printf(
 "WARNING: Forced mount will invalidated journal contents\n");

--Boundary-00=_vW6HMz2ItXB/wwO--

From owner-freebsd-fs@FreeBSD.ORG Mon Jun 21 18:49:34 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 26754106564A; Mon, 21 Jun 2010 18:49:34 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id ADEF98FC19; Mon, 21 Jun 2010 18:49:33 +0000 (UTC) Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id o5LInTDh063677 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 21 Jun 2010 21:49:29 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4) with ESMTP id o5LInSwo019822; Mon, 21 Jun 2010 21:49:28 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4/Submit) id o5LInSfP019821; Mon, 21 Jun 2010 21:49:28 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Mon, 21 Jun 2010 21:49:28 +0300 From: Kostik Belousov To: John Baldwin Message-ID: <20100621184928.GI13238@deviant.kiev.zoral.com.ua> References: <20100621125825.GG13238@deviant.kiev.zoral.com.ua> <201006211030.55327.jhb@freebsd.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="0RrwLCq8kVvUIgaI" Content-Disposition: inline In-Reply-To: <201006211030.55327.jhb@freebsd.org> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned:
clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-2.3 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_50, DNS_FROM_OPENWHOIS autolearn=no version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: freebsd-fs@freebsd.org, alc@freebsd.org, fs@freebsd.org, pho@freebsd.org Subject: Re: Tmpfs elimination of double-copy X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jun 2010 18:49:34 -0000

--0RrwLCq8kVvUIgaI
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Mon, Jun 21, 2010 at 10:30:55AM -0400, John Baldwin wrote:
> On Monday 21 June 2010 8:58:25 am Kostik Belousov wrote:
> > Hi,
> > Below is the patch that eliminates second copy of the data kept by tmpfs
> > in case a file is mapped. Also, it removes potential deadlocks due to
> > tmpfs doing copyin/out while page is busy. It is possible that patch
> > also fixes known issue with sendfile(2) of tmpfs file, but I did not
> > verified this.
> >
> > Patch essentially consists of three parts:
> > - move of vm_object' vnp_size from the type-discriminated union to the
> >   vm_object proper;
> > - making vm not choke when vm object held in the struct vnode' v_object
> >   is default or swap object instead of vnode object;
> > - use of the swap object that keeps data for tmpfs VREG file, also as
> >   v_object.
> >
> > Peter Holm helped me with the patch, apparently we survive fsx and stress2.
>
> Why did you have to move vnp_size out of the union? Is tmpfs using a non-
> OBJT_VNODE object to hold file data?

Tmpfs uses OBJT_SWAP object to keep the data pages for the files.
Current code allocates another object of type OBJT_VNODE, assigned to vp->v_object, to satisfy VM interface for mapping the file, using vnode_create_vobject. The objects do not share the pages (I do not think this can be easily achieved without serious changes to VM). Thus most, if not all, the data is present in two sets of pages. When such file is written to, tmpfs copies user buffer both to the swap object, and to the v_object. Patch I posted assigns the swap object to the vp->v_object. I had to make small change to vm_mmap_vnode() to not allocate the vnode pager and to not increment vnode use counter when v_object is the swap object. vnp_size has to be provided on the object layer because our swap object is used to e.g. mmap the executables from tmpfs, and image activation code relies on vnp_size instead of slower VOP_GETATTR(). I think this route is easier then converting all vnp_size users to VOP_GETATTR for only tmpfs benefit.
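
A minimal user-space sketch of the double copy described above. It is an illustration under stated assumptions, not the tmpfs or VM code from the patch: `model_object`, `model_file`, `model_copy` and `model_write` are hypothetical names, and the 16-byte "page" is arbitrary. It only shows that a write costs two copies while the data object and v_object are distinct, and one copy once they are the same object.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 16	/* tiny page size, for illustration only */

/* Hypothetical stand-in for a VM object: a buffer of "pages". */
struct model_object {
	char pages[4 * MODEL_PAGE_SIZE];
	size_t bytes_copied;		/* total bytes copied into this object */
};

/* Hypothetical stand-in for a tmpfs regular file. */
struct model_file {
	struct model_object *swap_obj;	/* object holding the file data */
	struct model_object *v_object;	/* object used when the file is mapped */
};

/* Copy len bytes from buf into obj at offset off. */
static void
model_copy(struct model_object *obj, size_t off, const char *buf, size_t len)
{
	assert(off + len <= sizeof(obj->pages));
	memcpy(obj->pages + off, buf, len);
	obj->bytes_copied += len;
}

/*
 * Write into the file and return how many copies of the user buffer
 * were made.  A second copy is needed only while v_object is a
 * different object from the one that holds the data.
 */
static size_t
model_write(struct model_file *f, size_t off, const char *buf, size_t len)
{
	size_t copies = 1;

	model_copy(f->swap_obj, off, buf, len);
	if (f->v_object != NULL && f->v_object != f->swap_obj) {
		model_copy(f->v_object, off, buf, len);
		copies++;
	}
	return (copies);
}
```

With two distinct objects `model_write()` returns 2; when both pointers name the same object it returns 1, which models the situation created by assigning the swap object to vp->v_object.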
--0RrwLCq8kVvUIgaI Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (FreeBSD) iEYEARECAAYFAkwftDgACgkQC3+MBN1Mb4j9DQCfcBNU1ETDK/wVNHyB9huR43We eQcAn3D08nw+Np4HuCU4/sRdF1Na56oA =eSjg -----END PGP SIGNATURE----- --0RrwLCq8kVvUIgaI-- From owner-freebsd-fs@FreeBSD.ORG Mon Jun 21 20:15:55 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 42945106564A; Mon, 21 Jun 2010 20:15:55 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42]) by mx1.freebsd.org (Postfix) with ESMTP id 12FA98FC14; Mon, 21 Jun 2010 20:15:55 +0000 (UTC) Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69]) by cyrus.watson.org (Postfix) with ESMTPSA id BD77046B29; Mon, 21 Jun 2010 16:15:54 -0400 (EDT) Received: from jhbbsd.localnet (smtp.hudson-trading.com [209.249.190.9]) by bigwig.baldwin.cx (Postfix) with ESMTPSA id 5710F8A04E; Mon, 21 Jun 2010 16:15:53 -0400 (EDT) From: John Baldwin To: Kostik Belousov Date: Mon, 21 Jun 2010 16:15:22 -0400 User-Agent: KMail/1.12.1 (FreeBSD/7.3-CBSD-20100217; KDE/4.3.1; amd64; ; ) References: <20100621125825.GG13238@deviant.kiev.zoral.com.ua> <201006211030.55327.jhb@freebsd.org> <20100621184928.GI13238@deviant.kiev.zoral.com.ua> In-Reply-To: <20100621184928.GI13238@deviant.kiev.zoral.com.ua> MIME-Version: 1.0 Content-Type: Text/Plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit Message-Id: <201006211615.22758.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1 (bigwig.baldwin.cx); Mon, 21 Jun 2010 16:15:53 -0400 (EDT) X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-2.5 required=4.2 tests=AWL,BAYES_00 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) 
on bigwig.baldwin.cx Cc: freebsd-fs@freebsd.org, alc@freebsd.org, fs@freebsd.org, pho@freebsd.org Subject: Re: Tmpfs elimination of double-copy X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 21 Jun 2010 20:15:55 -0000 On Monday 21 June 2010 2:49:28 pm Kostik Belousov wrote: > On Mon, Jun 21, 2010 at 10:30:55AM -0400, John Baldwin wrote: > > On Monday 21 June 2010 8:58:25 am Kostik Belousov wrote: > > > Hi, > > > Below is the patch that eliminates second copy of the data kept by tmpfs > > > in case a file is mapped. Also, it removes potential deadlocks due to > > > tmpfs doing copyin/out while page is busy. It is possible that patch > > > also fixes known issue with sendfile(2) of tmpfs file, but I did not > > > verified this. > > > > > > Patch essentially consists of three parts: > > > - move of vm_object' vnp_size from the type-discriminated union to the > > > vm_object proper; > > > - making vm not choke when vm object held in the struct vnode' v_object > > > is default or swap object instead of vnode object; > > > - use of the swap object that keeps data for tmpfs VREG file, also as > > > v_object. > > > > > > Peter Holm helped me with the patch, apparently we survive fsx and stress2. > > > > Why did you have to move vnp_size out of the union? Is tmpfs using a non- > > OBJT_VNODE object to hold file data? > Tmpfs uses OBJT_SWAP object to keep the data pages for the files. > Current code allocates another object of type OBJT_VNODE, assigned > to vp->v_object, to satisfy VM interface for mapping the file, using > vnode_create_vobject. The objects do not share the pages (I do not think > this can be easily achieved without serious changes to VM). Thus most, > if not all, the data is present in two sets of pages. 
>
> When such file is written to, tmpfs copies user buffer both to the swap
> object, and to the v_object.
>
> Patch I posted assigns the swap object to the vp->v_object. I had to
> make small change to vm_mmap_vnode() to not allocate the vnode pager
> and to not increment vnode use counter when v_object is the swap
> object.
>
> vnp_size has to be provided on the object layer because our swap
> object is used to e.g. mmap the executables from tmpfs, and image
> activation code relies on vnp_size instead of slower VOP_GETATTR().
> I think this route is easier then converting all vnp_size users to
> VOP_GETATTR for only tmpfs benefit.

Ok, thanks for the expanded explanation. :) It seems a shame to have to move
vnp_size out of the pager-specific data. Maybe add a comment in vm_object.h
to say that vnp_size is used by multiple object types which is why it can't be
vnode-specific anymore?

-- 
John Baldwin

From owner-freebsd-fs@FreeBSD.ORG Mon Jun 21 20:15:55 2010
Return-Path:
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 42945106564A;
	Mon, 21 Jun 2010 20:15:55 +0000 (UTC)
	(envelope-from jhb@freebsd.org)
Received: from cyrus.watson.org (cyrus.watson.org [65.122.17.42])
	by mx1.freebsd.org (Postfix) with ESMTP id 12FA98FC14;
	Mon, 21 Jun 2010 20:15:55 +0000 (UTC)
Received: from bigwig.baldwin.cx (66.111.2.69.static.nyinternet.net [66.111.2.69])
	by cyrus.watson.org (Postfix) with ESMTPSA id BD77046B29;
	Mon, 21 Jun 2010 16:15:54 -0400 (EDT)
Received: from jhbbsd.localnet (smtp.hudson-trading.com [209.249.190.9])
	by bigwig.baldwin.cx (Postfix) with ESMTPSA id 5710F8A04E;
	Mon, 21 Jun 2010 16:15:53 -0400 (EDT)
From: John Baldwin
To: Kostik Belousov
Date: Mon, 21 Jun 2010 16:15:22 -0400
User-Agent: KMail/1.12.1 (FreeBSD/7.3-CBSD-20100217; KDE/4.3.1; amd64; ; )
References: <20100621125825.GG13238@deviant.kiev.zoral.com.ua>
	<201006211030.55327.jhb@freebsd.org>
	<20100621184928.GI13238@deviant.kiev.zoral.com.ua>
In-Reply-To: <20100621184928.GI13238@deviant.kiev.zoral.com.ua>
MIME-Version: 1.0
Content-Type: Text/Plain; charset="iso-8859-15"
Content-Transfer-Encoding: 7bit
Message-Id: <201006211615.22758.jhb@freebsd.org>
X-Greylist: Sender succeeded SMTP AUTH, not delayed by milter-greylist-4.0.1
	(bigwig.baldwin.cx); Mon, 21 Jun 2010 16:15:53 -0400 (EDT)
X-Virus-Scanned: clamav-milter 0.95.1 at bigwig.baldwin.cx
X-Virus-Status: Clean
X-Spam-Status: No, score=-2.5 required=4.2 tests=AWL,BAYES_00 autolearn=ham
	version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on bigwig.baldwin.cx
Cc: freebsd-fs@freebsd.org, alc@freebsd.org, fs@freebsd.org, pho@freebsd.org
Subject: Re: Tmpfs elimination of double-copy
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 21 Jun 2010 20:15:55 -0000

On Monday 21 June 2010 2:49:28 pm Kostik Belousov wrote:
> On Mon, Jun 21, 2010 at 10:30:55AM -0400, John Baldwin wrote:
> > On Monday 21 June 2010 8:58:25 am Kostik Belousov wrote:
> > > Hi,
> > > Below is the patch that eliminates second copy of the data kept by tmpfs
> > > in case a file is mapped. Also, it removes potential deadlocks due to
> > > tmpfs doing copyin/out while page is busy. It is possible that patch
> > > also fixes known issue with sendfile(2) of tmpfs file, but I did not
> > > verified this.
> > >
> > > Patch essentially consists of three parts:
> > > - move of vm_object' vnp_size from the type-discriminated union to the
> > > vm_object proper;
> > > - making vm not choke when vm object held in the struct vnode' v_object
> > > is default or swap object instead of vnode object;
> > > - use of the swap object that keeps data for tmpfs VREG file, also as
> > > v_object.
> > >
> > > Peter Holm helped me with the patch, apparently we survive fsx and stress2.
> >
> > Why did you have to move vnp_size out of the union? Is tmpfs using a non-
> > OBJT_VNODE object to hold file data?
> Tmpfs uses OBJT_SWAP object to keep the data pages for the files.
> Current code allocates another object of type OBJT_VNODE, assigned
> to vp->v_object, to satisfy VM interface for mapping the file, using
> vnode_create_vobject. The objects do not share the pages (I do not think
> this can be easily achieved without serious changes to VM). Thus most,
> if not all, the data is present in two sets of pages.
>
> When such file is written to, tmpfs copies user buffer both to the swap
> object, and to the v_object.
>
> Patch I posted assigns the swap object to the vp->v_object. I had to
> make small change to vm_mmap_vnode() to not allocate the vnode pager
> and to not increment vnode use counter when v_object is the swap
> object.
>
> vnp_size has to be provided on the object layer because our swap
> object is used to e.g. mmap the executables from tmpfs, and image
> activation code relies on vnp_size instead of slower VOP_GETATTR().
> I think this route is easier then converting all vnp_size users to
> VOP_GETATTR for only tmpfs benefit.

Ok, thanks for the expanded explanation. :) It seems a shame to have to move
vnp_size out of the pager-specific data. Maybe add a comment in vm_object.h
to say that vnp_size is used by multiple object types which is why it can't be
vnode-specific anymore?
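
For readers following the thread, the layout change under discussion can be sketched roughly as below. This is an illustration only, not the code from the posted patch: every demo_ name is a hypothetical stand-in, and the real struct vm_object has many more members. It shows a size field that lives in the vnode-only arm of a type-discriminated union, and the same field hoisted into the structure proper so that it can be read whatever the object type is.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>	/* off_t */

/* Hypothetical object types, named after the ones discussed in the thread. */
enum demo_objtype { DEMO_OBJT_DEFAULT, DEMO_OBJT_SWAP, DEMO_OBJT_VNODE };

/* Before: the size sits in the vnode-only arm of a discriminated union. */
struct demo_object_before {
	enum demo_objtype type;
	union {
		struct { off_t vnp_size; } vnp;	/* meaningful for DEMO_OBJT_VNODE only */
		struct { int swp_bcount; } swp;	/* meaningful for DEMO_OBJT_SWAP only */
	} un_pager;
};

/* After: the size is a plain member, valid for every object type. */
struct demo_object_after {
	enum demo_objtype type;
	off_t vnp_size;
	union {
		struct { int swp_bcount; } swp;
	} un_pager;
};

/* A consumer can now read the size without first checking the type. */
static off_t
demo_file_size(const struct demo_object_after *obj)
{
	assert(obj != NULL);
	return (obj->vnp_size);
}
```

With the "before" layout, a swap-backed object has no valid vnp arm to read, so a caller that only holds v_object would have to fall back to something like VOP_GETATTR(); with the "after" layout demo_file_size() works for a DEMO_OBJT_SWAP object as well.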

-- 
John Baldwin

From owner-freebsd-fs@FreeBSD.ORG Mon Jun 21 20:49:49 2010
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 217D0106564A;
	Mon, 21 Jun 2010 20:49:49 +0000 (UTC)
	(envelope-from kostikbel@gmail.com)
Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200])
	by mx1.freebsd.org (Postfix) with ESMTP id AB7638FC15;
	Mon, 21 Jun 2010 20:49:48 +0000 (UTC)
Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148])
	by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id o5LKnjqP072823
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Mon, 21 Jun 2010 23:49:45 +0300 (EEST)
	(envelope-from kostikbel@gmail.com)
Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1])
	by deviant.kiev.zoral.com.ua (8.14.4/8.14.4) with ESMTP id o5LKninf020507;
	Mon, 21 Jun 2010 23:49:44 +0300 (EEST)
	(envelope-from kostikbel@gmail.com)
Received: (from kostik@localhost)
	by deviant.kiev.zoral.com.ua (8.14.4/8.14.4/Submit) id o5LKniIF020506;
	Mon, 21 Jun 2010 23:49:44 +0300 (EEST)
	(envelope-from kostikbel@gmail.com)
X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to
	kostikbel@gmail.com using -f
Date: Mon, 21 Jun 2010 23:49:44 +0300
From: Kostik Belousov
To: John Baldwin
Message-ID: <20100621204944.GM13238@deviant.kiev.zoral.com.ua>
References: <20100621125825.GG13238@deviant.kiev.zoral.com.ua>
	<201006211030.55327.jhb@freebsd.org>
	<20100621184928.GI13238@deviant.kiev.zoral.com.ua>
	<201006211615.22758.jhb@freebsd.org>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="3pZv5aVBMTAsT+kh"
Content-Disposition: inline
In-Reply-To: <201006211615.22758.jhb@freebsd.org>
User-Agent: Mutt/1.4.2.3i
X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua
X-Virus-Status: Clean
X-Spam-Status: No, score=-2.3 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_50,
	DNS_FROM_OPENWHOIS autolearn=no version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on
	skuns.kiev.zoral.com.ua
Cc: freebsd-fs@freebsd.org, alc@freebsd.org, fs@freebsd.org, pho@freebsd.org
Subject: Re: Tmpfs elimination of double-copy
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 21 Jun 2010 20:49:49 -0000

--3pZv5aVBMTAsT+kh
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Mon, Jun 21, 2010 at 04:15:22PM -0400, John Baldwin wrote:
> On Monday 21 June 2010 2:49:28 pm Kostik Belousov wrote:
> > On Mon, Jun 21, 2010 at 10:30:55AM -0400, John Baldwin wrote:
> > > On Monday 21 June 2010 8:58:25 am Kostik Belousov wrote:
> > > > Hi,
> > > > Below is the patch that eliminates second copy of the data kept by tmpfs
> > > > in case a file is mapped. Also, it removes potential deadlocks due to
> > > > tmpfs doing copyin/out while page is busy. It is possible that patch
> > > > also fixes known issue with sendfile(2) of tmpfs file, but I did not
> > > > verified this.
> > > >
> > > > Patch essentially consists of three parts:
> > > > - move of vm_object' vnp_size from the type-discriminated union to the
> > > > vm_object proper;
> > > > - making vm not choke when vm object held in the struct vnode' v_object
> > > > is default or swap object instead of vnode object;
> > > > - use of the swap object that keeps data for tmpfs VREG file, also as
> > > > v_object.
> > > >
> > > > Peter Holm helped me with the patch, apparently we survive fsx and stress2.
> > >
> > > Why did you have to move vnp_size out of the union? Is tmpfs using a non-
> > > OBJT_VNODE object to hold file data?
> > Tmpfs uses OBJT_SWAP object to keep the data pages for the files.
> > Current code allocates another object of type OBJT_VNODE, assigned
> > to vp->v_object, to satisfy VM interface for mapping the file, using
> > vnode_create_vobject. The objects do not share the pages (I do not think
> > this can be easily achieved without serious changes to VM). Thus most,
> > if not all, the data is present in two sets of pages.
> >
> > When such file is written to, tmpfs copies user buffer both to the swap
> > object, and to the v_object.
> >
> > Patch I posted assigns the swap object to the vp->v_object. I had to
> > make small change to vm_mmap_vnode() to not allocate the vnode pager
> > and to not increment vnode use counter when v_object is the swap
> > object.
> >
> > vnp_size has to be provided on the object layer because our swap
> > object is used to e.g. mmap the executables from tmpfs, and image
> > activation code relies on vnp_size instead of slower VOP_GETATTR().
> > I think this route is easier then converting all vnp_size users to
> > VOP_GETATTR for only tmpfs benefit.
>
> Ok, thanks for the expanded explanation. :) It seems a shame to have
> to move vnp_size out of the pager-specific data. Maybe add a comment
> in vm_object.h to say that vnp_size is used by multiple object types
> which is why it can't be vnode-specific anymore?

Thanks for your note. I put the following comment into the vm_object
declaration.

	/*
	 * Current size of file for vnode pager.
	 *
	 * The tmpfs uses OBJT_SWAP object for vnode v_object. To
	 * satisfy vm_object consumers that use vnp_size for v_object,
	 * tmpfs maintains vnp_size, and it has to be put outside the
	 * un_pager union.
	 */
	off_t vnp_size;

--3pZv5aVBMTAsT+kh
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (FreeBSD)

iEYEARECAAYFAkwf0GgACgkQC3+MBN1Mb4iLfQCg1Zp+BEC4nMYEAGSqYexLJM64
p1YAn2Dypx97C2fbOc0wGy4j6kVIbX1a
=8S90
-----END PGP SIGNATURE-----

--3pZv5aVBMTAsT+kh--

From owner-freebsd-fs@FreeBSD.ORG Tue Jun 22 07:13:48 2010
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 44A47106566C;
	Tue, 22 Jun 2010 07:13:48 +0000 (UTC)
	(envelope-from alexander@leidinger.net)
Received: from mail.ebusiness-leidinger.de (mail.ebusiness-leidinger.de [217.11.53.44])
	by mx1.freebsd.org (Postfix) with ESMTP id 944FC8FC15;
	Tue, 22 Jun 2010 07:13:47 +0000 (UTC)
Received: from outgoing.leidinger.net (pD9E2C147.dip.t-dialin.net [217.226.193.71])
	by mail.ebusiness-leidinger.de (Postfix) with ESMTPSA id EA8FA844042;
	Tue, 22 Jun 2010 09:13:43 +0200 (CEST)
Received: from webmail.leidinger.net (webmail.leidinger.net [192.168.1.102])
	by outgoing.leidinger.net (Postfix) with ESMTP id E8B845464;
	Tue, 22 Jun 2010 09:13:40 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=Leidinger.net;
	s=outgoing-alex; t=1277190821;
	bh=bNKRe9dI9JKK80Xmba4JnO2WvY75uBPWEsqe2cmzpso=;
	h=Message-ID:Date:From:To:Cc:Subject:References:In-Reply-To:
	 MIME-Version:Content-Type:Content-Transfer-Encoding;
	b=iMZkwdoCZQh8HWYv5OIhR3YOXBCKWniuUEG04j5JndgVFGclZH/cxhMp5qZNPgjBG
	 tkQ8CFvo1VJH7bRXiIFn6JsfN6M7aocjDXE4zAfV3PL77EAoWMopd8InecWcggQMGb
	 G/KyXlKw/sbJ3MWUV3aSc7rBxbm1BdGtwBJnDwXBzdNtbYbpBS5NqLduBh7E/Sqd0N
	 VrgH/F3CyC9bgkmQjxAMKDrfJs0Dod28RMFOKp9NVmud4B9FWyqCP8sa+fRQMHQzEg
	 Y6Z+jw/PppHjycOBkUXmzMtPVVrkFknw45+fxBk4LcyVmrenW6UnelRoxxeiXR8SuE
	 3+i0n8awUcE/A==
Received: (from www@localhost)
	by webmail.leidinger.net (8.14.4/8.13.8/Submit) id o5M7DeER071507;
	Tue, 22 Jun 2010 09:13:40 +0200 (CEST)
	(envelope-from Alexander@Leidinger.net)
Received: from
	pslux.ec.europa.eu (pslux.ec.europa.eu [158.169.9.14])
	by webmail.leidinger.net (Horde Framework) with HTTP;
	Tue, 22 Jun 2010 09:13:40 +0200
Message-ID: <20100622091340.25034svc6uz3k4g0@webmail.leidinger.net>
Date: Tue, 22 Jun 2010 09:13:40 +0200
From: Alexander Leidinger
To: Kostik Belousov
References: <20100621125825.GG13238@deviant.kiev.zoral.com.ua>
	<201006211030.55327.jhb@freebsd.org>
	<20100621184928.GI13238@deviant.kiev.zoral.com.ua>
In-Reply-To: <20100621184928.GI13238@deviant.kiev.zoral.com.ua>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; DelSp="Yes"; format="flowed"
Content-Disposition: inline
Content-Transfer-Encoding: 7bit
User-Agent: Dynamic Internet Messaging Program (DIMP) H3 (1.1.4)
X-EBL-MailScanner-Information: Please contact the ISP for more information
X-EBL-MailScanner-ID: EA8FA844042.A6CC9
X-EBL-MailScanner: Found to be clean
X-EBL-MailScanner-SpamCheck: not spam, spamhaus-ZEN, SpamAssassin (not cached,
	score=-1.1, required 6, autolearn=disabled, ALL_TRUSTED -1.00,
	DKIM_SIGNED 0.10, DKIM_VALID -0.10, DKIM_VALID_AU -0.10)
X-EBL-MailScanner-From: alexander@leidinger.net
X-EBL-MailScanner-Watermark: 1277795625.41418@tFTYTAbB3VzeSsOT1qqv4g
X-EBL-Spam-Status: No
Cc: freebsd-fs@freebsd.org, alc@freebsd.org, fs@freebsd.org, pho@freebsd.org
Subject: Re: Tmpfs elimination of double-copy
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 22 Jun 2010 07:13:48 -0000

Quoting Kostik Belousov (from Mon, 21 Jun 2010 21:49:28 +0300):

> Tmpfs uses OBJT_SWAP object to keep the data pages for the files.
> Current code allocates another object of type OBJT_VNODE, assigned
> to vp->v_object, to satisfy VM interface for mapping the file, using
> vnode_create_vobject. The objects do not share the pages (I do not think
> this can be easily achieved without serious changes to VM).
> Thus most,
> if not all, the data is present in two sets of pages.
>
> When such file is written to, tmpfs copies user buffer both to the swap
> object, and to the v_object.
>
> Patch I posted assigns the swap object to the vp->v_object. I had to
> make small change to vm_mmap_vnode() to not allocate the vnode pager
> and to not increment vnode use counter when v_object is the swap
> object.

Did you measure the performance before/after? If not, what are your
performance expectations? I don't expect we get double the
performance, but if every data of a write is copied twice, I would
guess there is a measurable benefit.

Bye,
Alexander.

-- 
At work, the authority of a person is inversely proportional to the
number of pens that person is carrying.

http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7
http://www.FreeBSD.org netchild @ FreeBSD.org : PGP ID = 72077137

From owner-freebsd-fs@FreeBSD.ORG Tue Jun 22 08:10:11 2010
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id 462D71065670;
	Tue, 22 Jun 2010 08:10:11 +0000 (UTC)
	(envelope-from kostikbel@gmail.com)
Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200])
	by mx1.freebsd.org (Postfix) with ESMTP id A691F8FC13;
	Tue, 22 Jun 2010 08:10:09 +0000 (UTC)
Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148])
	by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id o5M8A6Kk027434
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Tue, 22 Jun 2010 11:10:06 +0300 (EEST)
	(envelope-from kostikbel@gmail.com)
Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1])
	by deviant.kiev.zoral.com.ua (8.14.4/8.14.4) with ESMTP id o5M8A506024451;
	Tue, 22 Jun 2010 11:10:05 +0300 (EEST)
	(envelope-from kostikbel@gmail.com)
Received: (from kostik@localhost)
	by deviant.kiev.zoral.com.ua (8.14.4/8.14.4/Submit) id o5M8A5SB024450;
	Tue, 22 Jun 2010 11:10:05 +0300 (EEST)
	(envelope-from kostikbel@gmail.com)
X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to
	kostikbel@gmail.com using -f
Date: Tue, 22 Jun 2010 11:10:05 +0300
From: Kostik Belousov
To: Alexander Leidinger
Message-ID: <20100622081005.GQ13238@deviant.kiev.zoral.com.ua>
References: <20100621125825.GG13238@deviant.kiev.zoral.com.ua>
	<201006211030.55327.jhb@freebsd.org>
	<20100621184928.GI13238@deviant.kiev.zoral.com.ua>
	<20100622091340.25034svc6uz3k4g0@webmail.leidinger.net>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="XxErtE42FmiGaS4u"
Content-Disposition: inline
In-Reply-To: <20100622091340.25034svc6uz3k4g0@webmail.leidinger.net>
User-Agent: Mutt/1.4.2.3i
X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua
X-Virus-Status: Clean
X-Spam-Status: No, score=-2.3 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_50,
	DNS_FROM_OPENWHOIS autolearn=no version=3.2.5
X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on
	skuns.kiev.zoral.com.ua
Cc: freebsd-fs@freebsd.org, alc@freebsd.org, fs@freebsd.org, pho@freebsd.org
Subject: Re: Tmpfs elimination of double-copy
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 22 Jun 2010 08:10:11 -0000

--XxErtE42FmiGaS4u
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Tue, Jun 22, 2010 at 09:13:40AM +0200, Alexander Leidinger wrote:
> Quoting Kostik Belousov (from Mon, 21 Jun 2010 21:49:28 +0300):
>
> >Tmpfs uses OBJT_SWAP object to keep the data pages for the files.
> >Current code allocates another object of type OBJT_VNODE, assigned
> >to vp->v_object, to satisfy VM interface for mapping the file, using
> >vnode_create_vobject. The objects do not share the pages (I do not think
> >this can be easily achieved without serious changes to VM). Thus most,
> >if not all, the data is present in two sets of pages.
> >
> >When such file is written to, tmpfs copies user buffer both to the swap
> >object, and to the v_object.
> >
> >Patch I posted assigns the swap object to the vp->v_object. I had to
> >make small change to vm_mmap_vnode() to not allocate the vnode pager
> >and to not increment vnode use counter when v_object is the swap
> >object.
>
> Did you measure the performance before/after? If not, what are your
> performance expectations? I don't expect we get double the
> performance, but if every data of a write is copied twice, I would
> guess there is a measurable benefit.

No, I did not bother.
Real benefit of the change is the memory saving.

--XxErtE42FmiGaS4u
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (FreeBSD)

iEYEARECAAYFAkwgb90ACgkQC3+MBN1Mb4hcjACgvCN4lP/CaDtJXtaSzu5mfZr7
ohUAnj3IZNduld24zY8kys7dZ4OQBahg
=IicH
-----END PGP SIGNATURE-----

--XxErtE42FmiGaS4u--

From owner-freebsd-fs@FreeBSD.ORG Tue Jun 22 08:30:08 2010
Return-Path:
Delivered-To: fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34])
	by hub.freebsd.org (Postfix) with ESMTP id AF412106566C;
	Tue, 22 Jun 2010 08:30:08 +0000 (UTC)
	(envelope-from alexander@leidinger.net)
Received: from mail.ebusiness-leidinger.de (mail.ebusiness-leidinger.de [217.11.53.44])
	by mx1.freebsd.org (Postfix) with ESMTP id 38E2A8FC1E;
	Tue, 22 Jun 2010 08:30:08 +0000 (UTC)
Received: from outgoing.leidinger.net (pD9E2C147.dip.t-dialin.net [217.226.193.71])
	by mail.ebusiness-leidinger.de (Postfix) with ESMTPSA id E3E9B844042;
	Tue, 22 Jun 2010 10:30:04 +0200 (CEST)
Received: from webmail.leidinger.net (webmail.leidinger.net [192.168.1.102])
	by outgoing.leidinger.net (Postfix) with ESMTP id AC4A2546D;
	Tue, 22 Jun 2010 10:30:01 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=Leidinger.net;
	s=outgoing-alex; t=1277195401;
	bh=FBlOADY0t07r3Zl2Elc79ztGJ4XtUjCZOzT9pf1qKos=;
	h=Message-ID:Date:From:To:Cc:Subject:References:In-Reply-To:
	 MIME-Version:Content-Type:Content-Transfer-Encoding;
	b=AG8FAUAIfI9z7bw3RsHieCDeKty6NpCKyr7i3OSJhDSNaCT9UwA6ENMcpGceC8vhE
	 sSbn25KLQyryq09Pqf68wYJo37d5dORrBTMxhYS8eJjE/hsw7OTfo4l6R+3IX8fI1O
	 MZJFpem7Ca7zTYZVC68qA6uFWVYGdUa8CUBY352XVHPzox5Uv3ywZGXU4rR5ik7bod
	 FKkSxBnzOFGj1nNd4lLq3KTVwQ21+SNN2CPZ4M/BqsWYjGoDHWKwtRj/3mNmlwJyO5
QkuqM0ZlISTyqiGbQIoxmMbgL0Nql511YGw6HwbfTuSuVOQtt4hBaHak3r2sIL8/rL F3ingWAXgVd8g== Received: (from www@localhost) by webmail.leidinger.net (8.14.4/8.13.8/Submit) id o5M8U1Rs089212; Tue, 22 Jun 2010 10:30:01 +0200 (CEST) (envelope-from Alexander@Leidinger.net) Received: from pslux.ec.europa.eu (pslux.ec.europa.eu [158.169.9.14]) by webmail.leidinger.net (Horde Framework) with HTTP; Tue, 22 Jun 2010 10:30:01 +0200 Message-ID: <20100622103001.12481jemueuswkn4@webmail.leidinger.net> Date: Tue, 22 Jun 2010 10:30:01 +0200 From: Alexander Leidinger To: Kostik Belousov References: <20100621125825.GG13238@deviant.kiev.zoral.com.ua> <201006211030.55327.jhb@freebsd.org> <20100621184928.GI13238@deviant.kiev.zoral.com.ua> <20100622091340.25034svc6uz3k4g0@webmail.leidinger.net> <20100622081005.GQ13238@deviant.kiev.zoral.com.ua> In-Reply-To: <20100622081005.GQ13238@deviant.kiev.zoral.com.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: 7bit User-Agent: Dynamic Internet Messaging Program (DIMP) H3 (1.1.4) X-EBL-MailScanner-Information: Please contact the ISP for more information X-EBL-MailScanner-ID: E3E9B844042.A6975 X-EBL-MailScanner: Found to be clean X-EBL-MailScanner-SpamCheck: not spam, spamhaus-ZEN, SpamAssassin (not cached, score=-1.023, required 6, autolearn=disabled, ALL_TRUSTED -1.00, DKIM_SIGNED 0.10, DKIM_VALID -0.10, DKIM_VALID_AU -0.10, TW_FS 0.08) X-EBL-MailScanner-From: alexander@leidinger.net X-EBL-MailScanner-Watermark: 1277800206.38467@V8IP+APZLx1uL29TZcNkOw X-EBL-Spam-Status: No Cc: freebsd-fs@freebsd.org, alc@freebsd.org, fs@freebsd.org, pho@freebsd.org Subject: Re: Tmpfs elimination of double-copy X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jun 2010 08:30:08 -0000 Quoting Kostik Belousov (from Tue, 22 Jun 
2010 11:10:05 +0300): > On Tue, Jun 22, 2010 at 09:13:40AM +0200, Alexander Leidinger wrote: >> Did you measure the performance before/after? If not, what are your >> performance expectations? I don't expect we get double the >> performance, but if every data of a write is copied twice, I would >> guess there is a measurable benefit. > No, I did not bother. The real benefit of the change is the memory saving. For me the real benefit is that it survives an fsx run now. Anyone can buy more memory and faster machines, but stability... This does not mean I do not appreciate the memory saving (when the change hits one of my machines, I may decide to use tmpfs in places where I didn't use it before because of memory size concerns). That being said, I'm sure that mentioning the performance aspect in addition to the fsx and memory parts would be good in the release notes (and/or in someone's blog post or similar). Bye, Alexander. -- When I demanded of my friend what viands he preferred, He quoth: "A large cold bottle, and a small hot bird!" 
-- Eugene Field, "The Bottle and the Bird" http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7 http://www.FreeBSD.org netchild @ FreeBSD.org : PGP ID = 72077137 From owner-freebsd-fs@FreeBSD.ORG Tue Jun 22 09:51:16 2010 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BC46D1065670; Tue, 22 Jun 2010 09:51:16 +0000 (UTC) (envelope-from gleb.kurtsou@gmail.com) Received: from mail-bw0-f54.google.com (mail-bw0-f54.google.com [209.85.214.54]) by mx1.freebsd.org (Postfix) with ESMTP id 1BEF38FC08; Tue, 22 Jun 2010 09:51:15 +0000 (UTC) Received: by bwz8 with SMTP id 8so1967681bwz.13 for ; Tue, 22 Jun 2010 02:51:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:date:from:to:cc:subject :message-id:references:mime-version:content-type:content-disposition :in-reply-to:user-agent; bh=LLE8Po9oFLHOLEEUyJAVsaIJ3xQ++kTmTNUzRP9VOqk=; b=f5YnnUdNclJ0DpTqdoZVQopLWfLQ3Zvn9v35ekBcLkz+zzQ1Y06nowgMh289aeFUmh gUThkdx3i/HlgUo04a0sq1p0+Gh4183UCb4AscO26JsHi+tIg07FIeLvh1CN0x2u+sAh KLaai1zKByPxzRqE+fQjwSZBXrxroa3wx+qKM= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=date:from:to:cc:subject:message-id:references:mime-version 
:content-type:content-disposition:in-reply-to:user-agent; b=XpUU5Tc2LvE2coYIU9Ghx74+QWu2GwNq90Iic5Vzc5pKaB5BZqlnjL5QA0sSYDQeS4 j7V62ksdFEQkwjs1k2Znp/PyjAz2thSBGFHMGuPyo6bAgma53Ha3Hx7JBuHqJyI4+L70 /chHVNCphIxNApTSlHjEVk0M/Y8L3ySoOCyWw= Received: by 10.204.73.195 with SMTP id r3mr3932255bkj.147.1277198438387; Tue, 22 Jun 2010 02:20:38 -0700 (PDT) Received: from localhost ([212.98.186.134]) by mx.google.com with ESMTPS id jx10sm9689320bkb.33.2010.06.22.02.20.36 (version=TLSv1/SSLv3 cipher=RC4-MD5); Tue, 22 Jun 2010 02:20:37 -0700 (PDT) Date: Tue, 22 Jun 2010 12:20:44 +0300 From: Gleb Kurtsou To: Kostik Belousov Message-ID: <20100622092044.GA2958@tops> References: <20100621125825.GG13238@deviant.kiev.zoral.com.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <20100621125825.GG13238@deviant.kiev.zoral.com.ua> User-Agent: Mutt/1.5.20 (2009-06-14) Cc: alc@freebsd.org, pho@freebsd.org, fs@freebsd.org Subject: Re: Tmpfs elimination of double-copy X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jun 2010 09:51:16 -0000 On (21/06/2010 15:58), Kostik Belousov wrote: > Hi, > Below is the patch that eliminates second copy of the data kept by tmpfs > in case a file is mapped. Also, it removes potential deadlocks due to > tmpfs doing copyin/out while page is busy. It is possible that patch > also fixes known issue with sendfile(2) of tmpfs file, but I did not > verified this. What is that sendfile issue you refer to? 
I thought it was fixed: http://svn.freebsd.org/viewvc/base?view=revision&revision=197850 > > Patch essentially consists of three parts: > - move of vm_object' vnp_size from the type-discriminated union to the > vm_object proper; > - making vm not choke when vm object held in the struct vnode' v_object > is default or swap object instead of vnode object; > - use of the swap object that keeps data for tmpfs VREG file, also as > v_object. > > Peter Holm helped me with the patch, apparently we survive fsx and stress2. There is a race issue in tmpfs_rename(). It can be easily provoked with blogbench. I don't remember the details, but it seems the node from fdvp can disappear during the call. UFS-style exclusive locking of all vnodes in VOP_RENAME() would probably work well for tmpfs. If you have ideas on how to fix it, I'll find time to work on it; it's somewhat related to my dircache project. Thanks, Gleb. From owner-freebsd-fs@FreeBSD.ORG Tue Jun 22 10:35:10 2010 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0013E106566B; Tue, 22 Jun 2010 10:35:09 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id 2C6BD8FC08; Tue, 22 Jun 2010 10:35:08 +0000 (UTC) Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id o5MAZ4H4041152 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 22 Jun 2010 13:35:04 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4) with ESMTP id o5MAZ4pm077578; Tue, 22 Jun 2010 13:35:04 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4/Submit) id o5MAZ4Gh077568; Tue, 22 Jun 
2010 13:35:04 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Tue, 22 Jun 2010 13:35:04 +0300 From: Kostik Belousov To: Gleb Kurtsou Message-ID: <20100622103504.GW13238@deviant.kiev.zoral.com.ua> References: <20100621125825.GG13238@deviant.kiev.zoral.com.ua> <20100622092044.GA2958@tops> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="CN3FGaqzfdtlPLTf" Content-Disposition: inline In-Reply-To: <20100622092044.GA2958@tops> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-2.3 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_50, DNS_FROM_OPENWHOIS autolearn=no version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: alc@freebsd.org, pho@freebsd.org, fs@freebsd.org Subject: Re: Tmpfs elimination of double-copy X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 22 Jun 2010 10:35:10 -0000 --CN3FGaqzfdtlPLTf Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Jun 22, 2010 at 12:20:44PM +0300, Gleb Kurtsou wrote: > On (21/06/2010 15:58), Kostik Belousov wrote: > > Hi, > > Below is the patch that eliminates second copy of the data kept by tmpfs > > in case a file is mapped. Also, it removes potential deadlocks due to > > tmpfs doing copyin/out while page is busy. It is possible that patch > > also fixes known issue with sendfile(2) of tmpfs file, but I did not > > verified this. > What is that sendfile issue you refer to? 
I thought it was fixed: > http://svn.freebsd.org/viewvc/base?view=revision&revision=197850 As I said, I did not look at the issue and did not verify it. > > > > > Patch essentially consists of three parts: > > - move of vm_object' vnp_size from the type-discriminated union to the > > vm_object proper; > > - making vm not choke when vm object held in the struct vnode' v_object > > is default or swap object instead of vnode object; > > - use of the swap object that keeps data for tmpfs VREG file, also as > > v_object. > > > > Peter Holm helped me with the patch, apparently we survive fsx and stress2. > There is a race issue in tmpfs_rename(). It can be easily provoked with > blogbench. I don't remember the details, but it seems the node from fdvp can > disappear during the call. UFS-style exclusive locking of all vnodes in > VOP_RENAME() would probably work well for tmpfs. If you have ideas on how to > fix it, I'll find time to work on it; it's somewhat related to my > dircache project. I believe there is still an issue in tmpfs_lookup(), but I have not looked at it for a long time. 
--CN3FGaqzfdtlPLTf Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.10 (FreeBSD) iEYEARECAAYFAkwgkdgACgkQC3+MBN1Mb4g6BACeKQm0DPZxlqRALx3HG48m5Mna P24AoOFdhp8bcxSqenmWQQRI3UTBQt6m =eKet -----END PGP SIGNATURE----- --CN3FGaqzfdtlPLTf-- From owner-freebsd-fs@FreeBSD.ORG Wed Jun 23 11:58:34 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 89A2C106566C for ; Wed, 23 Jun 2010 11:58:34 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail07.syd.optusnet.com.au (mail07.syd.optusnet.com.au [211.29.132.188]) by mx1.freebsd.org (Postfix) with ESMTP id 26BA58FC17 for ; Wed, 23 Jun 2010 11:58:33 +0000 (UTC) Received: from c122-106-145-229.carlnfd1.nsw.optusnet.com.au (c122-106-145-229.carlnfd1.nsw.optusnet.com.au [122.106.145.229]) by mail07.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id o5NBwOZg029166 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 23 Jun 2010 21:58:26 +1000 Date: Wed, 23 Jun 2010 21:58:24 +1000 (EST) From: Bruce Evans X-X-Sender: bde@delplex.bde.org To: Bruce Cran In-Reply-To: <201006211847.27788.bruce@cran.org.uk> Message-ID: <20100623213423.Y45477@delplex.bde.org> References: <201006211847.27788.bruce@cran.org.uk> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@freebsd.org Subject: Re: Patch to fix reported mountpoint when mounting dirty filesystem X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jun 2010 11:58:34 -0000 On Mon, 21 Jun 2010, Bruce Cran wrote: > I've been investigating PR bin/19683 which is about the fact that when a R/W > mount is denied due to the filesystem being dirty, the reported mountpoint is > that which has 
been recorded in the superblock, not the directory that it's > currently being mounted on. I've attached a potential patch but I'm not sure > if f_mntonname is valid at the first print statement. I think it is, but this is unclear. Even if it is always initialized, it might be the wrong name to use. (ISTR cases where MNT_RELOAD and/or MNT_UPDATE change names from one alias to another. Usually the new name is better but this is not clear. Anyway, all uses of the name except the final recording of it in the superblock involve an error, and it is hard to think of a case where the new name is not relevant (the old name might be relevant too).) The old name is used in several other contexts that probably need the same change. E.g., in the error message just after the ones that you changed. If none are left that actually need to know the old name, then it would be clearer to change the old name in the superblock at the beginning of ffs_mount() instead of at the end of ffs_mountfs(). I think the change is discarded unless the mount succeeds. > Also, I tried to > replicate the problem with ext2fs but despite the code being present which > prints the mountpoint, that code was never hit so I'm not sure if it needs > updated too? ext2fs has the same problem. Other clones of ffs that keep the old name in the superblock probably have the same problem. Several read-only clones of ffs don't have the problem since they don't write, so they can't have a writable name in a superblock. BTW, ffs_mountfs() is now bogus, and its comment "/* common code for mount and mountroot */" is even more bogus. `mountroot' no longer exists so ffs_mountfs() cannot be common for it. ffs_mountfs() is now only called once near the end of ffs_mount() where it could easily be inlined. Its splitting apparently dates from when it was common code for ffs and mfs. 
Bruce From owner-freebsd-fs@FreeBSD.ORG Wed Jun 23 13:48:08 2010 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 78E891065800; Wed, 23 Jun 2010 13:48:08 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail06.syd.optusnet.com.au (mail06.syd.optusnet.com.au [211.29.132.187]) by mx1.freebsd.org (Postfix) with ESMTP id 08D1F8FC15; Wed, 23 Jun 2010 13:48:07 +0000 (UTC) Received: from c122-106-145-229.carlnfd1.nsw.optusnet.com.au (c122-106-145-229.carlnfd1.nsw.optusnet.com.au [122.106.145.229]) by mail06.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id o5NDm3tI023159 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 23 Jun 2010 23:48:05 +1000 Date: Wed, 23 Jun 2010 23:48:03 +1000 (EST) From: Bruce Evans X-X-Sender: bde@delplex.bde.org To: Alexander Leidinger In-Reply-To: <20100622103001.12481jemueuswkn4@webmail.leidinger.net> Message-ID: <20100623233917.N45555@delplex.bde.org> References: <20100621125825.GG13238@deviant.kiev.zoral.com.ua> <201006211030.55327.jhb@freebsd.org> <20100621184928.GI13238@deviant.kiev.zoral.com.ua> <20100622091340.25034svc6uz3k4g0@webmail.leidinger.net> <20100622081005.GQ13238@deviant.kiev.zoral.com.ua> <20100622103001.12481jemueuswkn4@webmail.leidinger.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-fs@FreeBSD.org, alc@FreeBSD.org, fs@FreeBSD.org, pho@FreeBSD.org Subject: Re: Tmpfs elimination of double-copy X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 23 Jun 2010 13:48:08 -0000 On Tue, 22 Jun 2010, Alexander Leidinger wrote: > Quoting Kostik Belousov (from Tue, 22 Jun 2010 11:10:05 > +0300): > >> On Tue, Jun 22, 2010 at 09:13:40AM +0200, Alexander Leidinger wrote: > >>> Did you measure the 
performance before/after? If not, what are your >>> performance expectations? I don't expect we get double the >>> performance, but if every data of a write is copied twice, I would >>> guess there is a measurable benefit. >> No, I did not bother. The real benefit of the change is the memory saving. > > For me the real benefit is that it survives an fsx run now. Anyone can buy > more memory and faster machines, but stability... It's not so easy to buy machines that are faster by enough to compensate for the thrashing of caches caused by extra memory accesses. > This does not mean I do not appreciate the memory saving (when the change > hits one of my machines, I may decide to use tmpfs in places where I didn't > use it before because of memory size concerns). > > That being said, I'm sure that mentioning the performance aspect additionally > to the fsx and memory parts may be good in the release notes (and/or a > blog/whatever post of someone). How much performance does it give anyway? I would guess a negative amount compared with an async-mounted ffs, at least if it double buffers everything, since the double buffering would halve the amount of memory available for caching files. 
Bruce From owner-freebsd-fs@FreeBSD.ORG Fri Jun 25 23:17:12 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C76F4106575B for ; Fri, 25 Jun 2010 23:17:12 +0000 (UTC) (envelope-from peterjeremy@acm.org) Received: from mail17.syd.optusnet.com.au (mail17.syd.optusnet.com.au [211.29.132.198]) by mx1.freebsd.org (Postfix) with ESMTP id 5A5818FC1A for ; Fri, 25 Jun 2010 23:17:11 +0000 (UTC) Received: from server.vk2pj.dyndns.org (c211-30-160-13.belrs4.nsw.optusnet.com.au [211.30.160.13]) by mail17.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id o5PNH90V002065 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO) for ; Sat, 26 Jun 2010 09:17:10 +1000 X-Bogosity: Ham, spamicity=0.000000 Received: from server.vk2pj.dyndns.org (localhost.vk2pj.dyndns.org [127.0.0.1]) by server.vk2pj.dyndns.org (8.14.4/8.14.4) with ESMTP id o5PNH8O2037205 for ; Sat, 26 Jun 2010 09:17:08 +1000 (EST) (envelope-from peter@server.vk2pj.dyndns.org) Received: (from peter@localhost) by server.vk2pj.dyndns.org (8.14.4/8.14.4/Submit) id o5PNH8jD037204 for freebsd-fs@freebsd.org; Sat, 26 Jun 2010 09:17:08 +1000 (EST) (envelope-from peter) Date: Sat, 26 Jun 2010 09:17:08 +1000 From: Peter Jeremy To: freebsd-fs@freebsd.org Message-ID: <20100625231708.GB29793@server.vk2pj.dyndns.org> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="RnlQjJ0d97Da+TV1" Content-Disposition: inline X-PGP-Key: http://members.optusnet.com.au/peterjeremy/pubkey.asc User-Agent: Mutt/1.5.20 (2009-06-14) Subject: mdconfig on ZFS leaks disk space X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 25 Jun 2010 23:17:12 -0000 --RnlQjJ0d97Da+TV1 Content-Type: text/plain; charset=us-ascii 
Content-Disposition: inline Content-Transfer-Encoding: quoted-printable I recently did a quick experiment to create an 8TB UFS filesystem via mdconfig and after destroying the md and deleting the file, the disk space used by the md was not returned - even after a reboot. Has anyone else seen this? I was using an 8.1-prelease/amd64 with everything on ZFS v14 and did: # truncate -s 8T /tmp/space # mdconfig -a -t vnode -f /tmp/space # newfs /dev/md0 /dev/md0: 8388608.0MB (17179869184 sectors) block size 16384, fragment size 2048 using 45661 cylinder groups of 183.72MB, 11758 blks, 23552 inodes. This occupied ~450MB on /tmp which uses lzjb compression. # fsck -t ufs /dev/md0 needed ~550MB VSZ and had ~530MB resident by the end. # mount /dev/md0 /mnt # df -k /mnt /dev/md0 8319620678 4 7654051020 0% 2 1075407868 0% /mnt I then copied a random collection of files into /mnt, boosting the size of /tmp/space to ~880MB. # umount /mnt # fsck -t ufs /dev/md0 # mdconfig -d -u 0 # rm /tmp/space At this point, 'df' on /tmp reported 881MB used whilst 'du' on /tmp reported 1MB used. lsof showed no references to the space. Whilst there were snapshots of /tmp, none had been taken since /tmp/space was created. I deleted them anyway to no effect. Rebooting the system had no effect. I eventually recovered the space by doing a "zfs destroy zroot/tmp" and re-creating it. This showed the free space in the pool increased by exactly the amount of extraneous space. 
-- Peter Jeremy --RnlQjJ0d97Da+TV1 Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (FreeBSD) iEYEARECAAYFAkwlOPQACgkQ/opHv/APuIeDpQCgk13kJD/l+/lr2Xj5naz1Pv0l sMAAn1rZQJU12px54f9uvLEFpUpouvsZ =ETd9 -----END PGP SIGNATURE----- --RnlQjJ0d97Da+TV1-- From owner-freebsd-fs@FreeBSD.ORG Sat Jun 26 12:22:27 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4EAAE106564A for ; Sat, 26 Jun 2010 12:22:27 +0000 (UTC) (envelope-from freebsd-listen@fabiankeil.de) Received: from smtprelay03.ispgateway.de (smtprelay03.ispgateway.de [80.67.29.28]) by mx1.freebsd.org (Postfix) with ESMTP id D34408FC12 for ; Sat, 26 Jun 2010 12:22:26 +0000 (UTC) Received: from [78.34.142.64] (helo=r500.local) by smtprelay03.ispgateway.de with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.68) (envelope-from ) id 1OSUDT-0006Lb-Uq for freebsd-fs@freebsd.org; Sat, 26 Jun 2010 14:10:36 +0200 Date: Sat, 26 Jun 2010 14:10:38 +0200 From: Fabian Keil To: freebsd-fs@freebsd.org Message-ID: <20100626141038.0d9f488a@r500.local> In-Reply-To: <20100625231708.GB29793@server.vk2pj.dyndns.org> References: <20100625231708.GB29793@server.vk2pj.dyndns.org> X-Mailer: Claws Mail 3.7.6 (GTK+ 2.20.1; amd64-portbld-freebsd9.0) X-PGP-KEY-URL: http://www.fabiankeil.de/gpg-keys/freebsd-listen-2008-08-18.asc Mime-Version: 1.0 Content-Type: multipart/signed; micalg=PGP-SHA1; boundary="Sig_/M5Kaz=3BE7Wt0QclwkMFXNC"; protocol="application/pgp-signature" X-Df-Sender: 775067 Subject: Re: mdconfig on ZFS leaks disk space X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 26 Jun 2010 12:22:27 -0000 --Sig_/M5Kaz=3BE7Wt0QclwkMFXNC Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 
quoted-printable Peter Jeremy wrote: > I recently did a quick experiment to create an 8TB UFS filesystem > via mdconfig and after destroying the md and deleting the file, > the disk space used by the md was not returned - even after a > reboot. Has anyone else seen this? > > I was using an 8.1-prelease/amd64 with everything on ZFS v14 and did: > > # truncate -s 8T /tmp/space > # mdconfig -a -t vnode -f /tmp/space > # newfs /dev/md0 > /dev/md0: 8388608.0MB (17179869184 sectors) block size 16384, fragment size 2048 > using 45661 cylinder groups of 183.72MB, 11758 blks, 23552 inodes. > > This occupied ~450MB on /tmp which uses lzjb compression. > > # fsck -t ufs /dev/md0 > needed ~550MB VSZ and had ~530MB resident by the end. > > # mount /dev/md0 /mnt > # df -k /mnt > /dev/md0 8319620678 4 7654051020 0% 2 1075407868 0% /mnt > > I then copied a random collection of files into /mnt, boosting the > size of /tmp/space to ~880MB. > > # umount /mnt > # fsck -t ufs /dev/md0 > # mdconfig -d -u 0 > # rm /tmp/space > > At this point, 'df' on /tmp reported 881MB used whilst 'du' on /tmp > reported 1MB used. lsof showed no references to the space. Whilst > there were snapshots of /tmp, none had been taken since /tmp/space > was created. I deleted them anyway to no effect. I can't reproduce this with Martin Matuska's ZFS v16 patch: fk@r500 /tank/sparse-file-test $df -h ./ Filesystem Size Used Avail Capacity Mounted on tank/sparse-file-test 62G 932M 61G 1% /tank/sparse-file-test fk@r500 /tank/sparse-file-test $sudo rm space fk@r500 /tank/sparse-file-test $df -h ./ Filesystem Size Used Avail Capacity Mounted on tank/sparse-file-test 62G 96K 62G 0% /tank/sparse-file-test The pool is still v14. 
I thought I remembered reports on zfs-discuss@ about a known bug with
leaked disk space after deleting sparse files that's supposed to be
fixed in later ZFS versions, but so far I only found reports about
a similar problem with sparse volumes, so maybe I'm mistaken.

Fabian

--Sig_/M5Kaz=3BE7Wt0QclwkMFXNC Content-Type: application/pgp-signature; name=signature.asc Content-Disposition: attachment; filename=signature.asc -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.15 (FreeBSD) iEYEARECAAYFAkwl7lIACgkQBYqIVf93VJ0wRgCgy9mO7RzuvnhLNIWwJVZmCx9d 9eEAnA0rw2ppN0O81dfZlM4BhqtJpNg/ =LXO/ -----END PGP SIGNATURE----- --Sig_/M5Kaz=3BE7Wt0QclwkMFXNC-- From owner-freebsd-fs@FreeBSD.ORG Sat Jun 26 16:29:43 2010 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B284F106566C for ; Sat, 26 Jun 2010 16:29:43 +0000 (UTC) (envelope-from mickael.maillot@gmail.com) Received: from mail-ww0-f54.google.com (mail-ww0-f54.google.com [74.125.82.54]) by mx1.freebsd.org (Postfix) with ESMTP id 46D5B8FC12 for ; Sat, 26 Jun 2010 16:29:42 +0000 (UTC) Received: by wwb24 with SMTP id 24so3513682wwb.13 for ; Sat, 26 Jun 2010 09:29:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:received:in-reply-to :references:date:message-id:subject:from:to:cc:content-type :content-transfer-encoding; bh=yjDEW4PECMGxNjUboHEO163bb8NMr4yq97Nqj/S5d68=; b=bOFTNl29/HHZKc+GPGphRuKs78cGcCCrKgB5TQM5DMZ1CFuVhoQqlSn5sqd69w2F5T e7T3PDUkYZe3zPYKuwW8QqYL4tMoKj+YRI3V2Y1URId3U50uINpDIKHpPfwc5qPrYFD8 pbPLbaPX+VZUFe7Lxt0ZTAgw2H01aAeNP/+e8= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type:content-transfer-encoding; b=f5MeT5Xf3EhH6p/ZpPgjbZcjria4w9zf4fzcXnIgwTiyCDbxmafw5QTVDaLXRWxaCt 
5qCYgYtX4Yr1xuxV5L765DQSsfRNw/t7tAUlSlreY5qPkyI3VCsF6+yj1o4Bxz4fRPGu rKCyrtY9bmu4MdxuIChoUuU0M2HMuHeNT4vtQ= MIME-Version: 1.0 Received: by 10.227.128.213 with SMTP id l21mr1852944wbs.133.1277569781958; Sat, 26 Jun 2010 09:29:41 -0700 (PDT) Received: by 10.216.28.200 with HTTP; Sat, 26 Jun 2010 09:29:41 -0700 (PDT) In-Reply-To: <20100626141038.0d9f488a@r500.local> References: <20100625231708.GB29793@server.vk2pj.dyndns.org> <20100626141038.0d9f488a@r500.local> Date: Sat, 26 Jun 2010 18:29:41 +0200 Message-ID: From: =?ISO-8859-1?Q?Micka=EBl_Maillot?= To: Fabian Keil Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: mdconfig on ZFS leaks disk space X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 26 Jun 2010 16:29:43 -0000

What is your svn revision? I ask because r208869 ("Fix freeing space
after deleting large files with holes", dated Sun Jun 6 13:08:36 2010)
looks relevant here.

2010/6/26 Fabian Keil :
> Peter Jeremy wrote:
>
>> I recently did a quick experiment to create an 8TB UFS filesystem
>> via mdconfig and after destroying the md and deleting the file,
>> the disk space used by the md was not returned - even after a
>> reboot.  Has anyone else seen this?
>>
>> I was using a 8.1-prelease/amd64 with everything on ZFS v14 and did:
>>
>> # truncate -s 8T /tmp/space
>> # mdconfig -a -t vnode -f /tmp/space
>> # newfs /dev/md0
>> /dev/md0: 8388608.0MB (17179869184 sectors) block size 16384, fragment size 2048
>>         using 45661 cylinder groups of 183.72MB, 11758 blks, 23552 inodes.
>>
>> This occupied ~450MB on /tmp which uses lzjb compression.
>>
>> # fsck -t ufs /dev/md0
>> needed ~550MB VSZ and had ~530MB resident by the end.
>>
>> # mount /dev/md0 /mnt
>> # df -k /mnt
>> /dev/md0  8319620678  4 7654051020 0%  2 1075407868    0%   /mnt
>>
>> I then copied a random collection of files into /mnt, boosting the
>> size of /tmp/space to ~880MB.
>>
>> # umount /mnt
>> # fsck -t ufs /dev/md0
>> # mdconfig -d -u 0
>> # rm /tmp/space
>>
>> At this point, 'df' on /tmp reported 881MB used whilst 'du' on /tmp
>> report 1MB used.  lsof showed no references to the space.  Whilst
>> there were snapshots of /tmp, none had been taken since /tmp/space
>> was created.  I deleted them anyway to no effect.
>
> I can't reproduce this with Martin Matuska's ZFS v16 patch:
>
> fk@r500 /tank/sparse-file-test $df -h ./
> Filesystem               Size    Used   Avail Capacity  Mounted on
> tank/sparse-file-test     62G    932M     61G     1%    /tank/sparse-file-test
> fk@r500 /tank/sparse-file-test $sudo rm space
> fk@r500 /tank/sparse-file-test $df -h ./
> Filesystem               Size    Used   Avail Capacity  Mounted on
> tank/sparse-file-test     62G     96K     62G     0%    /tank/sparse-file-test
>
> The pool is still v14.
>
> I thought I remembered reports on zfs-discuss@ about a known bug with
> leaked disk space after deleting sparse files that's supposed to be
> fixed in later ZFS versions, but so far I only found reports about
> a similar problem with sparse volumes, so maybe I'm mistaken.
>
> Fabian
>