From owner-freebsd-fs@FreeBSD.ORG Sun Dec 13 16:28:16 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CC48D106566B for ; Sun, 13 Dec 2009 16:28:16 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (chello089077043238.chello.pl [89.77.43.238]) by mx1.freebsd.org (Postfix) with ESMTP id 156E38FC14 for ; Sun, 13 Dec 2009 16:28:15 +0000 (UTC) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 1D25745C9F; Sun, 13 Dec 2009 17:28:14 +0100 (CET) Received: from localhost (chello089077043238.chello.pl [89.77.43.238]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id C3CF345C99; Sun, 13 Dec 2009 17:28:08 +0100 (CET) Date: Sun, 13 Dec 2009 17:28:10 +0100 From: Pawel Jakub Dawidek To: Petri Helenius Message-ID: <20091213162810.GB2052@garage.freebsd.pl> References: <4B20C809.1000008@helenius.fi> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="vOmOzSkFvhd7u8Ms" Content-Disposition: inline In-Reply-To: <4B20C809.1000008@helenius.fi> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 9.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-0.6 required=4.5 tests=BAYES_00,RCVD_IN_SORBS_DUL autolearn=no version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and reordering drives X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 13 Dec 2009 16:28:16 -0000 --vOmOzSkFvhd7u8Ms Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Dec 10, 2009 at 12:06:01PM +0200, Petri Helenius wrote: >=20 > Hi, >=20 > Could you provide a pointer to the guid-check patch and if possible,=20 > have it in RELENG_8 too? The fix was merged to stable/8 as a part of r200362. --=20 Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! 
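For context on the thread subject: ZFS identifies pool members by the GUID recorded in each vdev label rather than by device name, which is why a pool is expected to survive its disks being recabled in a different order. A minimal sketch of the usual export/import cycle after reordering drives, assuming a hypothetical pool named tank:

    # Export the pool before moving or recabling the disks.
    zpool export tank

    # On import, ZFS probes the labels and matches vdevs by GUID,
    # so the new device names (da0, da1, ...) need not match the
    # old ordering.
    zpool import tank

    # Confirm that every vdev was found and is ONLINE.
    zpool status -v tank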
--vOmOzSkFvhd7u8Ms Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFLJRYaForvXbEpPzQRAisHAJ9N7r4thT3p4apqjzxnjy6AqRaMKgCg90OW 1s6fMvkUykzJX0q0RJkclJs= =VHkk -----END PGP SIGNATURE----- --vOmOzSkFvhd7u8Ms-- From owner-freebsd-fs@FreeBSD.ORG Sun Dec 13 16:31:09 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 408E5106568F; Sun, 13 Dec 2009 16:31:09 +0000 (UTC) (envelope-from pjd@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 182C78FC1B; Sun, 13 Dec 2009 16:31:09 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id nBDGV8rj067452; Sun, 13 Dec 2009 16:31:08 GMT (envelope-from pjd@freefall.freebsd.org) Received: (from pjd@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id nBDGV8Qh067448; Sun, 13 Dec 2009 16:31:08 GMT (envelope-from pjd) Date: Sun, 13 Dec 2009 16:31:08 GMT Message-Id: <200912131631.nBDGV8Qh067448@freefall.freebsd.org> To: mm@FreeBSD.org, pjd@FreeBSD.org, freebsd-fs@FreeBSD.org, pjd@FreeBSD.org From: pjd@FreeBSD.org Cc: Subject: Re: kern/141355: [zfs] [patch] zfs recv can fail with E2BIG X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 13 Dec 2009 16:31:09 -0000 Synopsis: [zfs] [patch] zfs recv can fail with E2BIG State-Changed-From-To: open->feedback State-Changed-By: pjd State-Changed-When: ndz 13 gru 2009 16:29:51 UTC State-Changed-Why: Is thispatch like that one: http://people.freebsd.org/~pjd/patches/zfs_recv_E2BIG.patch There was a report that it was causing a panic. Responsible-Changed-From-To: freebsd-fs->pjd Responsible-Changed-By: pjd Responsible-Changed-When: ndz 13 gru 2009 16:29:51 UTC Responsible-Changed-Why: I'll take this one. 
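For readers following kern/141355: the E2BIG failure shows up on the receiving side of a ZFS replication pipeline as "internal error: Argument list too long". A rough sketch of the kind of snapshot/send/receive sequence that exercises this path, with hypothetical pool and dataset names (tank, backup); exact flags depend on the ZFS version in use:

    # Snapshot the source dataset (recursively, to cover children).
    zfs snapshot -r tank/data@backup1

    # Replicate it into another pool.  With the bug present, the
    # receive side can abort with E2BIG instead of completing.
    zfs send tank/data@backup1 | zfs receive backup/data

    # Later, send only the changes since the previous snapshot.
    # -F rolls back any local changes on the target first.
    zfs snapshot -r tank/data@backup2
    zfs send -i @backup1 tank/data@backup2 | zfs receive -F backup/data

The full report is in the PR link that follows.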
http://www.freebsd.org/cgi/query-pr.cgi?pr=141355 From owner-freebsd-fs@FreeBSD.ORG Mon Dec 14 11:06:54 2009 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9F8A9106566B for ; Mon, 14 Dec 2009 11:06:54 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 6B6FD8FC1E for ; Mon, 14 Dec 2009 11:06:54 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id nBEB6sr1075927 for ; Mon, 14 Dec 2009 11:06:54 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id nBEB6rJo075925 for freebsd-fs@FreeBSD.org; Mon, 14 Dec 2009 11:06:53 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 14 Dec 2009 11:06:53 GMT Message-Id: <200912141106.nBEB6rJo075925@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-fs@FreeBSD.org Cc: Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Dec 2009 11:06:54 -0000 Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. 
Description -------------------------------------------------------------------------------- o kern/141387 fs [zfs] [patch] zfs snapshot -r failed because filesyste o kern/141305 fs [zfs] FreeBSD ZFS+sendfile severe performance issues ( o kern/141257 fs [gvinum] No puedo crear RAID5 por SW con gvinum o kern/141235 fs [disklabel] 8.0 no longer provides /dev entries for al o kern/141194 fs [tmpfs] tmpfs treats the size option as mod 2^32 o kern/141177 fs [zfs] fsync() on FIFO causes panic() on zfs o kern/141091 fs [patch] [nullfs] fix panics with DIAGNOSTIC enabled o kern/141086 fs [nfs] [panic] panic("nfs: bioread, not dir") on FreeBS o kern/141010 fs [zfs] "zfs scrub" fails when backed by files in UFS2 o kern/140888 fs [zfs] boot fail from zfs root while the pool resilveri o kern/140853 fs [nfs] [patch] NFSv2 remove calls fail to send error re o kern/140682 fs [netgraph] [panic] random panic in netgraph o kern/140661 fs [zfs] /boot/loader fails to work on a GPT/ZFS-only sys o kern/140640 fs [zfs] snapshot crash o kern/140433 fs [zfs] [panic] panic while replaying ZIL after crash o kern/140134 fs [msdosfs] write and fsck destroy filesystem integrity o kern/140068 fs [smbfs] [patch] smbfs does not allow semicolon in file o kern/139725 fs [zfs] zdb(1) dumps core on i386 when examining zpool c o kern/139715 fs [zfs] vfs.numvnodes leak on busy zfs o bin/139651 fs [nfs] mount(8): read-only remount of NFS volume does n o kern/139597 fs [patch] [tmpfs] tmpfs initializes va_gen but doesn't u o kern/139564 fs [zfs] [panic] 8.0-RC1 - Fatal trap 12 at end of shutdo o kern/139407 fs [smbfs] [panic] smb mount causes system crash if remot o kern/139363 fs [nfs] diskless root nfs mount from non FreeBSD server o kern/138790 fs [zfs] ZFS ceases caching when mem demand is high o kern/138524 fs [msdosfs] disks and usb flashes/cards with Russian lab o kern/138421 fs [ufs] [patch] remove UFS label limitations o kern/138367 fs [tmpfs] [panic] 'panic: Assertion pages > 0 failed' wh o kern/138202 fs mount_msdosfs(1) see only 2Gb o kern/138109 fs [extfs] [patch] Minor cleanups to the sys/gnu/fs/ext2f f kern/137037 fs [zfs] [hang] zfs rollback on root causes FreeBSD to fr o kern/136968 fs [ufs] [lor] ufs/bufwait/ufs (open) o kern/136945 fs [ufs] [lor] filedesc structure/ufs (poll) o kern/136944 fs [ffs] [lor] bufwait/snaplk (fsync) o kern/136873 fs [ntfs] Missing directories/files on NTFS volume o kern/136865 fs [nfs] [patch] NFS exports atomic and on-the-fly atomic o kern/136470 fs [nfs] Cannot mount / in read-only, over NFS o kern/135594 fs [zfs] Single dataset unresponsive with Samba o kern/135546 fs [zfs] zfs.ko module doesn't ignore zpool.cache filenam o kern/135469 fs [ufs] [panic] kernel crash on md operation in ufs_dirb o kern/135050 fs [zfs] ZFS clears/hides disk errors on reboot o kern/134491 fs [zfs] Hot spares are rather cold... 
o kern/133980 fs [panic] [ffs] panic: ffs_valloc: dup alloc o kern/133676 fs [smbfs] [panic] umount -f'ing a vnode-based memory dis o kern/133614 fs [panic] panic: ffs_truncate: read-only filesystem o kern/133174 fs [msdosfs] [patch] msdosfs must support utf-encoded int f kern/133150 fs [zfs] Page fault with ZFS on 7.1-RELEASE/amd64 while w o kern/132960 fs [ufs] [panic] panic:ffs_blkfree: freeing free frag o kern/132597 fs [tmpfs] [panic] tmpfs-related panic while interrupting o kern/132397 fs reboot causes filesystem corruption (failure to sync b o kern/132331 fs [ufs] [lor] LOR ufs and syncer o kern/132237 fs [msdosfs] msdosfs has problems to read MSDOS Floppy o kern/132145 fs [panic] File System Hard Crashes o kern/131995 fs [nfs] Failure to mount NFSv4 server o kern/131441 fs [unionfs] [nullfs] unionfs and/or nullfs not combineab o kern/131360 fs [nfs] poor scaling behavior of the NFS server under lo o kern/131342 fs [nfs] mounting/unmounting of disks causes NFS to fail o bin/131341 fs makefs: error "Bad file descriptor" on the mount poin o kern/130979 fs [smbfs] [panic] boot/kernel/smbfs.ko o kern/130920 fs [msdosfs] cp(1) takes 100% CPU time while copying file o kern/130229 fs [iconv] usermount fails on fs that need iconv o kern/130210 fs [nullfs] Error by check nullfs o kern/129760 fs [nfs] after 'umount -f' of a stale NFS share FreeBSD l o kern/129488 fs [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c: o kern/129231 fs [ufs] [patch] New UFS mount (norandom) option - mostly o kern/129152 fs [panic] non-userfriendly panic when trying to mount(8) o kern/129059 fs [zfs] [patch] ZFS bootloader whitelistable via WITHOUT f kern/128829 fs smbd(8) causes periodic panic on 7-RELEASE o kern/127659 fs [tmpfs] tmpfs memory leak o kern/127420 fs [gjournal] [panic] Journal overflow on gmirrored gjour o kern/127029 fs [panic] mount(8): trying to mount a write protected zi o kern/126287 fs [ufs] [panic] Kernel panics while mounting an UFS file s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS p kern/124621 fs [ext3] [patch] Cannot mount ext2fs partition f bin/124424 fs [zfs] zfs(8): zfs list -r shows strange snapshots' siz o kern/123939 fs [msdosfs] corrupts new files o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash o bin/122172 fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386, o bin/121898 fs [nullfs] pwd(1)/getcwd(2) fails with Permission denied o bin/121779 fs [ufs] snapinfo(8) (and related tools?) 
only work for t o bin/121366 fs [zfs] [patch] Automatic disk scrubbing from periodic(8 o bin/121072 fs [smbfs] mount_smbfs(8) cannot normally convert the cha f kern/120991 fs [panic] [fs] [snapshot] System crashes when manipulati o kern/120483 fs [ntfs] [patch] NTFS filesystem locking changes o kern/120482 fs [ntfs] [patch] Sync style changes between NetBSD and F f kern/119735 fs [zfs] geli + ZFS + samba starting on boot panics 7.0-B o kern/118912 fs [2tb] disk sizing/geometry problem with large array o kern/118713 fs [minidump] [patch] Display media size required for a k o bin/118249 fs mv(1): moving a directory changes its mtime o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N o bin/117315 fs [smbfs] mount_smbfs(8) and related options can't mount o kern/117314 fs [ntfs] Long-filename only NTFS fs'es cause kernel pani o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on o bin/116980 fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f o kern/116913 fs [ffs] [panic] ffs_blkfree: freeing free block p kern/116608 fs [msdosfs] [patch] msdosfs fails to check mount options o kern/116583 fs [ffs] [hang] System freezes for short time when using o kern/116170 fs [panic] Kernel panic when mounting /tmp o kern/115645 fs [snapshots] [panic] lockmgr: thread 0xc4c00d80, not ex o bin/115361 fs [zfs] mount(8) gets into a state where it won't set/un o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala o kern/114676 fs [ufs] snapshot creation panics: snapacct_ufs2: bad blo o bin/114468 fs [patch] [request] add -d option to umount(8) to detach o kern/113852 fs [smbfs] smbfs does not properly implement DFS referral o bin/113838 fs [patch] [request] mount(8): add support for relative p o bin/113049 fs [patch] [request] make quot(8) use getopt(3) and show o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b o kern/111843 fs [msdosfs] Long Names of files are incorrectly created o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems s bin/111146 fs [2tb] fsck(8) fails on 6T filesystem o kern/109024 fs [msdosfs] mount_msdosfs: msdosfs_iconv: Operation not o kern/109010 fs [msdosfs] can't mv directory within fat32 file system o bin/107829 fs [2TB] fdisk(8): invalid boundary checking in fdisk / w o kern/106030 fs [ufs] [panic] panic in ufs from geom when a dead disk o kern/104406 fs [ufs] Processes get stuck in "ufs" state under persist o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems o kern/103035 fs [ntfs] Directories in NTFS mounted disc images appear o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s o kern/99290 fs [ntfs] mount_ntfs ignorant of cluster sizes o kern/97377 fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c o kern/95222 fs [iso9660] File sections on ISO9660 level 3 CDs ignored o kern/94849 fs [ufs] rename on UFS filesystem is not atomic o kern/94769 fs [ufs] Multiple file deletions on multi-snapshotted fil o kern/94733 fs [smbfs] smbfs may cause double unlock o kern/93942 fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D o kern/92272 fs [ffs] [hang] Filling a filesystem while creating a sna f kern/91568 fs [ufs] [panic] writing to UFS/softupdates DVD media in o kern/91134 fs [smbfs] [patch] Preserve access and modification time a kern/90815 fs [smbfs] [patch] SMBFS with character conversions somet o kern/88657 fs [smbfs] windows client hang when browsing a samba shar o kern/88266 fs 
[smbfs] smbfs does not implement UIO_NOCOPY and sendfi o kern/87859 fs [smbfs] System reboot while umount smbfs. o kern/86587 fs [msdosfs] rm -r /PATH fails with lots of small files o kern/85326 fs [smbfs] [panic] saving a file via samba to an overquot o kern/84589 fs [2TB] 5.4-STABLE unresponsive during background fsck 2 o kern/80088 fs [smbfs] Incorrect file time setting on NTFS mounted vi o kern/73484 fs [ntfs] Kernel panic when doing `ls` from the client si o bin/73019 fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino o kern/71774 fs [ntfs] NTFS cannot "see" files on a WinXP filesystem o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange o kern/65901 fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr o kern/61503 fs [smbfs] mount_smbfs does not work as non-root o kern/55617 fs [smbfs] Accessing an nsmb-mounted drive via a smb expo o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc o kern/51583 fs [nullfs] [patch] allow to work with devices and socket o kern/36566 fs [smbfs] System reboot with dead smb mount and umount o kern/18874 fs [2TB] 32bit NFS servers export wrong negative values t 149 problems total. From owner-freebsd-fs@FreeBSD.ORG Mon Dec 14 15:47:57 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 06DEF1065692 for ; Mon, 14 Dec 2009 15:47:57 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (chello089077043238.chello.pl [89.77.43.238]) by mx1.freebsd.org (Postfix) with ESMTP id 2DD7D8FC2C for ; Mon, 14 Dec 2009 15:47:55 +0000 (UTC) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 5C85E45F43; Mon, 14 Dec 2009 16:47:53 +0100 (CET) Received: from localhost (pdawidek.wheel.pl [10.0.1.1]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id DEA0F45F59; Mon, 14 Dec 2009 16:47:47 +0100 (CET) Date: Mon, 14 Dec 2009 16:47:50 +0100 From: Pawel Jakub Dawidek To: Martin Matuska Message-ID: <20091214154750.GF1666@garage.freebsd.pl> References: <20091029205121.GB3418@garage.freebsd.pl> <9AA2C968-F09D-473D-BD13-F13B3F94ED60@sarenet.es> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="Wb5NtZlyOqqy58h0" Content-Disposition: inline In-Reply-To: <9AA2C968-F09D-473D-BD13-F13B3F94ED60@sarenet.es> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 9.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-5.9 required=4.5 tests=ALL_TRUSTED,BAYES_00 autolearn=ham version=3.0.4 Cc: freebsd-fs@freebsd.org, Ronald Klop Subject: Re: zfs receive gives: internal error: Argument list too long X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Dec 2009 15:47:57 -0000 --Wb5NtZlyOqqy58h0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Tue, Nov 03, 2009 at 12:08:54PM +0100, Borja Marcos wrote: >=20 > On Oct 29, 2009, at 9:51 PM, Pawel Jakub Dawidek wrote: >=20 > >On Wed, Oct 28, 2009 at 09:51:46PM +0100, Ronald Klop wrote: 
> >>Hi, > >> > >>I'm forwarding this, because there was no answer on freebsd-stable. > >> > >>Does anybody know about this and have some tips on how to solve it? > > > >Could you try this patch: > > > > http://people.freebsd.org/~pjd/patches/zfs_recv_E2BIG.patch >=20 > It's caused a panic for me on 8.0-RC2/amd64. Seems a new problem, =20 > never saw a panic in this situation before. >=20 > How to reproduce: With /usr/src and /usr/obj in a dataset, just >=20 > cd /usr/src > make clean >=20 > Instant panic, in less than 20 seconds. >=20 > Trying to get panic information, unfortunately I'm running on VMWare =20 > Fussion and the silly thing doesn't offer the equivalent of a serial =20 > console. Martin, this is the panic report I was refering to. Could you please try to reproduce it? Maybe first with my patch to confirm it is reproducible and then with your patch to confirm it has no such problem? I'd be very grateful if you could do that. I don't want something to go into the tree if there might be a problem with the patch. --=20 Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! --Wb5NtZlyOqqy58h0 Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFLJl4lForvXbEpPzQRAo44AJ0T3PAw6alWAGD6AhYk7zHZBBTdvQCcC+r6 FEvD2Mi1K1B98P8j+oOkFUY= =aYXU -----END PGP SIGNATURE----- --Wb5NtZlyOqqy58h0-- From owner-freebsd-fs@FreeBSD.ORG Mon Dec 14 16:08:38 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CF83E106566C; Mon, 14 Dec 2009 16:08:38 +0000 (UTC) (envelope-from borjam@sarenet.es) Received: from proxypop2.sarenet.es (proxypop2.sarenet.es [194.30.0.95]) by mx1.freebsd.org (Postfix) with ESMTP id 88F258FC13; Mon, 14 Dec 2009 16:08:38 +0000 (UTC) Received: from [172.16.1.204] (izaro.sarenet.es [192.148.167.11]) by proxypop2.sarenet.es (Postfix) with ESMTP id 344487351D; Mon, 14 Dec 2009 17:08:37 +0100 (CET) Mime-Version: 1.0 (Apple Message framework v1077) Content-Type: text/plain; charset=us-ascii From: Borja Marcos In-Reply-To: <20091214154750.GF1666@garage.freebsd.pl> Date: Mon, 14 Dec 2009 17:08:36 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: <495F94EF-8F57-440D-8810-F40E40DE69D5@sarenet.es> References: <20091029205121.GB3418@garage.freebsd.pl> <9AA2C968-F09D-473D-BD13-F13B3F94ED60@sarenet.es> <20091214154750.GF1666@garage.freebsd.pl> To: Pawel Jakub Dawidek X-Mailer: Apple Mail (2.1077) Cc: freebsd-fs@freebsd.org, Martin Matuska , Ronald Klop Subject: Re: zfs receive gives: internal error: Argument list too long X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Dec 2009 16:08:38 -0000 On Dec 14, 2009, at 4:47 PM, Pawel Jakub Dawidek wrote: > On Tue, Nov 03, 2009 at 12:08:54PM +0100, Borja Marcos wrote: >>=20 >> On Oct 29, 2009, at 9:51 PM, Pawel Jakub Dawidek wrote: >>=20 >> It's caused a panic for me on 8.0-RC2/amd64. Seems a new problem, =20 >> never saw a panic in this situation before. >>=20 >> How to reproduce: With /usr/src and /usr/obj in a dataset, just >>=20 >> cd /usr/src >> make clean >>=20 >> Instant panic, in less than 20 seconds. 
>>=20 >> Trying to get panic information, unfortunately I'm running on VMWare =20= >> Fussion and the silly thing doesn't offer the equivalent of a serial =20= >> console. >=20 > Martin, this is the panic report I was refering to. Could you please = try > to reproduce it? Maybe first with my patch to confirm it is = reproducible > and then with your patch to confirm it has no such problem? > I'd be very grateful if you could do that. I don't want something to = go > into the tree if there might be a problem with the patch. It was me, not Martin :) I will try to reproduce again. By the way, any news about the zfs = receive deadlock when accessing the target dataset? Borja. From owner-freebsd-fs@FreeBSD.ORG Mon Dec 14 18:46:40 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 448941065679 for ; Mon, 14 Dec 2009 18:46:40 +0000 (UTC) (envelope-from mail@petecurry.net) Received: from gopher.petecurry.net (gopher.petecurry.net [67.18.187.209]) by mx1.freebsd.org (Postfix) with ESMTP id 1FD738FC1B for ; Mon, 14 Dec 2009 18:46:40 +0000 (UTC) Received: from kiwi.petecurry.net (tx-71-48-169-80.dhcp.embarqhsd.net [71.48.169.80]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) (Authenticated sender: pete) by gopher.petecurry.net (Postfix) with ESMTPSA id 20A0D80A147; Mon, 14 Dec 2009 12:26:54 -0600 (CST) Received: by kiwi.petecurry.net (nbSMTP-1.00) for uid 1001 (using TLSv1/SSLv3 with cipher DHE-RSA-AES256-SHA (256/256 bits)) mail@petecurry.net; Mon, 14 Dec 2009 12:30:24 -0600 (CST) Date: Mon, 14 Dec 2009 12:30:24 -0600 From: Pete Curry To: Pawel Jakub Dawidek Message-ID: <20091214183024.GM17175@kiwi.petecurry.net> References: <20091029205121.GB3418@garage.freebsd.pl> <9AA2C968-F09D-473D-BD13-F13B3F94ED60@sarenet.es> <20091214154750.GF1666@garage.freebsd.pl> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20091214154750.GF1666@garage.freebsd.pl> User-Agent: Mutt/1.5.20 (2009-06-14) Cc: freebsd-fs@freebsd.org Subject: Re: zfs receive gives: internal error: Argument list too long X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Dec 2009 18:46:40 -0000 On Mon, Dec 14, 2009 at 04:47:50PM +0100, Pawel Jakub Dawidek wrote: > On Tue, Nov 03, 2009 at 12:08:54PM +0100, Borja Marcos wrote: > > > > On Oct 29, 2009, at 9:51 PM, Pawel Jakub Dawidek wrote: > > > > >On Wed, Oct 28, 2009 at 09:51:46PM +0100, Ronald Klop wrote: > > >>Hi, > > >> > > >>I'm forwarding this, because there was no answer on freebsd-stable. > > >> > > >>Does anybody know about this and have some tips on how to solve it? > > > > > >Could you try this patch: > > > > > > http://people.freebsd.org/~pjd/patches/zfs_recv_E2BIG.patch > > > > It's caused a panic for me on 8.0-RC2/amd64. Seems a new problem, > > never saw a panic in this situation before. > > > > How to reproduce: With /usr/src and /usr/obj in a dataset, just > > > > cd /usr/src > > make clean > > > > Instant panic, in less than 20 seconds. > > > > Trying to get panic information, unfortunately I'm running on VMWare > > Fussion and the silly thing doesn't offer the equivalent of a serial > > console. > > Martin, this is the panic report I was refering to. Could you please try > to reproduce it? 
Maybe first with my patch to confirm it is reproducible > and then with your patch to confirm it has no such problem? > I'd be very grateful if you could do that. I don't want something to go > into the tree if there might be a problem with the patch. > To add another report, I've been running your patch on one machine since late October/early November, and just installed another one with it a week ago. I haven't gotten any panics on either. I just did a make clean && make buildworld && make buildkernel && make clean (and ran the final make clean 9 times for good measure) on the second machine to try to reproduce the reported panic, but it worked fine. Both machines are running FreeBSD 8.0-STABLE/amd64 with ZFS from -CURRENT as of Dec 4, plus your patch. This patch is essentially mandatory for me to be able to run ZFS, since otherwise I can't backup my ZFS pools... - Pete Curry From owner-freebsd-fs@FreeBSD.ORG Mon Dec 14 22:00:35 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 48E061065692 for ; Mon, 14 Dec 2009 22:00:35 +0000 (UTC) (envelope-from mm@FreeBSD.org) Received: from mail.vx.sk (core.vx.sk [188.40.32.143]) by mx1.freebsd.org (Postfix) with ESMTP id C94578FC22 for ; Mon, 14 Dec 2009 22:00:34 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by mail.vx.sk (Postfix) with ESMTP id 21DD61A2DA; Mon, 14 Dec 2009 22:41:37 +0100 (CET) X-Virus-Scanned: amavisd-new at mail.vx.sk Received: from mail.vx.sk ([127.0.0.1]) by localhost (mail.vx.sk [127.0.0.1]) (amavisd-new, port 10024) with LMTP id 6Z5IHuGudQw9; Mon, 14 Dec 2009 22:41:35 +0100 (CET) Received: from [10.9.8.1] (chello089173000055.chello.sk [89.173.0.55]) by mail.vx.sk (Postfix) with ESMTPSA id C24D61A2BC; Mon, 14 Dec 2009 22:41:34 +0100 (CET) Message-ID: <4B26B08E.5000203@FreeBSD.org> Date: Mon, 14 Dec 2009 22:39:26 +0100 From: Martin Matuska User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; sk; rv:1.8.1.23) Gecko/20090812 Lightning/0.9 Thunderbird/2.0.0.23 Mnenhy/0.7.5.0 MIME-Version: 1.0 To: Borja Marcos References: <20091029205121.GB3418@garage.freebsd.pl> <9AA2C968-F09D-473D-BD13-F13B3F94ED60@sarenet.es> <20091214154750.GF1666@garage.freebsd.pl> <495F94EF-8F57-440D-8810-F40E40DE69D5@sarenet.es> In-Reply-To: <495F94EF-8F57-440D-8810-F40E40DE69D5@sarenet.es> X-Enigmail-Version: 0.96.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 8bit Cc: freebsd-fs@freebsd.org, Pawel Jakub Dawidek , Ronald Klop Subject: Re: zfs receive gives: internal error: Argument list too long X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Dec 2009 22:00:35 -0000 I was unable to reproduce the panic (with 8.0-RELEASE-p1 + Pawel's patch or with my patch). I can split my patch into two Opensolaris changesets - 8986, that is exactly pjd's patch. The other changeset is 7994. BUG ID 6764159: restore_object() makes a call that can block while having a tx open but not yet committed. So to make life easier, I have split this and use 2 patches (that make together my old patch) a) 6764159_restore_blocking.patch b) zfs_recv_E2BIG.patch I have also encountered a problem with recursive zfs snapshots of previsously transferred datasets. 
On many of my systems, zfs snapshot -r tank@xyz just did not work with the following error: zfs snapshot -r failed because filesystem was busy Patch links: http://mfsbsd.vx.sk/patches/6764159_restore_blocking.patch http://mfsbsd.vx.sk/patches/6462803_zfs_snapshot_busy.patch http://people.freebsd.org/~pjd/patches/zfs_recv_E2BIG.patch Related OpenSolaris links: http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6462803 (zfs snapshot busy) http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6764159 (restore_object blocking) http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6801979 (zfs receive E2BIG) I am running all three patches on about 30-40 servers with 8 CPU cores, amd64 and intensive zfs snapshot -r, intense zfs send/receive operations for several days. No panics or other problems by now. Borja Marcos wrote / napísal(a): > On Dec 14, 2009, at 4:47 PM, Pawel Jakub Dawidek wrote: > > >> On Tue, Nov 03, 2009 at 12:08:54PM +0100, Borja Marcos wrote: >> >>> On Oct 29, 2009, at 9:51 PM, Pawel Jakub Dawidek wrote: >>> >>> It's caused a panic for me on 8.0-RC2/amd64. Seems a new problem, >>> never saw a panic in this situation before. >>> >>> How to reproduce: With /usr/src and /usr/obj in a dataset, just >>> >>> cd /usr/src >>> make clean >>> >>> Instant panic, in less than 20 seconds. >>> >>> Trying to get panic information, unfortunately I'm running on VMWare >>> Fussion and the silly thing doesn't offer the equivalent of a serial >>> console. >>> >> Martin, this is the panic report I was refering to. Could you please try >> to reproduce it? Maybe first with my patch to confirm it is reproducible >> and then with your patch to confirm it has no such problem? >> I'd be very grateful if you could do that. I don't want something to go >> into the tree if there might be a problem with the patch. >> > > It was me, not Martin :) > > I will try to reproduce again. By the way, any news about the zfs receive deadlock when accessing the target dataset? > > > > > > Borja. 
> > > From owner-freebsd-fs@FreeBSD.ORG Mon Dec 14 22:43:25 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7442E106568D; Mon, 14 Dec 2009 22:43:25 +0000 (UTC) (envelope-from mattjreimer@gmail.com) Received: from mail-gx0-f218.google.com (mail-gx0-f218.google.com [209.85.217.218]) by mx1.freebsd.org (Postfix) with ESMTP id 1E45B8FC0C; Mon, 14 Dec 2009 22:43:24 +0000 (UTC) Received: by gxk10 with SMTP id 10so3488889gxk.3 for ; Mon, 14 Dec 2009 14:43:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:date:message-id:subject :from:to:cc:content-type; bh=JK2crlEySHKpVtSAqAtn3NbktNUrvtsLoM744dTGqoA=; b=gedjIzB87oUzkyynR6bA08joYnTh/nLT6hZH9tmoztlUhbJf+FOq1MwgKpzf0B7euZ WW6sAIkTufr2ytoETQYWfUALSAZIt0OzqXRyoOX5vpcO3FDr0Z7cJqCWMcEQhAWCpmsm y5N6LhAmiFpdkJ9tuGa4jBh3TxDUSyvOlRSkI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:cc:content-type; b=k5A32xU3jat6mXsdHsB5jZuz7cjXyArs3z4nOrfTBotkM4QH4AyS8/enqv1mgOUkw5 q+27kBY6ZpZkCTEOkMvpgnZzUZp0vjpcYzy88sj9od//CHgNtIwj3wH36OthQHtpVcw9 VdPBtUAv2hbkaruRCtlhYuH7EOHIQgDzuUVmU= MIME-Version: 1.0 Received: by 10.150.117.3 with SMTP id p3mr8238515ybc.287.1260830604382; Mon, 14 Dec 2009 14:43:24 -0800 (PST) Date: Mon, 14 Dec 2009 14:43:24 -0800 Message-ID: From: Matt Reimer To: freebsd-fs Content-Type: multipart/mixed; boundary=000e0cd72a504bd44d047ab800e3 Cc: Pawel Jakub Dawidek Subject: PATCH: more efficient raidz memory usage for (gpt)zfsboot X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Dec 2009 22:43:25 -0000 --000e0cd72a504bd44d047ab800e3 Content-Type: text/plain; charset=ISO-8859-1 Teach the (gpt)zfsboot and zfsloader raidz code to use its buffers more efficiently. Before this patch, in the worst case memory use would increase exponentially on the number of drives in the raidz vdev. Sponsored by: VPOP Technologies, Inc. 
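For anyone who wants to try a patched boot chain like this one, a rough sketch of rebuilding and reinstalling the GPT boot blocks (with a matching world already built), assuming a hypothetical GPT-partitioned disk ad0 whose freebsd-boot partition is index 1; repeat the bootcode step for each disk in the pool:

    # Rebuild the boot code with the patch applied.
    cd /usr/src/sys/boot
    make obj && make && make install

    # Write the protective MBR and the ZFS-aware gptzfsboot into
    # the freebsd-boot partition of the disk.
    gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 ad0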
Matt Reimer --000e0cd72a504bd44d047ab800e3 Content-Type: application/octet-stream; name="raidz-mem.patch" Content-Disposition: attachment; filename="raidz-mem.patch" Content-Transfer-Encoding: base64 X-Attachment-Id: f_g37tvwgu0 LS0tIC9zeXMvY2RkbC9ib290L3pmcy96ZnNzdWJyLmMuT1JJRwkyMDA5LTExLTE0IDA4OjE0OjUx LjAwMDAwMDAwMCAtMDgwMAorKysgL3N5cy9jZGRsL2Jvb3QvemZzL3pmc3N1YnIuYwkyMDA5LTEy LTA3IDE1OjI3OjQ5LjAwMDAwMDAwMCAtMDgwMApAQCAtNDU0LDcgKzQ1NCw3IEBACiAKIHN0YXRp YyB2b2lkCiB2ZGV2X3JhaWR6X3JlY29uc3RydWN0X3BxKHJhaWR6X2NvbF90ICpjb2xzLCBpbnQg bnBhcml0eSwgaW50IGFjb2xzLAotICAgIGludCB4LCBpbnQgeSkKKyAgICBpbnQgeCwgaW50IHks IHZvaWQgKnRlbXBfcCwgdm9pZCAqdGVtcF9xKQogewogCXVpbnQ4X3QgKnAsICpxLCAqcHh5LCAq cXh5LCAqeGQsICp5ZCwgdG1wLCBhLCBiLCBhZXhwLCBiZXhwOwogCXZvaWQgKnBkYXRhLCAqcWRh dGE7CkBAIC00NzgsMTAgKzQ3OCw4IEBACiAJeHNpemUgPSBjb2xzW3hdLnJjX3NpemU7CiAJeXNp emUgPSBjb2xzW3ldLnJjX3NpemU7CiAKLQljb2xzW1ZERVZfUkFJRFpfUF0ucmNfZGF0YSA9Ci0J CXpmc19hbGxvY190ZW1wKGNvbHNbVkRFVl9SQUlEWl9QXS5yY19zaXplKTsKLQljb2xzW1ZERVZf UkFJRFpfUV0ucmNfZGF0YSA9Ci0JCXpmc19hbGxvY190ZW1wKGNvbHNbVkRFVl9SQUlEWl9RXS5y Y19zaXplKTsKKwljb2xzW1ZERVZfUkFJRFpfUF0ucmNfZGF0YSA9IHRlbXBfcDsKKwljb2xzW1ZE RVZfUkFJRFpfUV0ucmNfZGF0YSA9IHRlbXBfcTsKIAljb2xzW3hdLnJjX3NpemUgPSAwOwogCWNv bHNbeV0ucmNfc2l6ZSA9IDA7CiAKQEAgLTU1MSw5ICs1NDksMTIgQEAKIAl1aW50NjRfdCBmID0g YiAlIGRjb2xzOwogCXVpbnQ2NF90IG8gPSAoYiAvIGRjb2xzKSA8PCB1bml0X3NoaWZ0OwogCXVp bnQ2NF90IHEsIHIsIGNvZmY7Ci0JaW50IGMsIGMxLCBiYywgY29sLCBhY29scywgZGV2aWR4LCBh c2l6ZSwgbjsKKwlpbnQgYywgYzEsIGJjLCBjb2wsIGFjb2xzLCBkZXZpZHgsIGFzaXplLCBuLCBt YXhfcmNfc2l6ZTsKIAlzdGF0aWMgcmFpZHpfY29sX3QgY29sc1sxNl07CiAJcmFpZHpfY29sX3Qg KnJjLCAqcmMxOworCXZvaWQgKm9yaWcsICpvcmlnMSwgKnRlbXBfcCwgKnRlbXBfcTsKKworCW9y aWcgPSBvcmlnMSA9IHRlbXBfcCA9IHRlbXBfcSA9IE5VTEw7CiAKIAlxID0gcyAvIChkY29scyAt IG5wYXJpdHkpOwogCXIgPSBzIC0gcSAqIChkY29scyAtIG5wYXJpdHkpOwpAQCAtNTYxLDYgKzU2 Miw3IEBACiAKIAlhY29scyA9IChxID09IDAgPyBiYyA6IGRjb2xzKTsKIAlhc2l6ZSA9IDA7CisJ bWF4X3JjX3NpemUgPSAwOwogCQogCWZvciAoYyA9IDA7IGMgPCBhY29sczsgYysrKSB7CiAJCWNv bCA9IGYgKyBjOwpAQCAtNTc3LDYgKzU3OSw4IEBACiAJCWNvbHNbY10ucmNfdHJpZWQgPSAwOwog CQljb2xzW2NdLnJjX3NraXBwZWQgPSAwOwogCQlhc2l6ZSArPSBjb2xzW2NdLnJjX3NpemU7CisJ CWlmIChjb2xzW2NdLnJjX3NpemUgPiBtYXhfcmNfc2l6ZSkKKwkJCW1heF9yY19zaXplID0gY29s c1tjXS5yY19zaXplOwogCX0KIAogCWFzaXplID0gcm91bmR1cChhc2l6ZSwgKG5wYXJpdHkgKyAx KSA8PCB1bml0X3NoaWZ0KTsKQEAgLTc3Nyw4ICs3ODEsMTMgQEAKIAkJCS8vQVNTRVJUKGMgIT0g YWNvbHMpOwogCQkJLy9BU1NFUlQoIXJjLT5yY19za2lwcGVkIHx8IHJjLT5yY19lcnJvciA9PSBF TlhJTyB8fCByYy0+cmNfZXJyb3IgPT0gRVNUQUxFKTsKIAorCQkJaWYgKCF0ZW1wX3ApCisJCQkJ dGVtcF9wID0gemZzX2FsbG9jX3RlbXAobWF4X3JjX3NpemUpOworCQkJaWYgKCF0ZW1wX3EpCisJ CQkJdGVtcF9xID0gemZzX2FsbG9jX3RlbXAobWF4X3JjX3NpemUpOworCiAJCQl2ZGV2X3JhaWR6 X3JlY29uc3RydWN0X3BxKGNvbHMsIG5wYXJpdHksIGFjb2xzLAotCQkJICAgIGMxLCBjKTsKKwkJ CSAgICBjMSwgYywgdGVtcF9wLCB0ZW1wX3EpOwogCiAJCQlpZiAoemlvX2NoZWNrc3VtX2Vycm9y KGJwLCBidWYpID09IDApCiAJCQkJcmV0dXJuICgwKTsKQEAgLTg0NSwxOCArODU0LDEyIEBACiAJ CXJldHVybiAoRUlPKTsKIAl9CiAKLQlhc2l6ZSA9IDA7Ci0JZm9yIChjID0gMDsgYyA8IGFjb2xz OyBjKyspIHsKLQkJcmMgPSAmY29sc1tjXTsKLQkJaWYgKHJjLT5yY19zaXplID4gYXNpemUpCi0J CQlhc2l6ZSA9IHJjLT5yY19zaXplOwotCX0KIAlpZiAoY29sc1tWREVWX1JBSURaX1BdLnJjX2Vy cm9yID09IDApIHsKIAkJLyoKIAkJICogQXR0ZW1wdCB0byByZWNvbnN0cnVjdCB0aGUgZGF0YSBm cm9tIHBhcml0eSBQLgogCQkgKi8KLQkJdm9pZCAqb3JpZzsKLQkJb3JpZyA9IHpmc19hbGxvY190 ZW1wKGFzaXplKTsKKwkJaWYgKCFvcmlnKQorCQkJb3JpZyA9IHpmc19hbGxvY190ZW1wKG1heF9y Y19zaXplKTsKIAkJZm9yIChjID0gbnBhcml0eTsgYyA8IGFjb2xzOyBjKyspIHsKIAkJCXJjID0g JmNvbHNbY107CiAKQEAgLTg3NCw4ICs4NzcsOCBAQAogCQkvKgogCQkgKiBBdHRlbXB0IHRvIHJl 
Y29uc3RydWN0IHRoZSBkYXRhIGZyb20gcGFyaXR5IFEuCiAJCSAqLwotCQl2b2lkICpvcmlnOwot CQlvcmlnID0gemZzX2FsbG9jX3RlbXAoYXNpemUpOworCQlpZiAoIW9yaWcpCisJCQlvcmlnID0g emZzX2FsbG9jX3RlbXAobWF4X3JjX3NpemUpOwogCQlmb3IgKGMgPSBucGFyaXR5OyBjIDwgYWNv bHM7IGMrKykgewogCQkJcmMgPSAmY29sc1tjXTsKIApAQCAtODk1LDkgKzg5OCwxNCBAQAogCQkv KgogCQkgKiBBdHRlbXB0IHRvIHJlY29uc3RydWN0IHRoZSBkYXRhIGZyb20gYm90aCBQIGFuZCBR LgogCQkgKi8KLQkJdm9pZCAqb3JpZywgKm9yaWcxOwotCQlvcmlnID0gemZzX2FsbG9jX3RlbXAo YXNpemUpOwotCQlvcmlnMSA9IHpmc19hbGxvY190ZW1wKGFzaXplKTsKKwkJaWYgKCFvcmlnKQor CQkJb3JpZyA9IHpmc19hbGxvY190ZW1wKG1heF9yY19zaXplKTsKKwkJaWYgKCFvcmlnMSkKKwkJ CW9yaWcxID0gemZzX2FsbG9jX3RlbXAobWF4X3JjX3NpemUpOworCQlpZiAoIXRlbXBfcCkKKwkJ CXRlbXBfcCA9IHpmc19hbGxvY190ZW1wKG1heF9yY19zaXplKTsKKwkJaWYgKCF0ZW1wX3EpCisJ CQl0ZW1wX3EgPSB6ZnNfYWxsb2NfdGVtcChtYXhfcmNfc2l6ZSk7CiAJCWZvciAoYyA9IG5wYXJp dHk7IGMgPCBhY29scyAtIDE7IGMrKykgewogCQkJcmMgPSAmY29sc1tjXTsKIApAQCAtOTA5LDcg KzkxNyw3IEBACiAJCQkJbWVtY3B5KG9yaWcxLCByYzEtPnJjX2RhdGEsIHJjMS0+cmNfc2l6ZSk7 CiAKIAkJCQl2ZGV2X3JhaWR6X3JlY29uc3RydWN0X3BxKGNvbHMsIG5wYXJpdHksCi0JCQkJICAg IGFjb2xzLCBjLCBjMSk7CisJCQkJICAgIGFjb2xzLCBjLCBjMSwgdGVtcF9wLCB0ZW1wX3EpOwog CiAJCQkJaWYgKHppb19jaGVja3N1bV9lcnJvcihicCwgYnVmKSA9PSAwKQogCQkJCQlyZXR1cm4g KDApOwo= --000e0cd72a504bd44d047ab800e3-- From owner-freebsd-fs@FreeBSD.ORG Mon Dec 14 22:46:57 2009 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7EB061065672; Mon, 14 Dec 2009 22:46:57 +0000 (UTC) (envelope-from mattjreimer@gmail.com) Received: from mail-yw0-f172.google.com (mail-yw0-f172.google.com [209.85.211.172]) by mx1.freebsd.org (Postfix) with ESMTP id 1CDCF8FC08; Mon, 14 Dec 2009 22:46:56 +0000 (UTC) Received: by ywh2 with SMTP id 2so3580630ywh.27 for ; Mon, 14 Dec 2009 14:46:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:date:message-id:subject :from:to:cc:content-type; bh=wzi18B0FmDr4huMC3VxRporfp1CrVD/VWgUlO5Q/wZk=; b=fP1Xa1hYWzBDeqsXWJ/Q8RlStgGV8jQBGl/cOb3CZ4wz1XqmgikKcn+Vy0nwACeUwq oZwxbZn5FR2Tc80NBqx6MGQsTll4aWK00yKK0NopjUqRjOvdwZqDF+30sZbAETls06Pz iDCJZ5C8DOfjStARjqSKWpewahARgE6302E7g= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:cc:content-type; b=bMCs8MdtxaocoSFfPJe3WjONaeLX6TjEXUx8o7yyQI102NWOWzHskQUwCixPDAJEO5 nvWwZkhf7xrBA6PxTKaad67Z7em+Vjnp9i7kZkMAjfJU7UmScx9QzHz3WBCkBXa/R/sg dXUAN9EkYIFUpo75mnw+aJhWbxE+lbbFuY7c0= MIME-Version: 1.0 Received: by 10.151.131.2 with SMTP id i2mr8359508ybn.56.1260830816488; Mon, 14 Dec 2009 14:46:56 -0800 (PST) Date: Mon, 14 Dec 2009 14:46:56 -0800 Message-ID: From: Matt Reimer To: fs@freebsd.org Content-Type: multipart/mixed; boundary=00504502ba97f04f22047ab80c11 Cc: Pawel Jakub Dawidek Subject: PATCH: teach (gpt)zfsboot, zfsloader to discern vdev status correctly X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Dec 2009 22:46:57 -0000 --00504502ba97f04f22047ab80c11 Content-Type: text/plain; charset=ISO-8859-1 Instead of assuming all vdevs are healthy, check the newest vdev label for each vdev's status. Booting from a degraded vdev should now be more robust. Sponsored by: VPOP Technologies, Inc. Matt Reimer (Note that much of this patch is merely whitespace change due to a block needing to be reindented. 
I've attached correct-status-nowhitespace.patch to make review easier.) --00504502ba97f04f22047ab80c11 Content-Type: application/octet-stream; name="correct-status.patch" Content-Disposition: attachment; filename="correct-status.patch" Content-Transfer-Encoding: base64 X-Attachment-Id: f_g37u1j5r0 LS0tIC9zeXMvY2RkbC9ib290L3pmcy96ZnNpbXBsLmguT1JJRwkyMDA5LTExLTIxIDA3OjAyOjM1 LjAwMDAwMDAwMCAtMDgwMAorKysgL3N5cy9jZGRsL2Jvb3QvemZzL3pmc2ltcGwuaAkyMDA5LTEy LTA3IDEzOjUwOjI2LjAwMDAwMDAwMCAtMDgwMApAQCAtNTQ2LDcgKzU0Niw2IEBACiAjZGVmaW5l CVpQT09MX0NPTkZJR19EVEwJCSJEVEwiCiAjZGVmaW5lCVpQT09MX0NPTkZJR19TVEFUUwkJInN0 YXRzIgogI2RlZmluZQlaUE9PTF9DT05GSUdfV0hPTEVfRElTSwkJIndob2xlX2Rpc2siCi0jZGVm aW5lCVpQT09MX0NPTkZJR19PRkZMSU5FCQkib2ZmbGluZSIKICNkZWZpbmUJWlBPT0xfQ09ORklH X0VSUkNPVU5UCQkiZXJyb3JfY291bnQiCiAjZGVmaW5lCVpQT09MX0NPTkZJR19OT1RfUFJFU0VO VAkibm90X3ByZXNlbnQiCiAjZGVmaW5lCVpQT09MX0NPTkZJR19TUEFSRVMJCSJzcGFyZXMiCkBA IC01NTYsNiArNTU1LDE2IEBACiAjZGVmaW5lCVpQT09MX0NPTkZJR19IT1NUTkFNRQkJImhvc3Ru YW1lIgogI2RlZmluZQlaUE9PTF9DT05GSUdfVElNRVNUQU1QCQkidGltZXN0YW1wIiAvKiBub3Qg c3RvcmVkIG9uIGRpc2sgKi8KIAorLyoKKyAqIFRoZSBwZXJzaXN0ZW50IHZkZXYgc3RhdGUgaXMg c3RvcmVkIGFzIHNlcGFyYXRlIHZhbHVlcyByYXRoZXIgdGhhbiBhIHNpbmdsZQorICogJ3ZkZXZf c3RhdGUnIGVudHJ5LiAgVGhpcyBpcyBiZWNhdXNlIGEgZGV2aWNlIGNhbiBiZSBpbiBtdWx0aXBs ZSBzdGF0ZXMsIHN1Y2gKKyAqIGFzIG9mZmxpbmUgYW5kIGRlZ3JhZGVkLgorICovCisjZGVmaW5l IFpQT09MX0NPTkZJR19PRkZMSU5FICAgICAgICAgICAgIm9mZmxpbmUiCisjZGVmaW5lIFpQT09M X0NPTkZJR19GQVVMVEVEICAgICAgICAgICAgImZhdWx0ZWQiCisjZGVmaW5lIFpQT09MX0NPTkZJ R19ERUdSQURFRCAgICAgICAgICAgImRlZ3JhZGVkIgorI2RlZmluZSBaUE9PTF9DT05GSUdfUkVN T1ZFRCAgICAgICAgICAgICJyZW1vdmVkIgorCiAjZGVmaW5lCVZERVZfVFlQRV9ST09UCQkJInJv b3QiCiAjZGVmaW5lCVZERVZfVFlQRV9NSVJST1IJCSJtaXJyb3IiCiAjZGVmaW5lCVZERVZfVFlQ RV9SRVBMQUNJTkcJCSJyZXBsYWNpbmciCkBAIC01ODgsNyArNTk3LDkgQEAKIAlWREVWX1NUQVRF X1VOS05PV04gPSAwLAkvKiBVbmluaXRpYWxpemVkIHZkZXYJCQkqLwogCVZERVZfU1RBVEVfQ0xP U0VELAkvKiBOb3QgY3VycmVudGx5IG9wZW4JCQkqLwogCVZERVZfU1RBVEVfT0ZGTElORSwJLyog Tm90IGFsbG93ZWQgdG8gb3BlbgkJCSovCisgICAgICAgIFZERVZfU1RBVEVfUkVNT1ZFRCwJLyog RXhwbGljaXRseSByZW1vdmVkIGZyb20gc3lzdGVtCSovCiAJVkRFVl9TVEFURV9DQU5UX09QRU4s CS8qIFRyaWVkIHRvIG9wZW4sIGJ1dCBmYWlsZWQJCSovCisgICAgICAgIFZERVZfU1RBVEVfRkFV TFRFRCwJLyogRXh0ZXJuYWwgcmVxdWVzdCB0byBmYXVsdCBkZXZpY2UJKi8KIAlWREVWX1NUQVRF X0RFR1JBREVELAkvKiBSZXBsaWNhdGVkIHZkZXYgd2l0aCB1bmhlYWx0aHkga2lkcwkqLwogCVZE RVZfU1RBVEVfSEVBTFRIWQkvKiBQcmVzdW1lZCBnb29kCQkJKi8KIH0gdmRldl9zdGF0ZV90Owot LS0gL3N5cy9ib290L3pmcy96ZnNpbXBsLmMuT1JJRwkyMDA5LTExLTIxIDA3OjAyOjM1LjAwMDAw MDAwMCAtMDgwMAorKysgL3N5cy9ib290L3pmcy96ZnNpbXBsLmMJMjAwOS0xMi0wNyAxNDozNjoy MC4wMDAwMDAwMDAgLTA4MDAKQEAgLTQwNCw3ICs0MDQsNyBAQAogfQogCiBzdGF0aWMgaW50Ci12 ZGV2X2luaXRfZnJvbV9udmxpc3QoY29uc3QgdW5zaWduZWQgY2hhciAqbnZsaXN0LCB2ZGV2X3Qg Kip2ZGV2cCkKK3ZkZXZfaW5pdF9mcm9tX252bGlzdChjb25zdCB1bnNpZ25lZCBjaGFyICpudmxp c3QsIHZkZXZfdCAqKnZkZXZwLCBpbnQgaXNfbmV3ZXIpCiB7CiAJaW50IHJjOwogCXVpbnQ2NF90 IGd1aWQsIGlkLCBhc2hpZnQsIG5wYXJpdHk7CkBAIC00MTIsNyArNDEyLDggQEAKIAljb25zdCBj aGFyICpwYXRoOwogCXZkZXZfdCAqdmRldiwgKmtpZDsKIAljb25zdCB1bnNpZ25lZCBjaGFyICpr aWRzOwotCWludCBua2lkcywgaTsKKwlpbnQgbmtpZHMsIGksIGlzX25ldzsKKwl1aW50NjRfdCBp c19vZmZsaW5lLCBpc19mYXVsdGVkLCBpc19kZWdyYWRlZCwgaXNfcmVtb3ZlZDsKIAogCWlmIChu dmxpc3RfZmluZChudmxpc3QsIFpQT09MX0NPTkZJR19HVUlELAogCQkJREFUQV9UWVBFX1VJTlQ2 NCwgMCwgJmd1aWQpCkBAIC00MjQsMTcgKzQyNSw2IEBACiAJCXJldHVybiAoRU5PRU5UKTsKIAl9 CiAKLQkvKgotCSAqIEFzc3VtZSB0aGF0IGlmIHdlJ3ZlIHNlZW4gdGhpcyB2ZGV2IHRyZWUgYmVm b3JlLCB0aGlzIG9uZQotCSAqIHdpbGwgYmUgaWRlbnRpY2FsLgotCSAqLwotCXZkZXYgPSB2ZGV2 
X2ZpbmQoZ3VpZCk7Ci0JaWYgKHZkZXYpIHsKLQkJaWYgKHZkZXZwKQotCQkJKnZkZXZwID0gdmRl djsKLQkJcmV0dXJuICgwKTsKLQl9Ci0KIAlpZiAoc3RyY21wKHR5cGUsIFZERVZfVFlQRV9NSVJS T1IpCiAJICAgICYmIHN0cmNtcCh0eXBlLCBWREVWX1RZUEVfRElTSykKIAkgICAgJiYgc3RyY21w KHR5cGUsIFZERVZfVFlQRV9SQUlEWikpIHsKQEAgLTQ0Miw0NCArNDMyLDk1IEBACiAJCXJldHVy biAoRUlPKTsKIAl9CiAKLQlpZiAoIXN0cmNtcCh0eXBlLCBWREVWX1RZUEVfTUlSUk9SKSkKLQkJ dmRldiA9IHZkZXZfY3JlYXRlKGd1aWQsIHZkZXZfbWlycm9yX3JlYWQpOwotCWVsc2UgaWYgKCFz dHJjbXAodHlwZSwgVkRFVl9UWVBFX1JBSURaKSkKLQkJdmRldiA9IHZkZXZfY3JlYXRlKGd1aWQs IHZkZXZfcmFpZHpfcmVhZCk7Ci0JZWxzZQotCQl2ZGV2ID0gdmRldl9jcmVhdGUoZ3VpZCwgdmRl dl9kaXNrX3JlYWQpOworCWlzX29mZmxpbmUgPSBpc19yZW1vdmVkID0gaXNfZmF1bHRlZCA9IGlz X2RlZ3JhZGVkID0gMDsKKworCW52bGlzdF9maW5kKG52bGlzdCwgWlBPT0xfQ09ORklHX09GRkxJ TkUsIERBVEFfVFlQRV9VSU5UNjQsIDAsCisJCQkmaXNfb2ZmbGluZSk7CisJbnZsaXN0X2ZpbmQo bnZsaXN0LCBaUE9PTF9DT05GSUdfUkVNT1ZFRCwgREFUQV9UWVBFX1VJTlQ2NCwgMCwKKwkJCSZp c19yZW1vdmVkKTsKKwludmxpc3RfZmluZChudmxpc3QsIFpQT09MX0NPTkZJR19GQVVMVEVELCBE QVRBX1RZUEVfVUlOVDY0LCAwLAorCQkJJmlzX2ZhdWx0ZWQpOworCW52bGlzdF9maW5kKG52bGlz dCwgWlBPT0xfQ09ORklHX0RFR1JBREVELCBEQVRBX1RZUEVfVUlOVDY0LCAwLAorCQkJJmlzX2Rl Z3JhZGVkKTsKKworCXZkZXYgPSB2ZGV2X2ZpbmQoZ3VpZCk7CisJaWYgKCF2ZGV2KSB7CisKKwkJ aXNfbmV3ID0gMTsKKworCQlpZiAoIXN0cmNtcCh0eXBlLCBWREVWX1RZUEVfTUlSUk9SKSkKKwkJ CXZkZXYgPSB2ZGV2X2NyZWF0ZShndWlkLCB2ZGV2X21pcnJvcl9yZWFkKTsKKwkJZWxzZSBpZiAo IXN0cmNtcCh0eXBlLCBWREVWX1RZUEVfUkFJRFopKQorCQkJdmRldiA9IHZkZXZfY3JlYXRlKGd1 aWQsIHZkZXZfcmFpZHpfcmVhZCk7CisJCWVsc2UKKwkJCXZkZXYgPSB2ZGV2X2NyZWF0ZShndWlk LCB2ZGV2X2Rpc2tfcmVhZCk7CisKKwkJdmRldi0+dl9pZCA9IGlkOworCQlpZiAobnZsaXN0X2Zp bmQobnZsaXN0LCBaUE9PTF9DT05GSUdfQVNISUZULAorCQkJREFUQV9UWVBFX1VJTlQ2NCwgMCwg JmFzaGlmdCkgPT0gMCkKKwkJCXZkZXYtPnZfYXNoaWZ0ID0gYXNoaWZ0OworCQllbHNlCisJCQl2 ZGV2LT52X2FzaGlmdCA9IDA7CisJCWlmIChudmxpc3RfZmluZChudmxpc3QsIFpQT09MX0NPTkZJ R19OUEFSSVRZLAorCQkJREFUQV9UWVBFX1VJTlQ2NCwgMCwgJm5wYXJpdHkpID09IDApCisJCQl2 ZGV2LT52X25wYXJpdHkgPSBucGFyaXR5OworCQllbHNlCisJCQl2ZGV2LT52X25wYXJpdHkgPSAw OworCQlpZiAobnZsaXN0X2ZpbmQobnZsaXN0LCBaUE9PTF9DT05GSUdfUEFUSCwKKwkJCQlEQVRB X1RZUEVfU1RSSU5HLCAwLCAmcGF0aCkgPT0gMCkgeworCQkJaWYgKHN0cmxlbihwYXRoKSA+IDUK KwkJCSAgICAmJiBwYXRoWzBdID09ICcvJworCQkJICAgICYmIHBhdGhbMV0gPT0gJ2QnCisJCQkg ICAgJiYgcGF0aFsyXSA9PSAnZScKKwkJCSAgICAmJiBwYXRoWzNdID09ICd2JworCQkJICAgICYm IHBhdGhbNF0gPT0gJy8nKQorCQkJCXBhdGggKz0gNTsKKwkJCXZkZXYtPnZfbmFtZSA9IHN0cmR1 cChwYXRoKTsKKwkJfSBlbHNlIHsKKwkJCWlmICghc3RyY21wKHR5cGUsICJyYWlkeiIpKSB7CisJ CQkJaWYgKHZkZXYtPnZfbnBhcml0eSA9PSAxKQorCQkJCQl2ZGV2LT52X25hbWUgPSAicmFpZHox IjsKKwkJCQllbHNlCisJCQkJCXZkZXYtPnZfbmFtZSA9ICJyYWlkejIiOworCQkJfSBlbHNlIHsK KwkJCQl2ZGV2LT52X25hbWUgPSBzdHJkdXAodHlwZSk7CisJCQl9CisJCX0KKworCQlpZiAoaXNf b2ZmbGluZSkKKwkJCXZkZXYtPnZfc3RhdGUgPSBWREVWX1NUQVRFX09GRkxJTkU7CisJCWVsc2Ug aWYgKGlzX3JlbW92ZWQpCisJCQl2ZGV2LT52X3N0YXRlID0gVkRFVl9TVEFURV9SRU1PVkVEOwor CQllbHNlIGlmIChpc19mYXVsdGVkKQorCQkJdmRldi0+dl9zdGF0ZSA9IFZERVZfU1RBVEVfRkFV TFRFRDsKKwkJZWxzZSBpZiAoaXNfZGVncmFkZWQpCisJCQl2ZGV2LT52X3N0YXRlID0gVkRFVl9T VEFURV9ERUdSQURFRDsKKwkJZWxzZQorCQkJdmRldi0+dl9zdGF0ZSA9IFZERVZfU1RBVEVfSEVB TFRIWTsKIAotCXZkZXYtPnZfaWQgPSBpZDsKLQlpZiAobnZsaXN0X2ZpbmQobnZsaXN0LCBaUE9P TF9DT05GSUdfQVNISUZULAotCQlEQVRBX1RZUEVfVUlOVDY0LCAwLCAmYXNoaWZ0KSA9PSAwKQot CQl2ZGV2LT52X2FzaGlmdCA9IGFzaGlmdDsKLQllbHNlCi0JCXZkZXYtPnZfYXNoaWZ0ID0gMDsK LQlpZiAobnZsaXN0X2ZpbmQobnZsaXN0LCBaUE9PTF9DT05GSUdfTlBBUklUWSwKLQkJREFUQV9U WVBFX1VJTlQ2NCwgMCwgJm5wYXJpdHkpID09IDApCi0JCXZkZXYtPnZfbnBhcml0eSA9IG5wYXJp dHk7Ci0JZWxzZQotCQl2ZGV2LT52X25wYXJpdHkgPSAwOwotCWlmIChudmxpc3RfZmluZChudmxp 
c3QsIFpQT09MX0NPTkZJR19QQVRILAotCQkJREFUQV9UWVBFX1NUUklORywgMCwgJnBhdGgpID09 IDApIHsKLQkJaWYgKHN0cmxlbihwYXRoKSA+IDUKLQkJICAgICYmIHBhdGhbMF0gPT0gJy8nCi0J CSAgICAmJiBwYXRoWzFdID09ICdkJwotCQkgICAgJiYgcGF0aFsyXSA9PSAnZScKLQkJICAgICYm IHBhdGhbM10gPT0gJ3YnCi0JCSAgICAmJiBwYXRoWzRdID09ICcvJykKLQkJCXBhdGggKz0gNTsK LQkJdmRldi0+dl9uYW1lID0gc3RyZHVwKHBhdGgpOwogCX0gZWxzZSB7Ci0JCWlmICghc3RyY21w KHR5cGUsICJyYWlkeiIpKSB7Ci0JCQlpZiAodmRldi0+dl9ucGFyaXR5ID09IDEpCi0JCQkJdmRl di0+dl9uYW1lID0gInJhaWR6MSI7CisKKwkJaXNfbmV3ID0gMDsKKworCQlpZiAoaXNfbmV3ZXIp IHsKKworCQkJLyogV2UndmUgYWxyZWFkeSBzZWVuIHRoaXMgdmRldiwgYnV0IGZyb20gYW4gb2xk ZXIKKwkJCSAqIHZkZXYgbGFiZWwsIHNvIGxldCdzIHJlZnJlc2ggaXRzIHN0YXRlIGZyb20gdGhl CisJCQkgKiBuZXdlciBsYWJlbC4gKi8KKworCQkJaWYgKGlzX29mZmxpbmUpCisJCQkJdmRldi0+ dl9zdGF0ZSA9IFZERVZfU1RBVEVfT0ZGTElORTsKKwkJCWVsc2UgaWYgKGlzX3JlbW92ZWQpCisJ CQkJdmRldi0+dl9zdGF0ZSA9IFZERVZfU1RBVEVfUkVNT1ZFRDsKKwkJCWVsc2UgaWYgKGlzX2Zh dWx0ZWQpCisJCQkJdmRldi0+dl9zdGF0ZSA9IFZERVZfU1RBVEVfRkFVTFRFRDsKKwkJCWVsc2Ug aWYgKGlzX2RlZ3JhZGVkKQorCQkJCXZkZXYtPnZfc3RhdGUgPSBWREVWX1NUQVRFX0RFR1JBREVE OwogCQkJZWxzZQotCQkJCXZkZXYtPnZfbmFtZSA9ICJyYWlkejIiOwotCQl9IGVsc2UgewotCQkJ dmRldi0+dl9uYW1lID0gc3RyZHVwKHR5cGUpOworCQkJCXZkZXYtPnZfc3RhdGUgPSBWREVWX1NU QVRFX0hFQUxUSFk7CiAJCX0KIAl9CisKIAlyYyA9IG52bGlzdF9maW5kKG52bGlzdCwgWlBPT0xf Q09ORklHX0NISUxEUkVOLAogCQkJIERBVEFfVFlQRV9OVkxJU1RfQVJSQVksICZua2lkcywgJmtp ZHMpOwogCS8qCkBAIC00ODgsMTAgKzUyOSwxMiBAQAogCWlmIChyYyA9PSAwKSB7CiAJCXZkZXYt PnZfbmNoaWxkcmVuID0gbmtpZHM7CiAJCWZvciAoaSA9IDA7IGkgPCBua2lkczsgaSsrKSB7Ci0J CQlyYyA9IHZkZXZfaW5pdF9mcm9tX252bGlzdChraWRzLCAma2lkKTsKKwkJCXJjID0gdmRldl9p bml0X2Zyb21fbnZsaXN0KGtpZHMsICZraWQsIGlzX25ld2VyKTsKIAkJCWlmIChyYykKIAkJCQly ZXR1cm4gKHJjKTsKLQkJCVNUQUlMUV9JTlNFUlRfVEFJTCgmdmRldi0+dl9jaGlsZHJlbiwga2lk LCB2X2NoaWxkbGluayk7CisJCQlpZiAoaXNfbmV3KQorCQkJCVNUQUlMUV9JTlNFUlRfVEFJTCgm dmRldi0+dl9jaGlsZHJlbiwga2lkLAorCQkJCQkJICAgdl9jaGlsZGxpbmspOwogCQkJa2lkcyA9 IG52bGlzdF9uZXh0KGtpZHMpOwogCQl9CiAJfSBlbHNlIHsKQEAgLTU5Myw3ICs2MzYsOSBAQAog CQkiVU5LTk9XTiIsCiAJCSJDTE9TRUQiLAogCQkiT0ZGTElORSIsCisJCSJSRU1PVkVEIiwKIAkJ IkNBTlRfT1BFTiIsCisJCSJGQVVMVEVEIiwKIAkJIkRFR1JBREVEIiwKIAkJIk9OTElORSIKIAl9 OwpAQCAtNzExLDcgKzc1Niw3IEBACiAJdWludDY0X3QgcG9vbF90eGcsIHBvb2xfZ3VpZDsKIAlj b25zdCBjaGFyICpwb29sX25hbWU7CiAJY29uc3QgdW5zaWduZWQgY2hhciAqdmRldnM7Ci0JaW50 IGksIHJjOworCWludCBpLCByYywgaXNfbmV3ZXI7CiAJY2hhciB1cGJ1ZlsxMDI0XTsKIAljb25z dCBzdHJ1Y3QgdWJlcmJsb2NrICp1cDsKIApAQCAtNzkzLDEyICs4MzgsMTUgQEAKIAkJc3BhID0g c3BhX2NyZWF0ZShwb29sX2d1aWQpOwogCQlzcGEtPnNwYV9uYW1lID0gc3RyZHVwKHBvb2xfbmFt ZSk7CiAJfQotCWlmIChwb29sX3R4ZyA+IHNwYS0+c3BhX3R4ZykKKwlpZiAocG9vbF90eGcgPiBz cGEtPnNwYV90eGcpIHsKIAkJc3BhLT5zcGFfdHhnID0gcG9vbF90eGc7CisJCWlzX25ld2VyID0g MTsKKwl9IGVsc2UKKwkJaXNfbmV3ZXIgPSAwOwogCiAJLyoKIAkgKiBHZXQgdGhlIHZkZXYgdHJl ZSBhbmQgY3JlYXRlIG91ciBpbi1jb3JlIGNvcHkgb2YgaXQuCi0JICogSWYgd2UgYWxyZWFkeSBo YXZlIGEgaGVhbHRoeSB2ZGV2IHdpdGggdGhpcyBndWlkLCB0aGlzIG11c3QKKwkgKiBJZiB3ZSBh bHJlYWR5IGhhdmUgYSB2ZGV2IHdpdGggdGhpcyBndWlkLCB0aGlzIG11c3QKIAkgKiBiZSBzb21l IGtpbmQgb2YgYWxpYXMgKG92ZXJsYXBwaW5nIHNsaWNlcywgZGFuZ2Vyb3VzbHkgZGVkaWNhdGVk CiAJICogZGlza3MgZXRjKS4KIAkgKi8KQEAgLTgwOCwxNiArODU2LDE2IEBACiAJCXJldHVybiAo RUlPKTsKIAl9CiAJdmRldiA9IHZkZXZfZmluZChndWlkKTsKLQlpZiAodmRldiAmJiB2ZGV2LT52 X3N0YXRlID09IFZERVZfU1RBVEVfSEVBTFRIWSkgeworCWlmICh2ZGV2ICYmIHZkZXYtPnZfcGh5 c19yZWFkKQkvKiBIYXMgdGhpcyB2ZGV2IGFscmVhZHkgYmVlbiBpbml0ZWQ/ICovCiAJCXJldHVy biAoRUlPKTsKLQl9CiAKIAlpZiAobnZsaXN0X2ZpbmQobnZsaXN0LAogCQkJWlBPT0xfQ09ORklH X1ZERVZfVFJFRSwKIAkJCURBVEFfVFlQRV9OVkxJU1QsIDAsICZ2ZGV2cykpIHsKIAkJcmV0dXJu 
[base64-encoded patch attachment data omitted]
--00504502ba97f04f22047ab80c11 Content-Type: application/octet-stream; name="correct-status-nowhitespace.patch" Content-Disposition: attachment; filename="correct-status-nowhitespace.patch" Content-Transfer-Encoding: base64 X-Attachment-Id: f_g37u3wsw1
[base64-encoded attachment data omitted: a unified diff against zfs/zfsimpl.c]
--00504502ba97f04f22047ab80c11--
From owner-freebsd-fs@FreeBSD.ORG Mon Dec 14 23:27:45 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DD56F106568D; Mon, 14 Dec 2009 23:27:45 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id B515D8FC14; Mon, 14 Dec 2009 23:27:45 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id nBENRjYD023496; Mon, 14 Dec 2009 23:27:45 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id nBENRjnC023492; Mon, 14 Dec 2009 23:27:45 GMT (envelope-from linimon) Date: Mon, 14 Dec 2009 23:27:45 GMT Message-Id: <200912142327.nBENRjnC023492@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/141463: [nfs] [panic] Frequent kernel panics after upgrade from 7.2-STABLE to 8-STABLE [regression] X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 14 Dec 2009 23:27:46 -0000 Old Synopsis: Frequent kernel panics after upgrade from 7.2-STABLE to 8-STABLE New Synopsis: [nfs] [panic] Frequent kernel panics after upgrade from 7.2-STABLE to 8-STABLE [regression] Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Mon Dec 14 23:26:42 UTC 2009 Responsible-Changed-Why: possibly nfs-related?
http://www.freebsd.org/cgi/query-pr.cgi?pr=141463 From owner-freebsd-fs@FreeBSD.ORG Tue Dec 15 01:11:29 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 29C381065670 for ; Tue, 15 Dec 2009 01:11:29 +0000 (UTC) (envelope-from zbeeble@gmail.com) Received: from ey-out-2122.google.com (ey-out-2122.google.com [74.125.78.24]) by mx1.freebsd.org (Postfix) with ESMTP id B5B0E8FC08 for ; Tue, 15 Dec 2009 01:11:28 +0000 (UTC) Received: by ey-out-2122.google.com with SMTP id 4so72170eyf.9 for ; Mon, 14 Dec 2009 17:11:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:date:message-id:subject :from:to:content-type; bh=qugF5TDDTtogRQJq29D8l+QSCM3ww/rQXTGD8oYyO8Q=; b=Z0WVCqe3CWsmKkjUaKYYsRmRnNHG9vHiT7gtPm08HhDGnQC44BZlWoY4x7bEZfVaSZ ZvgCzgVSLyaA+pJ9B2U7UPC2zJMNouQT/WIwjfdUhe32vvuu+Fbw+GIf0xSWVaUazqUH daRYdpmqncZB7N7U+2pzGN/aSlE9dGuAP9DG0= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type; b=awsJGp1CavLjBj7xRh2BO+Nr8i2E+ToJzZfiFiTCb3SSQ7lDvhr/U1Yu610uxp9Vxw qVxxFP2s/NXgLcP+s6Rg9zQfR0P+rBhyLLSY17BBVXZtDdDWJ54IglXicarnSqc90a3o PPVyMxrFyEYh9pouIjpIm9cS5yDoJVNLbrHbw= MIME-Version: 1.0 Received: by 10.216.90.1 with SMTP id d1mr2513907wef.136.1260839487614; Mon, 14 Dec 2009 17:11:27 -0800 (PST) Date: Mon, 14 Dec 2009 20:11:27 -0500 Message-ID: <5f67a8c40912141711x6475032bg539c46753f8099da@mail.gmail.com> From: Zaphod Beeblebrox To: freebsd-fs Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: ZFS sharing spares (does not work). X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Dec 2009 01:11:29 -0000 According to http://docs.sun.com/app/docs/doc/819-5461/gcvcw?a=view , different ZFS pools can share hot spare drives. I've tried this with 7.2p4, and it doesn't work. I've tried both adding the spares at create time and adding the spares after creating the pools. This seems like a useful and (relatively) easy thing to fix? 
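[Illustrative aside, not part of the original message: the shared-spare layout described in the Sun document would be set up roughly as below. Pool and device names are placeholders for whatever was actually used; both the create-time form and the add-after form were reportedly tried.]

  # zpool create pool1 raidz da0 da1 da2 spare da6
  # zpool create pool2 raidz da3 da4 da5
  # zpool add pool2 spare da6

On Solaris, "zpool status" would then show da6 under "spares" in both pools.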
From owner-freebsd-fs@FreeBSD.ORG Tue Dec 15 01:54:39 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F0AA1106568D for ; Tue, 15 Dec 2009 01:54:39 +0000 (UTC) (envelope-from 000.fbsd@quip.cz) Received: from elsa.codelab.cz (elsa.codelab.cz [94.124.105.4]) by mx1.freebsd.org (Postfix) with ESMTP id AF1C88FC08 for ; Tue, 15 Dec 2009 01:54:39 +0000 (UTC) Received: from localhost (localhost.codelab.cz [127.0.0.1]) by elsa.codelab.cz (Postfix) with ESMTP id 5F40F19E045; Tue, 15 Dec 2009 02:54:37 +0100 (CET) Received: from [192.168.1.2] (r5bb235.net.upc.cz [86.49.61.235]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by elsa.codelab.cz (Postfix) with ESMTPSA id 2EECE19E044; Tue, 15 Dec 2009 02:54:35 +0100 (CET) Message-ID: <4B26EC5A.5000405@quip.cz> Date: Tue, 15 Dec 2009 02:54:34 +0100 From: Miroslav Lachman <000.fbsd@quip.cz> User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.1.4) Gecko/20091017 SeaMonkey/2.0 MIME-Version: 1.0 To: Zaphod Beeblebrox References: <5f67a8c40912141711x6475032bg539c46753f8099da@mail.gmail.com> In-Reply-To: <5f67a8c40912141711x6475032bg539c46753f8099da@mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs Subject: Re: ZFS sharing spares (does not work). X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Dec 2009 01:54:40 -0000 Zaphod Beeblebrox wrote: > According to http://docs.sun.com/app/docs/doc/819-5461/gcvcw?a=view , > different ZFS pools can share hot spare drives. I've tried this with 7.2p4, > and it doesn't work. I've tried both adding the spares at create time and > adding the spares after creating the pools. This seems like a useful and > (relatively) easy thing to fix? AFAIK ZFS on FreeBSD dosen't support spares at all. Spare can be assigned, but is never used after fail of one member component. 
(it is handled by some daemon on Solaris) Miroslav Lachman From owner-freebsd-fs@FreeBSD.ORG Tue Dec 15 02:00:22 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 01A11106566C for ; Tue, 15 Dec 2009 02:00:22 +0000 (UTC) (envelope-from zbeeble@gmail.com) Received: from mail-ew0-f226.google.com (mail-ew0-f226.google.com [209.85.219.226]) by mx1.freebsd.org (Postfix) with ESMTP id E8CA38FC1B for ; Tue, 15 Dec 2009 02:00:16 +0000 (UTC) Received: by ewy26 with SMTP id 26so4370071ewy.3 for ; Mon, 14 Dec 2009 18:00:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:cc:content-type; bh=NODUnmuu/7MOs9dHhIj2HJ6CFIIUutVA9tqelMuExaU=; b=r3SVuZWmJvIs74wmvxAXNLZT3w9O0tNTlf4Cq6MfYMoKkHgaD/FSW7kmIrYkg5HjSo WUc6NXfksWszgiHnKhxBIyebGa1qT71UcWqHmAU3FvKV/ZVXQEdeCY0qESaCIxR5XWEy 3h+Kpwg1giZXzckof7aC1OEXhEw7KvYD+EATc= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=vMbsyLlpuap1fg0WML0hWpqlhJqNk97dyJq8cqaNJeQCkkqc7mg3wwP4y5kiBRxOO8 FypH+aSoTQreFc7PN8fDqrKbnDzJWRo1w3+1XVI6W6MYEIcEnC2wEiaUwCXJiONB0VHQ n+rveC9ZuviDZv5RmLV37IpZdwOk5QAGy2zUw= MIME-Version: 1.0 Received: by 10.216.86.14 with SMTP id v14mr2330593wee.183.1260842415046; Mon, 14 Dec 2009 18:00:15 -0800 (PST) In-Reply-To: <4B26EC5A.5000405@quip.cz> References: <5f67a8c40912141711x6475032bg539c46753f8099da@mail.gmail.com> <4B26EC5A.5000405@quip.cz> Date: Mon, 14 Dec 2009 21:00:15 -0500 Message-ID: <5f67a8c40912141800t71bf9637h89f2469846343cd4@mail.gmail.com> From: Zaphod Beeblebrox To: Miroslav Lachman <000.fbsd@quip.cz> Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs Subject: Re: ZFS sharing spares (does not work). X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Dec 2009 02:00:22 -0000 On Mon, Dec 14, 2009 at 8:54 PM, Miroslav Lachman <000.fbsd@quip.cz> wrote: > Zaphod Beeblebrox wrote: > >> According to http://docs.sun.com/app/docs/doc/819-5461/gcvcw?a=view , >> different ZFS pools can share hot spare drives. I've tried this with >> 7.2p4, >> and it doesn't work. I've tried both adding the spares at create time and >> adding the spares after creating the pools. This seems like a useful and >> (relatively) easy thing to fix? >> > > AFAIK ZFS on FreeBSD dosen't support spares at all. Spare can be assigned, > but is never used after fail of one member component. > (it is handled by some daemon on Solaris) Seems like the documentation, at least, should reflect that. 
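[Illustrative aside, not part of the original message: since nothing in FreeBSD activates an assigned spare automatically at this point, putting a spare into service after a member disk fails is a manual step, roughly:]

  # zpool replace pool1 da1 da6
  # zpool status pool1

where da1 is the failed member and da6 the assigned spare (placeholder names); the spare then resilvers in place of the failed disk.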
> From owner-freebsd-fs@FreeBSD.ORG Tue Dec 15 05:02:48 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0ADC0106566B; Tue, 15 Dec 2009 05:02:48 +0000 (UTC) (envelope-from benschumacher@gmail.com) Received: from mail-pw0-f44.google.com (mail-pw0-f44.google.com [209.85.160.44]) by mx1.freebsd.org (Postfix) with ESMTP id D1F558FC14; Tue, 15 Dec 2009 05:02:47 +0000 (UTC) Received: by pwi15 with SMTP id 15so2669340pwi.3 for ; Mon, 14 Dec 2009 21:02:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:date :x-google-sender-auth:message-id:subject:from:to:content-type; bh=z/FCqm8KJRoPa2z1E4/xiAaQZKPCdLdbxKg4Nahlmt8=; b=hK2aDPr7QfBTt6yMwJBbn9J5sp+sHK2EVcKhqHoLyZySGGoRP3j0OVsWDvakqxPbqz OMeUgcN4rd2Cb4ItmQMAReekw39PropSbaIaKJOpkIzlFUOtVf9CCYfc6XatMC3ev/Nd fWe2+Z8RFXVPUGtMyZg4jfOjsPdyHYjZe9vXY= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:date:x-google-sender-auth:message-id:subject :from:to:content-type; b=a7qDuxUmWoWPNNR8v5GPptySJh4vLuek7WjlfZEgaazrn5CK4ozUtVsUXsnD/VhaF1 0r/lWLYL9Rtux9zg8wqrXRwW92p0uJ+q5Lq822GRWdDybkovD/im3qp6ufeln0w3KhUo BlgC+Ad5J4F9Zdeix51DRB6/d13ApOK80qlP4= MIME-Version: 1.0 Sender: benschumacher@gmail.com Received: by 10.143.27.31 with SMTP id e31mr3788221wfj.173.1260851815244; Mon, 14 Dec 2009 20:36:55 -0800 (PST) Date: Mon, 14 Dec 2009 21:36:55 -0700 X-Google-Sender-Auth: d91771dbaa78f8f6 Message-ID: <9859143f0912142036k3dd0758fmc9cee9b6f2ce4698@mail.gmail.com> From: Ben Schumacher To: freebsd-fs@freebsd.org, freebsd-questions@freebsd.org Content-Type: text/plain; charset=UTF-8 Cc: Subject: SUIDDIR on ZFS? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Dec 2009 05:02:48 -0000 Hello- I'm currently using FreeBSD w/gmirror as a Samba file server for an office with 5-6 users. For ease of administration (and because they need access to each other's data frequently), I have SUIDDIR enabled on the server and have the main shared directory set to "chmod 4770". I have utilities that I run on the server for backup (tarsnap) and virus scanning (clamd/clamdfront) which has so far been working really well. Originally I was planning to setup the data storage on this box to do a mirror of stripes with 4x 500GB drives, but due to some flux (mainly trying to figure out why UFS snapshotting was so slow -- I caught up!) I ended up with a 4 drive mirror and 500GB of space. At any rate, I've been considering switching this to a ZFS RAIDZ now that FreeBSD 8 is released and it seems that folks think it's stable, but I'm curious if it can provide the SUIDDIR functionality I'm currently using. If not, I suppose I could come up with a different solution (suggestions welcome) -- maybe a different Samba configuration, but it took me a while to get where I am with this box now and if I can keep it mostly the same and get the snapshotting abilities of ZFS I'd be really pleased. Any help is greatly appreciated! 
Thanks, Ben From owner-freebsd-fs@FreeBSD.ORG Tue Dec 15 07:51:13 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 065FA10656A5 for ; Tue, 15 Dec 2009 07:51:13 +0000 (UTC) (envelope-from ml@infosec.pl) Received: from v027580.home.net.pl (v027580.home.net.pl [89.161.156.148]) by mx1.freebsd.org (Postfix) with SMTP id 4717D8FC28 for ; Tue, 15 Dec 2009 07:51:11 +0000 (UTC) Received: from 94-193-57-116.zone7.bethere.co.uk (94.193.57.116) (HELO [192.168.1.66]) by freeside.home.pl (89.161.156.148) with SMTP (IdeaSmtpServer v0.70) id 3456077fb457c81b; Tue, 15 Dec 2009 08:51:12 +0100 Message-ID: <4B273FEE.6050206@infosec.pl> Date: Tue, 15 Dec 2009 07:51:10 +0000 From: Michal User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.1.5) Gecko/20091214 Thunderbird/3.0 MIME-Version: 1.0 To: freebsd-fs@freebsd.org References: <200911102227.nAAMRXTf073603@svn.freebsd.org> <20091110224524.GC3194@garage.freebsd.pl> <4B2139FE.8020200@barryp.org> In-Reply-To: <4B2139FE.8020200@barryp.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: HEADS UP: Important bug fix in ZFS replay code! X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Dec 2009 07:51:13 -0000 On 10/12/2009 18:12, Barry Pederson wrote: > > I just noticed this fix didn't make it into 8.0, I just had an > 8.0-RELEASE-p1 machine crash and come back with a bunch of 07777 files. > > Maybe this should be documented as an errata or security advisory. > I just got bitten by the same bug and only thanks to your reminder I specifically checked for these files. Michal -- "The problem with common sense is that it is not all that common." -unknown From owner-freebsd-fs@FreeBSD.ORG Tue Dec 15 12:39:28 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CE2641065672 for ; Tue, 15 Dec 2009 12:39:28 +0000 (UTC) (envelope-from freebsd-fs@m.gmane.org) Received: from lo.gmane.org (lo.gmane.org [80.91.229.12]) by mx1.freebsd.org (Postfix) with ESMTP id 2A4E18FC0A for ; Tue, 15 Dec 2009 12:39:27 +0000 (UTC) Received: from list by lo.gmane.org with local (Exim 4.50) id 1NKWgX-0002yb-6y for freebsd-fs@freebsd.org; Tue, 15 Dec 2009 13:39:25 +0100 Received: from lara.cc.fer.hr ([161.53.72.113]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Tue, 15 Dec 2009 13:39:25 +0100 Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Tue, 15 Dec 2009 13:39:25 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-fs@freebsd.org From: Ivan Voras Date: Tue, 15 Dec 2009 13:39:04 +0100 Lines: 95 Message-ID: Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@ger.gmane.org X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr User-Agent: Thunderbird 2.0.0.23 (X11/20091210) Sender: news Cc: freebsd-hackers@freebsd.org Subject: ZFS, compression, system load, pauses (livelocks?) 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Dec 2009 12:39:29 -0000

The context of this post is file servers running FreeBSD 8 and ZFS with compressed file systems on low-end hardware, or actually high-end hardware on VMWare ESX 3.5 and 4, which kind of makes it low-end as far as storage is concerned. The servers are standby backup mirrors of production servers - thus many writes, few reads.

Running this setup I notice two things:

1) load averages get very high, though the only usage these systems get is file system usage:

last pid: 2270; load averages: 19.02, 14.58, 9.07 up 0+09:47:03 11:29:04

2) long pauses, in what looks like vfs.zfs.txg.timeout second intervals, which seemingly block everything, or at least the entire userland. These pauses are sometimes so long that file transfers fail, which must be avoided.

I think these two are connected.

Monitoring the system with "top" and "iostat" reveals that the state between the pauses is mostly idle (data is being sent to the server over a gbit network at rates of 15+ MB/s). During the pauses there is heavy IO activity which is reflected both in top - kernel threads spa_zio_* (ZFS taskqueues) are hogging the CPU - and immediately after the pause iostat reveals several tens of MB written to the drives. Except for the pause, this is expected - ZFS is compressing data before writing it down.

The pauses are interesting. Immediately after such a pause the system status is similar to this one:

91 processes: 12 running, 63 sleeping, 16 waiting
CPU: 1.4% user, 0.0% nice, 96.3% system, 0.3% interrupt, 2.0% idle
Mem: 75M Active, 122M Inact, 419M Wired, 85M Buf, 125M Free

(this is the first "top" output after a pause). Looking at the list of processes it looks like a large number of kernel and userland processes are woken up at once. From the kernel side there are regularly all g_* threads, but also unrelated threads like bufdaemon, softdepflush, etc., and from the userland - top, syslog, cron, etc. It is like ZFS livelocks everything else.

The effects of this can be lessened by reducing vfs.zfs.txg.timeout, vfs.zfs.vdev.max_pending and using the attached patch which creates NCPU ZFS worker threads instead of hardcoding them to "8". The patch will probably also help the high-end hardware end of the spectrum, where 16-core users will finally be able to dedicate them all to ZFS :) With these measures I have reduced pauses to a second or two every 10 seconds instead of up to tens of seconds every 30 seconds, which is good enough so transfers don't time out, but could probably be better.

Any ideas on the "pauses" issue?

The taskq-thread patch is below. If nobody objects (pjd? I don't know how much harder it will make it for you to import future ZFS versions?) I will commit it soon.
--- /sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c 2009-03-29 01:31:42.000000000 +0100 +++ spa.c 2009-12-15 13:36:05.000000000 +0100 @@ -58,15 +58,16 @@ #include #include #include +#include #include "zfs_prop.h" #include "zfs_comutil.h" -int zio_taskq_threads[ZIO_TYPES][ZIO_TASKQ_TYPES] = { +static int zio_taskq_threads[ZIO_TYPES][ZIO_TASKQ_TYPES] = { /* ISSUE INTR */ { 1, 1 }, /* ZIO_TYPE_NULL */ - { 1, 8 }, /* ZIO_TYPE_READ */ - { 8, 1 }, /* ZIO_TYPE_WRITE */ + { 1, -1 }, /* ZIO_TYPE_READ */ + { -1, 1 }, /* ZIO_TYPE_WRITE */ { 1, 1 }, /* ZIO_TYPE_FREE */ { 1, 1 }, /* ZIO_TYPE_CLAIM */ { 1, 1 }, /* ZIO_TYPE_IOCTL */ @@ -498,7 +499,8 @@ for (int t = 0; t < ZIO_TYPES; t++) { for (int q = 0; q < ZIO_TASKQ_TYPES; q++) { spa->spa_zio_taskq[t][q] = taskq_create("spa_zio", - zio_taskq_threads[t][q], maxclsyspri, 50, + zio_taskq_threads[t][q] == -1 ? mp_ncpus : zio_taskq_threads[t][q], + maxclsyspri, 50, INT_MAX, TASKQ_PREPOPULATE); } } From owner-freebsd-fs@FreeBSD.ORG Tue Dec 15 15:34:45 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 49A3B1065695 for ; Tue, 15 Dec 2009 15:34:44 +0000 (UTC) (envelope-from solon@pyro.de) Received: from srv23.fsb.echelon.bnd.org (mail.pyro.de [83.137.99.96]) by mx1.freebsd.org (Postfix) with ESMTP id 512D58FC12 for ; Tue, 15 Dec 2009 15:34:43 +0000 (UTC) Received: from port-87-193-183-44.static.qsc.de ([87.193.183.44] helo=MORDOR) by srv23.fsb.echelon.bnd.org with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.69 (FreeBSD)) (envelope-from ) id 1NKZQ7-000ChL-S9 for freebsd-fs@freebsd.org; Tue, 15 Dec 2009 16:34:43 +0100 Date: Tue, 15 Dec 2009 16:34:20 +0100 From: Solon Lutz X-Mailer: The Bat! (v3.99.25) Professional Organization: pyro.labs berlin X-Priority: 3 (Normal) Message-ID: <568624531.20091215163420@pyro.de> To: freebsd-fs@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Score: -1.4 (-) X-Spam-Report: Spam detection software, running on the system "srv23.fsb.echelon.bnd.org", has identified this incoming email as possible spam. The original message has been attached to this so you can view it (if it isn't spam) or label similar future email. If you have any questions, see The administrator of that system for details. Content preview: Hi, are there any cons against building a RaidZ2 with 24 1.5TB drives? In some old postings floating around the net a limit of 9 drives is recommended. Does this still apply to the current ZFS in 8.0? [...] Content analysis details: (-1.4 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -1.4 ALL_TRUSTED Passed through trusted hosts only via SMTP X-Spam-Flag: NO Subject: ZFS RaidZ2 with 24 drives? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Dec 2009 15:34:45 -0000 Hi, are there any cons against building a RaidZ2 with 24 1.5TB drives? In some old postings floating around the net a limit of 9 drives is recommended. Does this still apply to the current ZFS in 8.0? 
Best regards, Solon Lutz +-----------------------------------------------+ | Pyro.Labs Berlin - Creativity for tomorrow | | Wasgenstrasse 75/13 - 14129 Berlin, Germany | | www.pyro.de - phone + 49 - 30 - 48 48 58 58 | | info@pyro.de - fax + 49 - 30 - 80 94 03 52 | +-----------------------------------------------+ From owner-freebsd-fs@FreeBSD.ORG Tue Dec 15 15:38:17 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 07E3E106568D for ; Tue, 15 Dec 2009 15:38:17 +0000 (UTC) (envelope-from rincebrain@gmail.com) Received: from mail-fx0-f227.google.com (mail-fx0-f227.google.com [209.85.220.227]) by mx1.freebsd.org (Postfix) with ESMTP id 94BD58FC13 for ; Tue, 15 Dec 2009 15:38:16 +0000 (UTC) Received: by fxm27 with SMTP id 27so4315353fxm.3 for ; Tue, 15 Dec 2009 07:38:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:cc:content-type; bh=CnYYDsJMmabM7WR5zt63H4oGz0jY01selv9NVbBfzkw=; b=GJi1+kZS/o5ClNYqP3IkOnWGzpJ9AxnmYN0BISyA969Wlt0rtiAN0trkX416QnsPVr +IrZGMR0iVB9l+IXtEJiEqXBMVeg8LYp9EWGaf5LhjVLDVF/w3kK1aJC448cxLDM7kvX 1+AjnxxDPYery6jK2dP34uWmvQKWeDuP3O3yg= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=Tjn4R2BOg9QK91vylCfD2hPoaWBI9EeSSl1wrTwxh1/3nH79Y5/8cdk2Wa7ggTz3kd k6EaI3Un3TYfcgCrnuSUuR3kCUKF4eMJRaigKB6ztwfptY3/UnPKWK/IYcvL5vzMkTum bBR6/lWIrlxdGKQMFKevBktkDxzD9bQx6se1k= MIME-Version: 1.0 Received: by 10.239.145.29 with SMTP id q29mr683149hba.127.1260891495442; Tue, 15 Dec 2009 07:38:15 -0800 (PST) In-Reply-To: <568624531.20091215163420@pyro.de> References: <568624531.20091215163420@pyro.de> Date: Tue, 15 Dec 2009 10:38:15 -0500 Message-ID: <5da0588e0912150738y713e55e5i325cc2c7ab71c63b@mail.gmail.com> From: Rich To: Solon Lutz Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs@freebsd.org Subject: Re: ZFS RaidZ2 with 24 drives? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Dec 2009 15:38:17 -0000 Generally, it's just recommended that you compose things of smaller sets of RAIDs for performance reasons, because you end up blocking on how fast you can serialize data to multiple drives in some ways, and the IO characteristics aren't necessarily what you want. It's not a hard limit, just a suggestion for performance. Experiment and post numbers! 
- Rich From owner-freebsd-fs@FreeBSD.ORG Tue Dec 15 16:26:48 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A3251106568F for ; Tue, 15 Dec 2009 16:26:48 +0000 (UTC) (envelope-from wonslung@gmail.com) Received: from mail-ew0-f226.google.com (mail-ew0-f226.google.com [209.85.219.226]) by mx1.freebsd.org (Postfix) with ESMTP id 313968FC24 for ; Tue, 15 Dec 2009 16:26:47 +0000 (UTC) Received: by ewy26 with SMTP id 26so30514ewy.3 for ; Tue, 15 Dec 2009 08:26:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:cc:content-type; bh=iLhhcqu2MhcFdvI92VI9l21RIoZRQGag/gkDBEnRJBI=; b=ku+sMQYe1MutiX3hxLCnvW6T7RlH8weMS0/NrMrIb4W+47S2I90dO77BsQoqQhLqK9 khFgRzD/QoYpQ+1OOaGCfEXpzAyvPBcvzIkSQLyIP1SAJGGRWgnp4vtNpPvVIudoB3N6 W3tv6cy/NhleOv+igmlPJOODDaSqXDlpH00pw= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=xgt4J/X7NIcZEONxcos5t49oqza13+EcCTW2k9JWJJ1qYSoPM33HwFBofWQ1Vkva1S AjQ8Irr8nLz1zCBCsIPS00CDtn1s/ILxquhnMFQNSwAZw0EMxEYr4MaeOqcJ5PKXkJLR BZV5BEjQxXj2yO3IsBcd5ZNjSk0ZpzmI0HZ/0= MIME-Version: 1.0 Received: by 10.216.89.82 with SMTP id b60mr2585329wef.170.1260892818457; Tue, 15 Dec 2009 08:00:18 -0800 (PST) In-Reply-To: <568624531.20091215163420@pyro.de> References: <568624531.20091215163420@pyro.de> Date: Tue, 15 Dec 2009 11:00:18 -0500 Message-ID: From: Thomas Burgess To: Solon Lutz Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: ZFS RaidZ2 with 24 drives? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Dec 2009 16:26:48 -0000 you should avoid it. if you want more information about why this is a BAD idea, read this post: http://forums.freebsd.org/showpost.php?p=22025&postcount=17 you should keep in mind each raidz group is seen by zfs more or less as a single hard drive...you should stick to groups NO BIGGER than 9 drives personally, i use 4-5 drives for raidz1 and 6-8 drives for raidz2 also, keep in mind, while you do loose a LITTLE space, you gain a ton more MTTF it's very much worth it. I'd go with 3 groups of 8 or 4 groups of 6 24 in one group = 33 TB 3 groups of 8 = 27 TB 4 groups of 6 = 24 TB personally, i'd go with the 3 groups of 8 On Tue, Dec 15, 2009 at 10:34 AM, Solon Lutz wrote: > Hi, > > are there any cons against building a RaidZ2 with 24 1.5TB drives? > In some old postings floating around the net a limit of 9 drives > is recommended. > Does this still apply to the current ZFS in 8.0? 
> > > Best regards, > > Solon Lutz > > > +-----------------------------------------------+ > | Pyro.Labs Berlin - Creativity for tomorrow | > | Wasgenstrasse 75/13 - 14129 Berlin, Germany | > | www.pyro.de - phone + 49 - 30 - 48 48 58 58 | > | info@pyro.de - fax + 49 - 30 - 80 94 03 52 | > +-----------------------------------------------+ > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@FreeBSD.ORG Tue Dec 15 19:57:08 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2170A106568D for ; Tue, 15 Dec 2009 19:57:08 +0000 (UTC) (envelope-from google@vink.pl) Received: from mail-fx0-f163.google.com (mail-fx0-f163.google.com [209.85.220.163]) by mx1.freebsd.org (Postfix) with ESMTP id 3C9508FC0C for ; Tue, 15 Dec 2009 19:57:06 +0000 (UTC) Received: by fxm3 with SMTP id 3so82843fxm.3 for ; Tue, 15 Dec 2009 11:57:06 -0800 (PST) Received: by 10.223.98.19 with SMTP id o19mr370619fan.82.1260907025765; Tue, 15 Dec 2009 11:57:05 -0800 (PST) Received: from mail-fx0-f227.google.com (mail-fx0-f227.google.com [209.85.220.227]) by mx.google.com with ESMTPS id 13sm72027fxm.13.2009.12.15.11.57.03 (version=SSLv3 cipher=RC4-MD5); Tue, 15 Dec 2009 11:57:05 -0800 (PST) Received: by fxm27 with SMTP id 27so264939fxm.3 for ; Tue, 15 Dec 2009 11:57:03 -0800 (PST) MIME-Version: 1.0 Received: by 10.223.5.8 with SMTP id 8mr8272925fat.48.1260907023184; Tue, 15 Dec 2009 11:57:03 -0800 (PST) In-Reply-To: References: Date: Tue, 15 Dec 2009 20:57:03 +0100 Message-ID: <2ae8edf30912151157t53267adek85af80b1e31fb4b@mail.gmail.com> From: Wiktor Niesiobedzki To: Ivan Voras Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs , freebsd-hackers Subject: Re: ZFS, compression, system load, pauses (livelocks?) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Dec 2009 19:57:08 -0000 2009/12/15 Ivan Voras : > The context of this post is file servers running FreeBSD 8 and ZFS with > compressed file systems on low-end hardware, or actually high-end hardwar= e > on VMWare ESX 3.5 and 4, which kind of makes it low-end as far as storage= is > concerned. The servers are standby backup mirrors of production servers - > thus many writes, few reads. > > Running this setup I notice two things: > > 1) load averages get very high, though the only usage these systems get i= s > file system usage: > 2) long pauses, in what looks like vfs.zfs.txg.timeout second intervals, > which seemengly block everything, or at least the entire userland. These > pauses are sometimes so long that file transfers fail, which must be > avoided. > > Looking at the list of processes it looks like a large number of kernel a= nd > userland processes are woken up at once. From the kernel side there are > regularily all g_* threads, but also unrelated threads like bufdaemon, > softdepflush, etc. and from the userland - top, syslog, cron, etc. It is > like ZFS livelocks everything else. > > Any ideas on the "pauses" issue? > Hi, I've a bit striped your post. 
It's kind of "me too" message (more details here: http://lists.freebsd.org/pipermail/freebsd-geom/2009-December= /003810.html). What I've figured out so far is, that lowering the kernel thread priority (as pjd@ suggested) gives quite promising results (no livelocks at all). Though my bottleneck were caused by GELI thread. The pattern there is like this: sched_prio(curthread, PRIBIO); [...] msleep(sc, &sc->sc_queue_mtx, PDROP | PRIBIO, "geli:w", 0); I'm running right now with changed wersion - where I have: msleep(sc, &sc->sc_queue_mtx, PDROP, "geli:w", 0); So I don't change initial thread priority. It doesn't give such result as using PUSER prio, but I fear, that using PUSER may cause livelocks in some other cases. This helps my case (geli encryption and periodic locks during ZFS transaction commits) with some performance penalty, but I have similar problems in other cases. When I run: # zfs scrub tank Then "kernel" system process/thread consumes most of CPU (>95% in system) and load rises to 20+ for the period of scrubbing. During scrub my top screen looks like: last pid: 87570; load averages: 8.26, 2.84, 1.68 199 processes: 3 running, 179 sleeping, 17 waiting CPU: 2.4% user, 0.0% nice, 97.0% system, 0.6% interrupt, 0.0% idle Mem: 66M Active, 6256K Inact, 1027M Wired, 104K Cache, 240K Buf, 839M Free Swap: 4096M Total, 4096M Free PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND 0 root 69 -8 0 0K 544K - 104:40 67.19% kernel 24 root 1 -8 - 0K 8K geli:w 9:56 5.66% g_eli[0] ad= 6 26 root 1 -8 - 0K 8K geli:w 9:50 5.47% g_eli[0] ad= 10 25 root 1 -8 - 0K 8K geli:w 9:53 5.37% g_eli[0] ad= 8 8 root 12 -8 - 0K 104K vgeom: 61:35 3.27% zfskern 3 root 1 -8 - 0K 8K - 3:22 0.68% g_up 11 root 17 -60 - 0K 136K WAIT 31:21 0.29% intr Intresting thing, is that I have 17 processes waiting for CPU reported (though only intr is the only process that is reported as in WAIT state - at least for top40 processes). I just wonder, whether this might be a scheduler related issue. I'm thinking about giving a SCHED_4BSD a try. 
Cheers, Wiktor Niesiob=C4=99dzki From owner-freebsd-fs@FreeBSD.ORG Tue Dec 15 21:43:27 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D6298106568F; Tue, 15 Dec 2009 21:43:27 +0000 (UTC) (envelope-from ivoras@gmail.com) Received: from mail-ew0-f226.google.com (mail-ew0-f226.google.com [209.85.219.226]) by mx1.freebsd.org (Postfix) with ESMTP id 41CBB8FC1B; Tue, 15 Dec 2009 21:43:26 +0000 (UTC) Received: by ewy26 with SMTP id 26so361499ewy.3 for ; Tue, 15 Dec 2009 13:43:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:from:date:x-google-sender-auth:message-id:subject:to:cc :content-type:content-transfer-encoding; bh=6Q1uzqzbAbd0UaV+xSljXe5UjfvVqgDaDcMCXGgPnug=; b=uejzTYatuFCfOPDMN35521z9RUMvyUq3pTRZYHpX1H1m+Hu2qP7b1u7v7u6GU37eBH /sgOaUunvxMdvMgTKclNCA1JWVcr4TCHDJVKfJn3X/DHH8A22X1TRxdNpfEyqb8dOGlB 4PAmGVfhJL12JDY6xPEHV7MW84Qe/tjXGbnlM= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:from:date :x-google-sender-auth:message-id:subject:to:cc:content-type :content-transfer-encoding; b=pPmwthsDxf2fpAUuOyaQ2sj+NfacUG2pD5dnn6SWiS7Ou/5o12ZSBimKrqnVY1781s X5djZJvl84f0CwNqCc58DwnR1mv+FHQV0v5g7Ie555FdgEuw+stHafMRZBLy5RgEeXTE STTKkIWJ2SwRtSZ4BFsAGi6Zk4k2EAOtjfMW8= MIME-Version: 1.0 Sender: ivoras@gmail.com Received: by 10.216.90.11 with SMTP id d11mr37318wef.187.1260913406139; Tue, 15 Dec 2009 13:43:26 -0800 (PST) In-Reply-To: <2ae8edf30912151157t53267adek85af80b1e31fb4b@mail.gmail.com> References: <2ae8edf30912151157t53267adek85af80b1e31fb4b@mail.gmail.com> From: Ivan Voras Date: Tue, 15 Dec 2009 22:43:06 +0100 X-Google-Sender-Auth: 138eac8a0e663cc7 Message-ID: <9bbcef730912151343p729b6d29w97e56bab96f264f9@mail.gmail.com> To: Wiktor Niesiobedzki Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs , freebsd-hackers Subject: Re: ZFS, compression, system load, pauses (livelocks?) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Dec 2009 21:43:27 -0000 2009/12/15 Wiktor Niesiobedzki : > 2009/12/15 Ivan Voras : >> The context of this post is file servers running FreeBSD 8 and ZFS with >> compressed file systems on low-end hardware, or actually high-end hardwa= re >> on VMWare ESX 3.5 and 4, which kind of makes it low-end as far as storag= e is >> concerned. The servers are standby backup mirrors of production servers = - >> thus many writes, few reads. >> >> Running this setup I notice two things: >> >> 1) load averages get very high, though the only usage these systems get = is >> file system usage: >> 2) long pauses, in what looks like vfs.zfs.txg.timeout second intervals, >> which seemengly block everything, or at least the entire userland. These >> pauses are sometimes so long that file transfers fail, which must be >> avoided. >> >> Looking at the list of processes it looks like a large number of kernel = and >> userland processes are woken up at once. From the kernel side there are >> regularily all g_* threads, but also unrelated threads like bufdaemon, >> softdepflush, etc. and from the userland - top, syslog, cron, etc. It is >> like ZFS livelocks everything else. >> >> Any ideas on the "pauses" issue? 
>> > > Hi, > > I've a bit striped your post. It's kind of "me too" message (more > details here: http://lists.freebsd.org/pipermail/freebsd-geom/2009-Decemb= er/003810.html). > What I've figured out so far is, that lowering the kernel thread > priority (as pjd@ suggested) gives quite promising results (no > livelocks at all). Though my bottleneck were caused by GELI thread. > > The pattern there is like this: > > sched_prio(curthread, PRIBIO); > [...] > msleep(sc, &sc->sc_queue_mtx, PDROP | PRIBIO, =C2=A0"geli:w", 0); I have tried before reducing priority of ZFS taskqueues but only to PRIBIO, not below it - not much effect wrt "pauses". From owner-freebsd-fs@FreeBSD.ORG Tue Dec 15 22:17:29 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BFC15106566C for ; Tue, 15 Dec 2009 22:17:29 +0000 (UTC) (envelope-from areilly@bigpond.net.au) Received: from nskntmtas03p.mx.bigpond.com (nskntmtas03p.mx.bigpond.com [61.9.168.143]) by mx1.freebsd.org (Postfix) with ESMTP id 495988FC1A for ; Tue, 15 Dec 2009 22:17:28 +0000 (UTC) Received: from nskntotgx02p.mx.bigpond.com ([124.188.161.100]) by nskntmtas03p.mx.bigpond.com with ESMTP id <20091215221727.XLDM1310.nskntmtas03p.mx.bigpond.com@nskntotgx02p.mx.bigpond.com>; Tue, 15 Dec 2009 22:17:27 +0000 Received: from duncan.reilly.home ([124.188.161.100]) by nskntotgx02p.mx.bigpond.com with ESMTP id <20091215221727.YNUF5306.nskntotgx02p.mx.bigpond.com@duncan.reilly.home>; Tue, 15 Dec 2009 22:17:27 +0000 Date: Wed, 16 Dec 2009 09:17:27 +1100 From: Andrew Reilly To: Hywel Mallett Message-ID: <20091215221727.GA8137@duncan.reilly.home> References: <20091208224710.GA97620@duncan.reilly.home> <228D9370-4967-4C47-9746-8475DCD4FA27@hmallett.co.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <228D9370-4967-4C47-9746-8475DCD4FA27@hmallett.co.uk> User-Agent: Mutt/1.4.2.3i X-Authentication-Info: Submitted using SMTP AUTH LOGIN at nskntotgx02p.mx.bigpond.com from [124.188.161.100] using ID areilly@bigpond.net.au at Tue, 15 Dec 2009 22:17:27 +0000 X-RPD-ScanID: Class unknown; VirusThreatLevel unknown, RefID str=0001.0A150204.4B280AF7.00A9,ss=1,fgs=0 X-SIH-MSG-ID: rBg1Edb/TAD0zmQs0WyzOwJxyArnqyN48Z4QX81loRIGTUDCp8DeQ9rHNvZRtdu1xD9JJhiHNGAnaa7jTY3RstCK Cc: freebsd-fs@freebsd.org Subject: Re: On gjournal vs unexpected shutdown (-->fsck) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Dec 2009 22:17:29 -0000 On Tue, Dec 15, 2009 at 09:49:56PM +0000, Hywel Mallett wrote: > > On 8 Dec 2009, at 22:47, Andrew Reilly wrote: > > > Hi there, > > > > I thought that I'd try a gjournal'd UFS on one of my spare > > drives (so: dedicated to the task, formatted from clean, per the > > instructions in the gjournal man page.) The filesystem itself > > seems to be working swimmingly, although it isn't heavily used. > > In the time that I've had it running, though, I've had two power > > outages that have resulted in unexpected shutdowns, and I was > > surprised to find that the boot process did nothing unexpected: > > file system not marked clean: fsck before you can mount. So > > both times I fsck'd the drive, and as near as I can tell this > > took exactly as long as fsck on a regular UFS system of similar > > size. 
Isn't the journalling operation supposed to confer a > > shortcut benefit here? I know that the man page doesn't mention > > recovery by journal play-back, but I thought that it didn't need > > to: that's the whole point. Is there a step that I'm missing? > > Perhaps a gjournal-aware version of fsck that I should run > > instead of regular fsck, that will quickly mark the file system > > clean? > > > > (Running -current as of last weekend, if that matters.) > > > I assume you've run tunefs -J enable on the filesystems on > the journalled provider? Or used newfs with -J if it's a new > filesystem? > > If I remember correctly it's this flag that fsck checks to see > whether fsck is needed or not. > > You can check whether the flag is set or not by running dumpfs > on the filesystem. Under "flags" it'll say gjournal if the > flag is set. I've just taken the file system off-line and run tunefs -J enable on it, and tunefs said: tunefs: gjournal remains unchanged as enabled so I seem to have set it up properly in the first place. In the "further reading" list on the gjournal article, there is mention of mounting with async,gjournal options, but I see no reference to gjournal in the man pages, so my guess is that this is what has been superceded by the -J tunefs/newfs flag? Cheers, -- Andrew From owner-freebsd-fs@FreeBSD.ORG Tue Dec 15 22:39:19 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A448E1065692; Tue, 15 Dec 2009 22:39:19 +0000 (UTC) (envelope-from google@vink.pl) Received: from mail-fx0-f227.google.com (mail-fx0-f227.google.com [209.85.220.227]) by mx1.freebsd.org (Postfix) with ESMTP id E72208FC0A; Tue, 15 Dec 2009 22:39:18 +0000 (UTC) Received: by fxm27 with SMTP id 27so415732fxm.3 for ; Tue, 15 Dec 2009 14:39:17 -0800 (PST) Received: by 10.223.5.77 with SMTP id 13mr171987fau.86.1260916757560; Tue, 15 Dec 2009 14:39:17 -0800 (PST) Received: from mail-fx0-f227.google.com (mail-fx0-f227.google.com [209.85.220.227]) by mx.google.com with ESMTPS id 13sm116223fxm.9.2009.12.15.14.39.17 (version=SSLv3 cipher=RC4-MD5); Tue, 15 Dec 2009 14:39:17 -0800 (PST) Received: by fxm27 with SMTP id 27so415718fxm.3 for ; Tue, 15 Dec 2009 14:39:16 -0800 (PST) MIME-Version: 1.0 Received: by 10.223.143.15 with SMTP id s15mr177897fau.77.1260916756585; Tue, 15 Dec 2009 14:39:16 -0800 (PST) In-Reply-To: <9bbcef730912151343p729b6d29w97e56bab96f264f9@mail.gmail.com> References: <2ae8edf30912151157t53267adek85af80b1e31fb4b@mail.gmail.com> <9bbcef730912151343p729b6d29w97e56bab96f264f9@mail.gmail.com> Date: Tue, 15 Dec 2009 23:39:16 +0100 Message-ID: <2ae8edf30912151439h799277a0ofee13f5aecc55f00@mail.gmail.com> From: Wiktor Niesiobedzki To: Ivan Voras Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs , freebsd-hackers Subject: Re: ZFS, compression, system load, pauses (livelocks?) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Dec 2009 22:39:19 -0000 2009/12/15 Ivan Voras : > I have tried before reducing priority of ZFS taskqueues but only to > PRIBIO, not below it - not much effect wrt "pauses". I was testing with getting the thread as low priority as PUSER (with original pjd@ patch) and it was actually performing better than the current solution, though I found that quite risky. 
Cheers, Wiktor Niesiob=C4=99dzki From owner-freebsd-fs@FreeBSD.ORG Tue Dec 15 23:24:07 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id EE0571065670 for ; Tue, 15 Dec 2009 23:24:07 +0000 (UTC) (envelope-from matt@corp.spry.com) Received: from mail-px0-f182.google.com (mail-px0-f182.google.com [209.85.216.182]) by mx1.freebsd.org (Postfix) with ESMTP id CC85E8FC0A for ; Tue, 15 Dec 2009 23:24:07 +0000 (UTC) Received: by pxi12 with SMTP id 12so275634pxi.3 for ; Tue, 15 Dec 2009 15:24:07 -0800 (PST) Received: by 10.142.5.30 with SMTP id 30mr139889wfe.115.1260919446045; Tue, 15 Dec 2009 15:24:06 -0800 (PST) Received: from mattintosh.spry.com (isaid.donotdelete.com [64.79.222.10]) by mx.google.com with ESMTPS id 20sm210298pzk.5.2009.12.15.15.24.04 (version=TLSv1/SSLv3 cipher=RC4-MD5); Tue, 15 Dec 2009 15:24:05 -0800 (PST) From: Matt Simerson To: Solon Lutz In-Reply-To: <568624531.20091215163420@pyro.de> X-Priority: 3 (Normal) References: <568624531.20091215163420@pyro.de> Message-Id: <42952D86-6B4D-49A3-8E4F-7A1A53A954C2@spry.com> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v936) Date: Tue, 15 Dec 2009 15:24:03 -0800 X-Mailer: Apple Mail (2.936) Cc: freebsd-fs@freebsd.org Subject: Re: ZFS RaidZ2 with 24 drives? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Dec 2009 23:24:08 -0000 On Dec 15, 2009, at 7:34 AM, Solon Lutz wrote: > Hi, > > are there any cons against building a RaidZ2 with 24 1.5TB drives? Funny, I asked myself a similar question in July 2008. Except I had 1.0TB drives. $ ssh back01 zpool status pool: back01 state: ONLINE scrub: none requested config: NAME STATE READ WRITE CKSUM back01 ONLINE 0 0 0 da0 ONLINE 0 0 0 da1 ONLINE 0 0 0 errors: No known data errors $ ssh back02 zpool status pool: back02 state: ONLINE status: One or more devices has experienced an error resulting in data corruption. Applications may be affected. action: Restore the file in question if possible. Otherwise restore the entire pool from backup. see: http://www.sun.com/msg/ZFS-8000-8A scrub: none requested config: NAME STATE READ WRITE CKSUM back02 ONLINE 0 0 934K raidz1 ONLINE 0 0 0 da0 ONLINE 0 0 0 da1 ONLINE 0 0 0 da2 ONLINE 0 0 0 da3 ONLINE 0 0 0 da4 ONLINE 0 0 0 da5 ONLINE 0 0 0 da6 ONLINE 0 0 0 raidz1 ONLINE 0 0 1.83M da8 ONLINE 0 0 0 da9 ONLINE 0 0 0 da10 ONLINE 0 0 0 da11 ONLINE 0 0 0 da12 ONLINE 0 0 0 da13 ONLINE 0 0 0 da14 ONLINE 0 0 0 raidz1 ONLINE 0 0 1.83M da16 ONLINE 0 0 0 spare ONLINE 0 0 0 da17 ONLINE 0 0 0 da7 ONLINE 0 0 0 da18 ONLINE 0 0 0 da19 ONLINE 0 0 0 da20 ONLINE 0 0 0 da21 ONLINE 0 0 0 da22 ONLINE 0 0 0 spares da15 AVAIL da23 AVAIL da7 INUSE currently in use errors: 241 data errors, use '-v' for a list I tried several combinations and ran benchmarks against ZFS in various RAID-Z configs and finally determined that how I laid out the disks didn't affect performance much. That was well before ZFS v13 was committed, so there were many bug fixes and performance optimizations since then. I deployed using the two configurations you see above. Both machines have a pair of Areca 1231ML RAID controllers with super-sized BBWC (battery backed write cache). 
On back01, each controller presents a 12- disk RAID-5 array and ZFS concatenates them into the zpool you see above. On back02, the RAID controller is configured in JBOD mode and disks are pooled as shown. In 17 months of production, the ZFS pool on back02 has required maintenance several times, including being down for days while a scrub was being run. Yes, days. Several times. We've had a couple data center power outages, and the only ZFS equipped backup servers that had any issue was back02. The last scrub failed to fixed the data errors. IIRC, the RAID controller write cache is not active in JBOD mode. That could explain why back02 had problems and the rest of my ZFS servers did not. When another disk in back02 fails, I'll move all the data off back02 and rebuild the disk arrays using hardware RAID. In addition to having zero disk errors, zero hardware problems, and zero ZFS data errors, the ZFS backup servers deployed on top of hardware RAID are much faster. How much faster? In the past 3 days, I have had a cleanup process running that prunes stale backups. On back01, the process has cleaned up 4TB of disk space. On back02, it has only cleaned up 1.2TB. These cleanup processes run while the machines are performing their 'normal' duties. On average, the back02 processes take about 3-4x longer to run on back02. It's not for lack of resources either. These are dual quad- cores with 16GB of RAM each. YMMV. Matt > In some old postings floating around the net a limit of 9 drives > is recommended. > Does this still apply to the current ZFS in 8.0? > > > Best regards, > > Solon Lutz > > > +-----------------------------------------------+ > | Pyro.Labs Berlin - Creativity for tomorrow | > | Wasgenstrasse 75/13 - 14129 Berlin, Germany | > | www.pyro.de - phone + 49 - 30 - 48 48 58 58 | > | info@pyro.de - fax + 49 - 30 - 80 94 03 52 | > +-----------------------------------------------+ > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" From owner-freebsd-fs@FreeBSD.ORG Tue Dec 15 23:53:17 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4C4CE106566B for ; Tue, 15 Dec 2009 23:53:17 +0000 (UTC) (envelope-from solon@pyro.de) Received: from srv23.fsb.echelon.bnd.org (mail.pyro.de [83.137.99.96]) by mx1.freebsd.org (Postfix) with ESMTP id F26B38FC14 for ; Tue, 15 Dec 2009 23:53:16 +0000 (UTC) Received: from port-87-193-183-44.static.qsc.de ([87.193.183.44] helo=MORDOR) by srv23.fsb.echelon.bnd.org with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.69 (FreeBSD)) (envelope-from ) id 1NKhCa-000DTc-D0; Wed, 16 Dec 2009 00:53:15 +0100 Date: Wed, 16 Dec 2009 00:52:53 +0100 From: Solon Lutz X-Mailer: The Bat! (v3.99.25) Professional Organization: pyro.labs berlin X-Priority: 3 (Normal) Message-ID: <957649379.20091216005253@pyro.de> To: Matt Simerson In-Reply-To: <42952D86-6B4D-49A3-8E4F-7A1A53A954C2@spry.com> References: <568624531.20091215163420@pyro.de> <42952D86-6B4D-49A3-8E4F-7A1A53A954C2@spry.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Score: -1.4 (-) X-Spam-Report: Spam detection software, running on the system "srv23.fsb.echelon.bnd.org", has identified this incoming email as possible spam. 
The original message has been attached to this so you can view it (if it isn't spam) or label similar future email. If you have any questions, see The administrator of that system for details. Content preview: > I deployed using the two configurations you see above. Both machines > have a pair of Areca 1231ML RAID controllers with super-sized BBWC > (battery backed write cache). On back01, each controller presents a 12- > disk RAID-5 array and ZFS concatenates them into the zpool you see > above. On back02, the RAID controller is configured in JBOD mode and > disks are pooled as shown. [...] Content analysis details: (-1.4 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -1.4 ALL_TRUSTED Passed through trusted hosts only via SMTP X-Spam-Flag: NO Cc: freebsd-fs@freebsd.org Subject: Re: ZFS RaidZ2 with 24 drives? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 15 Dec 2009 23:53:17 -0000 > I deployed using the two configurations you see above. Both machines > have a pair of Areca 1231ML RAID controllers with super-sized BBWC > (battery backed write cache). On back01, each controller presents a 12- > disk RAID-5 array and ZFS concatenates them into the zpool you see > above. On back02, the RAID controller is configured in JBOD mode and > disks are pooled as shown. Why concatenate them into one pool and give up the redundancy? I have the same setup: Areca 24-port RAID6 (24x 500gb) NAME STATE READ WRITE CKSUM temp ONLINE 0 0 24 da0 ONLINE 0 0 48 And it very nearly killed itself after 28 months of flawless duty... All went fine until 4 drives disconnected themselves from the Areca due to faulty SATA-cables. This crashed the Areca in such a way, that I had to disconnect the battery module from the controller in order to get it initialized during boot-up. Cache gone - ZFS unable to mount 10TB pool - scrub failed - I/O errors This was three months ago and if I hadn't found an extremly skilled person who was able to manually find and distinguish between good and corrupted meta-data sets, replicate them in their proper spots and zero out corrupt transaction ids - I would have lost 10TB of data. (No backups - to expensive) Why do you use JBOD? You can configure a passthrough for all drives, explicitly degrading the Areca to a dumb sata controller... 
Best regards, Solon Lutz +-----------------------------------------------+ | Pyro.Labs Berlin - Creativity for tomorrow | | Wasgenstrasse 75/13 - 14129 Berlin, Germany | | www.pyro.de - phone + 49 - 30 - 48 48 58 58 | | info@pyro.de - fax + 49 - 30 - 80 94 03 52 | +-----------------------------------------------+ From owner-freebsd-fs@FreeBSD.ORG Wed Dec 16 00:02:03 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9AACB106566C for ; Wed, 16 Dec 2009 00:02:03 +0000 (UTC) (envelope-from hywel@hmallett.co.uk) Received: from lisbon.directrouter.com (lisbon.directrouter.com [72.249.30.130]) by mx1.freebsd.org (Postfix) with ESMTP id 7697A8FC0C for ; Wed, 16 Dec 2009 00:02:03 +0000 (UTC) Received: from hmallett.plus.com ([81.174.158.104] helo=[192.168.0.10]) by lisbon.directrouter.com with esmtpa (Exim 4.69) (envelope-from ) id 1NKfHL-0007fA-7s; Tue, 15 Dec 2009 15:49:59 -0600 Mime-Version: 1.0 (Apple Message framework v1077) Content-Type: text/plain; charset=us-ascii From: Hywel Mallett In-Reply-To: <20091208224710.GA97620@duncan.reilly.home> Date: Tue, 15 Dec 2009 21:49:56 +0000 Content-Transfer-Encoding: quoted-printable Message-Id: <228D9370-4967-4C47-9746-8475DCD4FA27@hmallett.co.uk> References: <20091208224710.GA97620@duncan.reilly.home> To: Andrew Reilly X-Mailer: Apple Mail (2.1077) X-AntiAbuse: This header was added to track abuse, please include it with any abuse report X-AntiAbuse: Primary Hostname - lisbon.directrouter.com X-AntiAbuse: Original Domain - freebsd.org X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12] X-AntiAbuse: Sender Address Domain - hmallett.co.uk X-Source: X-Source-Args: X-Source-Dir: Cc: freebsd-fs@freebsd.org Subject: Re: On gjournal vs unexpected shutdown (-->fsck) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Dec 2009 00:02:03 -0000 On 8 Dec 2009, at 22:47, Andrew Reilly wrote: > Hi there, >=20 > I thought that I'd try a gjournal'd UFS on one of my spare > drives (so: dedicated to the task, formatted from clean, per the > instructions in the gjournal man page.) The filesystem itself > seems to be working swimmingly, although it isn't heavily used. > In the time that I've had it running, though, I've had two power > outages that have resulted in unexpected shutdowns, and I was > surprised to find that the boot process did nothing unexpected: > file system not marked clean: fsck before you can mount. So > both times I fsck'd the drive, and as near as I can tell this > took exactly as long as fsck on a regular UFS system of similar > size. Isn't the journalling operation supposed to confer a > shortcut benefit here? I know that the man page doesn't mention > recovery by journal play-back, but I thought that it didn't need > to: that's the whole point. Is there a step that I'm missing? > Perhaps a gjournal-aware version of fsck that I should run > instead of regular fsck, that will quickly mark the file system > clean? >=20 > (Running -current as of last weekend, if that matters.) >=20 I assume you've run tunefs -J enable on the filesystems on the = journalled provider? Or used newfs with -J if it's a new filesystem? If I remember correctly it's this flag that fsck checks to see whether = fsck is needed or not. 
You can check whether the flag is set or not by running dumpfs on the = filesystem. Under "flags" it'll say gjournal if the flag is set.= From owner-freebsd-fs@FreeBSD.ORG Wed Dec 16 02:52:39 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 682C4106566B for ; Wed, 16 Dec 2009 02:52:39 +0000 (UTC) (envelope-from ambsd@raisa.eu.org) Received: from raisa.eu.org (raisa.eu.org [83.17.178.202]) by mx1.freebsd.org (Postfix) with ESMTP id EB44E8FC0C for ; Wed, 16 Dec 2009 02:52:38 +0000 (UTC) Received: from bolt.zol (62-121-98-25.home.aster.pl [62.121.98.25]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by raisa.eu.org (Postfix) with ESMTP id E0387254; Wed, 16 Dec 2009 03:55:57 +0100 (CET) Content-Type: text/plain; charset=utf-8; format=flowed; delsp=yes To: "Ben Schumacher" References: <9859143f0912142036k3dd0758fmc9cee9b6f2ce4698@mail.gmail.com> Date: Wed, 16 Dec 2009 03:52:35 +0100 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit From: "Emil Smolenski" Message-ID: In-Reply-To: <9859143f0912142036k3dd0758fmc9cee9b6f2ce4698@mail.gmail.com> User-Agent: Opera Mail/10.10 (FreeBSD) Cc: freebsd-fs@freebsd.org, freebsd-questions@freebsd.org Subject: Re: SUIDDIR on ZFS? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Dec 2009 02:52:39 -0000 On Tue, 15 Dec 2009 05:36:55 +0100, Ben Schumacher wrote: > At any rate, I've been considering switching this to a ZFS RAIDZ now > that FreeBSD 8 is released and it seems that folks think it's stable, > but I'm curious if it can provide the SUIDDIR functionality I'm > currently using. Yes, it can. From my point of view it works the same way as on UFS. -- am From owner-freebsd-fs@FreeBSD.ORG Wed Dec 16 05:05:28 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 15E5D1065672 for ; Wed, 16 Dec 2009 05:05:28 +0000 (UTC) (envelope-from jonathan@kc8onw.net) Received: from mail.kc8onw.net (kc8onw.net [206.55.209.81]) by mx1.freebsd.org (Postfix) with ESMTP id CEED78FC0A for ; Wed, 16 Dec 2009 05:05:27 +0000 (UTC) Received: from [10.70.3.199] (c-98-226-147-124.hsd1.in.comcast.net [98.226.147.124]) by mail.kc8onw.net (Postfix) with ESMTPSA id E01FB1E134; Tue, 15 Dec 2009 23:49:30 -0500 (EST) Message-ID: <4B2866A5.3080207@kc8onw.net> Date: Tue, 15 Dec 2009 23:48:37 -0500 From: Jonathan User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.1.5) Gecko/20091204 Thunderbird/3.0 MIME-Version: 1.0 To: ivoras@gmail.com, freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Subject: RE: ZFS, compression, system load, pauses (livelocks?) 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Dec 2009 05:05:28 -0000 I seem to have run into the same problem with "2) long pauses, in what looks like vfs.zfs.txg.timeout second intervals" http://lists.freebsd.org/pipermail/freebsd-fs/2009-December/007343.html In my case 50-100% CPU is used by ZFS with *no* disk activity during the pauses then a burst of rapid disk activity and then another pause. I'm also not running compression on the file system that I am writing to so I don't think it's something specific to compression. Has anyone had any luck finding a solution or are people still just patching around it for now? I dropped vfs.zfs.txg.timeout from 30 to 5 seconds and my throughput is far better, but still sawtoothed. The actual data transfer "teeth" are much closer together but still seem to be spaced at vfs.zfs.txg.timeout intervals. When transferring data I see about 50% of a 1gb link which drops to 0 during the pauses. Based on gstat my disks spend maybe 1/4 of their time busy so I doubt my array is the limiting factor in this situation. I'm running 8-stable r200414 right now and I don't remember having this problem with 8-beta releases so maybe something has changed recently that triggered this? Jonathan Stewart Sorry for the broken threading. I've added freebsd-fs to my subscription list so I will be able to follow the rest of the discussion on the list. From owner-freebsd-fs@FreeBSD.ORG Wed Dec 16 08:59:34 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 72514106568D; Wed, 16 Dec 2009 08:59:34 +0000 (UTC) (envelope-from alexz@visp.ru) Received: from mail.visp.ru (srv1.visp.ru [91.215.204.2]) by mx1.freebsd.org (Postfix) with ESMTP id 266968FC13; Wed, 16 Dec 2009 08:59:33 +0000 (UTC) Received: from 91-215-205-255.static.visp.ru ([91.215.205.255] helo=zagrebin) by mail.visp.ru with esmtp (Exim 4.66 (FreeBSD)) (envelope-from ) id 1NKpjH-0003to-JI; Wed, 16 Dec 2009 11:59:31 +0300 From: "Alexander Zagrebin" To: , Date: Wed, 16 Dec 2009 11:59:31 +0300 Message-ID: MIME-Version: 1.0 Content-Type: text/plain; charset="koi8-r" Content-Transfer-Encoding: 7bit X-Mailer: Microsoft Office Outlook 11 Thread-Index: Acp+LhSOCSJqD7TlTcqPTB6QV0uSXQ== X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2900.5512 Cc: Subject: 8.0-RELEASE: disk IO temporarily hangs up (ZFS or ATA related problem) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Dec 2009 08:59:34 -0000 Hi! I use onboard ICH7 SATA controller with two disks attached: atapci1: port 0x30c8-0x30cf,0x30ec-0x30ef,0x30c0-0x30c7,0x30e8-0x30eb,0x30a0-0x30af irq 19 at device 31.2 on pci0 atapci1: [ITHREAD] ata2: on atapci1 ata2: [ITHREAD] ata3: on atapci1 ata3: [ITHREAD] ad4: 1430799MB at ata2-master SATA150 ad6: 1430799MB at ata3-master SATA150 The disks are used for mirrored ZFS pool. I have noticed that the system periodically locks up on disk operations. After approx. 10 min of very slow disk i/o (several KB/s) the speed of disk operations restores to normal. gstat has shown that the problem is in ad6. 
For example, there is a filtered output of iostat -x 1: extended device statistics device r/s w/s kr/s kw/s wait svc_t %b ad6 818.6 0.0 10840.2 0.0 0 0.4 34 ad6 300.6 642.0 3518.5 24830.3 50 24.8 72 ad6 1.0 639.3 63.7 17118.3 0 62.1 98 ad6 404.5 4.0 6837.7 4.0 0 0.5 18 ad6 504.5 0.0 13667.2 0.0 1 0.7 32 ad6 633.3 0.0 13190.3 0.0 1 0.7 38 ad6 416.3 384.5 8134.7 24606.2 0 16.3 57 ad6 538.9 76.7 9772.8 2982.2 55 2.9 40 ad6 31.9 929.5 801.0 37498.6 0 27.2 82 ad6 635.5 0.0 13087.1 0.0 1 0.6 35 ad6 579.6 0.0 16669.8 0.0 0 0.8 43 ad6 603.6 0.0 11697.4 0.0 1 0.7 40 ad6 538.0 0.0 10438.7 0.0 0 0.9 47 ad6 30.9 898.4 868.6 40585.4 0 36.6 78 ad6 653.3 86.6 8566.6 202.7 1 0.8 40 ad6 737.1 0.0 6429.4 0.0 1 0.6 42 ad6 717.1 0.0 3958.7 0.0 0 0.5 36 ad6 1179.5 0.0 2058.9 0.0 0 0.1 15 ad6 1191.2 0.0 1079.6 0.0 1 0.1 15 ad6 985.1 0.0 5093.9 0.0 0 0.2 23 ad6 761.8 0.0 9801.3 0.0 1 0.4 31 ad6 698.7 0.0 9215.1 0.0 0 0.4 30 ad6 434.2 513.9 5903.1 13658.3 48 10.2 55 ad6 3.0 762.8 191.2 28732.3 0 57.6 99 ad6 10.0 4.0 163.9 4.0 1 1.6 2 Before this line we have a normal operations. Then the behaviour of ad6 changes (pay attention to high average access time and percent of "busy" significantly greater than 100): ad6 0.0 0.0 0.0 0.0 1 0.0 0 ad6 1.0 0.0 0.5 0.0 1 1798.3 179 ad6 1.0 0.0 1.5 0.0 1 1775.4 177 ad6 0.0 0.0 0.0 0.0 1 0.0 0 ad6 10.0 0.0 75.2 0.0 1 180.3 180 ad6 0.0 0.0 0.0 0.0 1 0.0 0 ad6 83.7 0.0 862.9 0.0 1 21.4 179 ad6 0.0 0.0 0.0 0.0 1 0.0 0 ad6 1.0 0.0 63.7 0.0 1 1707.4 170 ad6 1.0 0.0 9.0 0.0 0 1791.0 178 ad6 10.9 0.0 172.2 0.0 2 0.2 0 ad6 24.9 0.0 553.7 0.0 1 143.3 179 ad6 0.0 0.0 0.0 0.0 7 0.0 0 ad6 2.0 23.9 32.4 1529.9 1 336.3 177 ad6 0.0 0.0 0.0 0.0 1 0.0 0 ad6 68.7 0.0 1322.8 0.0 1 26.3 181 ad6 0.0 0.0 0.0 0.0 1 0.0 0 ad6 27.9 0.0 193.7 0.0 1 61.6 172 ad6 1.0 0.0 2.5 0.0 1 1777.4 177 ad6 0.0 0.0 0.0 0.0 1 0.0 0 ad6 1.0 0.0 2.0 0.0 1 1786.9 178 ad6 0.0 0.0 0.0 0.0 1 0.0 0 ad6 2.0 0.0 6.5 0.0 1 899.4 179 ad6 0.0 0.0 0.0 0.0 1 0.0 0 ad6 1.0 0.0 2.0 0.0 1 1786.7 178 ad6 0.0 0.0 0.0 0.0 1 0.0 0 And so on for about 10 minutes. Then the disk i/o is reverted to normal: ad6 139.4 0.0 8860.5 0.0 1 4.4 61 ad6 167.3 0.0 10528.7 0.0 1 3.3 55 ad6 60.8 411.5 3707.6 8574.8 1 19.6 87 ad6 163.4 0.0 10334.9 0.0 1 4.4 72 ad6 157.4 0.0 9770.7 0.0 1 5.0 78 ad6 108.5 0.0 6886.8 0.0 0 3.9 43 ad6 101.6 0.0 6381.4 0.0 0 2.6 27 ad6 109.6 0.0 7013.9 0.0 0 2.0 22 ad6 121.4 0.0 7769.7 0.0 0 2.4 29 ad6 92.5 0.0 5922.6 0.0 1 3.4 31 ad6 122.4 19.9 7833.0 1273.7 0 3.9 54 ad6 83.6 0.0 5349.5 0.0 0 3.9 33 ad6 5.0 0.0 318.4 0.0 0 8.1 4 There are no ata error messages neither in the system log, nor on the console. The manufacture's diagnostic test is passed on ad6 without any errors. The ad6 also contains swap partition. I have tried to run several (10..20) instances of dd, which read and write data from and to the swap partition simultaneously, but it has not called the lockup. So there is a probability that this problem is ZFS related. I have been forced to switch ad6 to the offline state... :( Any suggestions on this problem? 
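If it does turn out to be the drive rather than ZFS, a rough sketch of how one might isolate it (the pool name "tank" is an assumption, since the pool name is not shown above; smartctl comes from the sysutils/smartmontools port):

  zpool status -x                # does ZFS itself consider the pool degraded?
  zpool offline tank ad6         # stop sending I/O to the suspect disk
  smartctl -a /dev/ad6           # check reallocated/pending sector counts and the error log
  zpool online tank ad6          # put it back; ZFS resilvers whatever it missed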
-- Alexander Zagrebin From owner-freebsd-fs@FreeBSD.ORG Wed Dec 16 09:49:33 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 42281106566B; Wed, 16 Dec 2009 09:49:33 +0000 (UTC) (envelope-from borjam@sarenet.es) Received: from proxypop1.sarenet.es (proxypop1.sarenet.es [194.30.0.99]) by mx1.freebsd.org (Postfix) with ESMTP id F0F048FC14; Wed, 16 Dec 2009 09:49:32 +0000 (UTC) Received: from [172.16.1.204] (izaro.sarenet.es [192.148.167.11]) by proxypop1.sarenet.es (Postfix) with ESMTP id 2AC99BF32; Wed, 16 Dec 2009 10:49:30 +0100 (CET) Mime-Version: 1.0 (Apple Message framework v1077) Content-Type: text/plain; charset=us-ascii From: Borja Marcos In-Reply-To: <20091214154750.GF1666@garage.freebsd.pl> Date: Wed, 16 Dec 2009 10:49:29 +0100 Content-Transfer-Encoding: quoted-printable Message-Id: <86C70417-E465-4295-8B58-60EA8887ADA3@sarenet.es> References: <20091029205121.GB3418@garage.freebsd.pl> <9AA2C968-F09D-473D-BD13-F13B3F94ED60@sarenet.es> <20091214154750.GF1666@garage.freebsd.pl> To: Pawel Jakub Dawidek X-Mailer: Apple Mail (2.1077) Cc: freebsd-fs@freebsd.org, Martin Matuska , Ronald Klop Subject: Re: zfs receive gives: internal error: Argument list too long X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Dec 2009 09:49:33 -0000 On Dec 14, 2009, at 4:47 PM, Pawel Jakub Dawidek wrote: > Martin, this is the panic report I was refering to. Could you please = try > to reproduce it? Maybe first with my patch to confirm it is = reproducible > and then with your patch to confirm it has no such problem? > I'd be very grateful if you could do that. I don't want something to = go > into the tree if there might be a problem with the patch. Unable to reproduce it now, but my system is now=20 FreeBSD 8.0-RELEASE-p1 FreeBSD 8.0-RELEASE-p1 #10: Tue Dec 15 11:27:31 = CET 2009 borjam@:/pool/newsrc/obj/pool/newsrc/src/sys/DEBUG amd64 I've applied the patch and done several make buildworlds and make = cleans.=20 Weird, this was one of the virtual machines that crashed often when = doing the zfs send/zfs receive tests. Maybe something was damaged? Borja. 
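For anyone else who wants to try to reproduce this, the workload is basically repeated incremental send/receive cycles; a bare-bones sketch with made-up dataset names:

  # Loop of incremental zfs send | zfs receive cycles (dataset names are placeholders).
  zfs snapshot tank/src@s0
  zfs send tank/src@s0 | zfs receive tank/dst
  i=1
  while [ $i -le 100 ]; do
      zfs snapshot tank/src@s$i
      zfs send -i tank/src@s$((i-1)) tank/src@s$i | zfs receive -F tank/dst
      i=$((i+1))
  done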
From owner-freebsd-fs@FreeBSD.ORG Wed Dec 16 12:04:46 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 1A63E106568D for ; Wed, 16 Dec 2009 12:04:46 +0000 (UTC) (envelope-from james-freebsd-fs2@jrv.org) Received: from mail.jrv.org (rrcs-24-73-246-106.sw.biz.rr.com [24.73.246.106]) by mx1.freebsd.org (Postfix) with ESMTP id B8B358FC1C for ; Wed, 16 Dec 2009 12:04:45 +0000 (UTC) Received: from kremvax.housenet.jrv (kremvax.housenet.jrv [192.168.3.124]) by mail.jrv.org (8.14.3/8.14.3) with ESMTP id nBGC4ejP029537; Wed, 16 Dec 2009 06:04:40 -0600 (CST) (envelope-from james-freebsd-fs2@jrv.org) Authentication-Results: mail.jrv.org; domainkeys=pass (testing) header.from=james-freebsd-fs2@jrv.org DomainKey-Signature: a=rsa-sha1; s=enigma; d=jrv.org; c=nofws; q=dns; h=message-id:date:from:user-agent:mime-version:to:cc:subject: references:in-reply-to:content-type:content-transfer-encoding; b=cCrwQA5cw9RHAp9/TaeP3kSUfVSZSMy2usK97rgxWhwJQYvrU3mqjlqxEwx05mvfe K1LZYnf/IVweHVhBs1rBO6Pej9GOAn7ID70hGbSCvBAKg7UGl0w9LXDUFrxGwNxkse/ Wa8MX/8Ud54wcyNYx2ZZnZ4uJlIjw7SqxyMLj/Q= Message-ID: <4B28CCD8.2090406@jrv.org> Date: Wed, 16 Dec 2009 06:04:40 -0600 From: "James R. Van Artsdalen" User-Agent: Thunderbird 2.0.0.23 (Macintosh/20090812) MIME-Version: 1.0 To: Matt Simerson References: <568624531.20091215163420@pyro.de> <42952D86-6B4D-49A3-8E4F-7A1A53A954C2@spry.com> In-Reply-To: <42952D86-6B4D-49A3-8E4F-7A1A53A954C2@spry.com> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: ZFS RaidZ2 with 24 drives? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Dec 2009 12:04:46 -0000 Results like this indicate a problem with the disk system. There is a problem with the driver, card, cables or disks, etc. The Areca probably has problems with JBOD mode (I was unable to get a different Areca RAID card to work reliably in JBOD mode this spring so I'm not surprised). If these are SATA drives try a card using the Silicon Image 3124 controller and the SIIS driver: for SAS try the LSI 3801e or 3081E-R cards. > $ ssh back02 zpool status > pool: back02 > state: ONLINE > status: One or more devices has experienced an error resulting in data > corruption. Applications may be affected. > action: Restore the file in question if possible. Otherwise restore the > entire pool from backup. 
> see: http://www.sun.com/msg/ZFS-8000-8A > scrub: none requested > config: > > NAME STATE READ WRITE CKSUM > back02 ONLINE 0 0 934K > raidz1 ONLINE 0 0 0 > da0 ONLINE 0 0 0 > da1 ONLINE 0 0 0 > da2 ONLINE 0 0 0 > da3 ONLINE 0 0 0 > da4 ONLINE 0 0 0 > da5 ONLINE 0 0 0 > da6 ONLINE 0 0 0 > raidz1 ONLINE 0 0 1.83M > da8 ONLINE 0 0 0 > da9 ONLINE 0 0 0 > da10 ONLINE 0 0 0 > da11 ONLINE 0 0 0 > da12 ONLINE 0 0 0 > da13 ONLINE 0 0 0 > da14 ONLINE 0 0 0 > raidz1 ONLINE 0 0 1.83M > da16 ONLINE 0 0 0 > spare ONLINE 0 0 0 > da17 ONLINE 0 0 0 > da7 ONLINE 0 0 0 > da18 ONLINE 0 0 0 > da19 ONLINE 0 0 0 > da20 ONLINE 0 0 0 > da21 ONLINE 0 0 0 > da22 ONLINE 0 0 0 > spares > da15 AVAIL > da23 AVAIL > da7 INUSE currently in use > > errors: 241 data errors, use '-v' for a list From owner-freebsd-fs@FreeBSD.ORG Wed Dec 16 12:26:39 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7E10B106568B for ; Wed, 16 Dec 2009 12:26:39 +0000 (UTC) (envelope-from james-freebsd-fs2@jrv.org) Received: from mail.jrv.org (adsl-70-243-84-13.dsl.austtx.swbell.net [70.243.84.13]) by mx1.freebsd.org (Postfix) with ESMTP id 21FEC8FC12 for ; Wed, 16 Dec 2009 12:26:38 +0000 (UTC) Received: from kremvax.housenet.jrv (kremvax.housenet.jrv [192.168.3.124]) by mail.jrv.org (8.14.3/8.14.3) with ESMTP id nBGCQbAq032325; Wed, 16 Dec 2009 06:26:38 -0600 (CST) (envelope-from james-freebsd-fs2@jrv.org) Authentication-Results: mail.jrv.org; domainkeys=pass (testing) header.from=james-freebsd-fs2@jrv.org DomainKey-Signature: a=rsa-sha1; s=enigma; d=jrv.org; c=nofws; q=dns; h=message-id:date:from:user-agent:mime-version:to:cc:subject: references:in-reply-to:content-type:content-transfer-encoding; b=mbREQhyOaSXrLfyiHEXd5qRWJQj6zGm17WH1Shw6Z/mfBuuRNIl4l2PkKJ/gH4d3I gQ16jZYOFncfhAYvr8adey82TNRmw6y//SPjUMGyTVoLf6Hh9L8cyTqZfkMqkHgZtEz GYpTHiYSODzoNjjAOJuG8FYLGwtv18Pe1kgsTTA= Message-ID: <4B28D1FD.4040507@jrv.org> Date: Wed, 16 Dec 2009 06:26:37 -0600 From: "James R. Van Artsdalen" User-Agent: Thunderbird 2.0.0.23 (Macintosh/20090812) MIME-Version: 1.0 To: Borja Marcos References: <20091029205121.GB3418@garage.freebsd.pl> <9AA2C968-F09D-473D-BD13-F13B3F94ED60@sarenet.es> <20091214154750.GF1666@garage.freebsd.pl> <86C70417-E465-4295-8B58-60EA8887ADA3@sarenet.es> In-Reply-To: <86C70417-E465-4295-8B58-60EA8887ADA3@sarenet.es> Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs Subject: Re: zfs receive gives: internal error: Argument list too long X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Dec 2009 12:26:39 -0000 Borja Marcos wrote: > Weird, this was one of the virtual machines that crashed often when doing the zfs send/zfs receive tests. Maybe something was damaged? Using ZFS in a Virtual Machine is a risky business since many VM hosts (VMware and VirtualBox at least) discard any SYNCHRONIZE issued by the guest. VirtualBox has a way to turn SYNCHRONIZE functionality back on. Make sure you know how the VM software handles SYNCHRONIZE or FLUSH before using ZFS there. 
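For VirtualBox, the knob in question should be the per-LUN IgnoreFlush extradata; the VM name and LUN index below are placeholders (AHCI controllers live under .../Devices/ahci/... instead), so please double-check the VirtualBox manual for your setup:

  # Ask VirtualBox to pass guest flush requests through to the host disk.
  VBoxManage setextradata "myguest" \
      "VBoxInternal/Devices/piix3ide/0/LUN#[0]/Config/IgnoreFlush" 0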
From owner-freebsd-fs@FreeBSD.ORG Wed Dec 16 17:55:01 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9134D1065672; Wed, 16 Dec 2009 17:55:01 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 5FE5F8FC25; Wed, 16 Dec 2009 17:55:01 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id nBGHt17w066927; Wed, 16 Dec 2009 17:55:01 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id nBGHt1je066923; Wed, 16 Dec 2009 17:55:01 GMT (envelope-from linimon) Date: Wed, 16 Dec 2009 17:55:01 GMT Message-Id: <200912161755.nBGHt1je066923@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: misc/141685: [zfs] zfs corruption on adaptec 5805 raid controller X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Dec 2009 17:55:01 -0000 Old Synopsis: zfs corruption on adaptec 5805 raid controller New Synopsis: [zfs] zfs corruption on adaptec 5805 raid controller Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Wed Dec 16 17:54:45 UTC 2009 Responsible-Changed-Why: Over to maintainer(s). http://www.freebsd.org/cgi/query-pr.cgi?pr=141685 From owner-freebsd-fs@FreeBSD.ORG Wed Dec 16 20:43:27 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4761C1065670 for ; Wed, 16 Dec 2009 20:43:27 +0000 (UTC) (envelope-from matt@corp.spry.com) Received: from mail-px0-f182.google.com (mail-px0-f182.google.com [209.85.216.182]) by mx1.freebsd.org (Postfix) with ESMTP id 25B2C8FC18 for ; Wed, 16 Dec 2009 20:43:26 +0000 (UTC) Received: by pxi12 with SMTP id 12so956361pxi.3 for ; Wed, 16 Dec 2009 12:43:26 -0800 (PST) Received: by 10.115.81.24 with SMTP id i24mr1024550wal.194.1260996206371; Wed, 16 Dec 2009 12:43:26 -0800 (PST) Received: from mattintosh.spry.com (isaid.donotdelete.com [64.79.222.10]) by mx.google.com with ESMTPS id 22sm937640pzk.10.2009.12.16.12.43.25 (version=TLSv1/SSLv3 cipher=RC4-MD5); Wed, 16 Dec 2009 12:43:25 -0800 (PST) From: Matt Simerson To: Solon Lutz In-Reply-To: <957649379.20091216005253@pyro.de> X-Priority: 3 (Normal) References: <568624531.20091215163420@pyro.de> <42952D86-6B4D-49A3-8E4F-7A1A53A954C2@spry.com> <957649379.20091216005253@pyro.de> Message-Id: <26F8D203-A923-47D3-9935-BE4BC6DA09B7@corp.spry.com> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v936) Date: Wed, 16 Dec 2009 12:43:24 -0800 X-Mailer: Apple Mail (2.936) Cc: freebsd-fs@freebsd.org Subject: Re: ZFS RaidZ2 with 24 drives? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Dec 2009 20:43:27 -0000 On Dec 15, 2009, at 3:52 PM, Solon Lutz wrote: >> I deployed using the two configurations you see above. Both machines >> have a pair of Areca 1231ML RAID controllers with super-sized BBWC >> (battery backed write cache). On back01, each controller presents a >> 12- >> disk RAID-5 array and ZFS concatenates them into the zpool you see >> above. On back02, the RAID controller is configured in JBOD mode and >> disks are pooled as shown. > > Why concatenate them into one pool and give up the redundancy? I didn't need redundant redundancy. > I have the same setup: Areca 24-port RAID6 (24x 500gb) > > NAME STATE READ WRITE CKSUM > temp ONLINE 0 0 24 > da0 ONLINE 0 0 48 > > And it very nearly killed itself after 28 months of flawless duty... > All went fine until 4 drives disconnected themselves from the Areca > due > to faulty SATA-cables. This crashed the Areca in such a way, that I > had > to disconnect the battery module from the controller in order to get > it > initialized during boot-up. > > Cache gone - ZFS unable to mount 10TB pool - scrub failed - I/O errors > > This was three months ago and if I hadn't found an extremly skilled > person > who was able to manually find and distinguish between good and > corrupted > meta-data sets, replicate them in their proper spots and zero out > corrupt > transaction ids - I would have lost 10TB of data. (No backups - to > expensive) > > Why do you use JBOD? You can configure a passthrough for all drives, > explicitly degrading the Areca to a dumb sata controller... Why would I bother? Both ways present each disk to FreeBSD. Based on my understanding (and an answer received from Areca support), the only reason I'd bother manually configuring some disks for passthrough is if I wanted to use some disks in a RAID array and others as raw disks. Configuring JBOD mode configures ALL the disks on the controller as passthrough devices. 
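As an aside, for the layout the subject line asks about, the usual shape is several narrower raidz2 vdevs rather than one 24-disk-wide one; a sketch with made-up pool and device names:

  # Three 8-disk raidz2 vdevs in one pool (pool and device names are placeholders).
  zpool create back03 \
      raidz2 da0  da1  da2  da3  da4  da5  da6  da7  \
      raidz2 da8  da9  da10 da11 da12 da13 da14 da15 \
      raidz2 da16 da17 da18 da19 da20 da21 da22 da23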
Matt From owner-freebsd-fs@FreeBSD.ORG Wed Dec 16 21:20:28 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8C784106568F for ; Wed, 16 Dec 2009 21:20:28 +0000 (UTC) (envelope-from wonslung@gmail.com) Received: from mail-ew0-f226.google.com (mail-ew0-f226.google.com [209.85.219.226]) by mx1.freebsd.org (Postfix) with ESMTP id 16D628FC12 for ; Wed, 16 Dec 2009 21:20:27 +0000 (UTC) Received: by ewy26 with SMTP id 26so518723ewy.3 for ; Wed, 16 Dec 2009 13:20:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:cc:content-type; bh=8H652RzzJPYx0cqmvYUsJ+XrFgvBAzIeVbFdRI0Q/T8=; b=BcduygEAxkjRYeCHhZWEZTIKc7xX0dJ6kyLrcmUt5ykZ3ZYha27V6by3uKssiRxjTg Xv7PZL0HCUu32gibrXJ93mJmc3bA/sQdfA6xzH5PnWoWXL9grcbD3YlofdttqqpuohQ7 zX1qxOwWpZF+lfI+OfWPFzwo/4NBuf0//ISsM= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=TjVkp1qc05ldoUS2ICS2kC2lnfi8agB4U3lTuDeVQDKsBropMDY2TuabQxVRbqPprF KFn1HL4FzvCQr3adsWOGZ0ls7yFXTgZ7OLl1q7Z1TuFyCrTB7dB5ZPbwpcq2yt22urhb i8IlhTZG7mouY54ZbKRAcwVtwtG2EtNdr5sfE= MIME-Version: 1.0 Received: by 10.216.90.13 with SMTP id d13mr565105wef.130.1260998426964; Wed, 16 Dec 2009 13:20:26 -0800 (PST) In-Reply-To: <26F8D203-A923-47D3-9935-BE4BC6DA09B7@corp.spry.com> References: <568624531.20091215163420@pyro.de> <42952D86-6B4D-49A3-8E4F-7A1A53A954C2@spry.com> <957649379.20091216005253@pyro.de> <26F8D203-A923-47D3-9935-BE4BC6DA09B7@corp.spry.com> Date: Wed, 16 Dec 2009 16:20:26 -0500 Message-ID: From: Thomas Burgess To: Matt Simerson Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: ZFS RaidZ2 with 24 drives? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Dec 2009 21:20:28 -0000 On Wed, Dec 16, 2009 at 3:43 PM, Matt Simerson wrote: > > On Dec 15, 2009, at 3:52 PM, Solon Lutz wrote: > > I deployed using the two configurations you see above. Both machines >>> have a pair of Areca 1231ML RAID controllers with super-sized BBWC >>> (battery backed write cache). On back01, each controller presents a 12- >>> disk RAID-5 array and ZFS concatenates them into the zpool you see >>> above. On back02, the RAID controller is configured in JBOD mode and >>> disks are pooled as shown. >>> >> >> Why concatenate them into one pool and give up the redundancy? >> > > I didn't need redundant redundancy. > > > I have the same setup: Areca 24-port RAID6 (24x 500gb) >> >> NAME STATE READ WRITE CKSUM >> temp ONLINE 0 0 24 >> da0 ONLINE 0 0 48 >> >> And it very nearly killed itself after 28 months of flawless duty... >> All went fine until 4 drives disconnected themselves from the Areca due >> to faulty SATA-cables. This crashed the Areca in such a way, that I had >> to disconnect the battery module from the controller in order to get it >> initialized during boot-up. 
>> >> Cache gone - ZFS unable to mount 10TB pool - scrub failed - I/O errors >> >> This was three months ago and if I hadn't found an extremly skilled person >> who was able to manually find and distinguish between good and corrupted >> meta-data sets, replicate them in their proper spots and zero out corrupt >> transaction ids - I would have lost 10TB of data. (No backups - to >> expensive) >> >> Why do you use JBOD? You can configure a passthrough for all drives, >> explicitly degrading the Areca to a dumb sata controller... >> > > Why would I bother? Both ways present each disk to FreeBSD. Based on my > understanding (and an answer received from Areca support), the only reason > I'd bother manually configuring some disks for passthrough is if I wanted to > use some disks in a RAID array and others as raw disks. Configuring JBOD > mode configures ALL the disks on the controller as passthrough devices. > > I think the main reason is that ZFS is better when it has raw drives. Some of the features of ZFS don't work as well without having access to the drives in this way, and other features don't work at all. In general, it's always best to let ZFS handle the raid stuff and not use the hardware raid settings. > Matt > > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@FreeBSD.ORG Wed Dec 16 21:46:31 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 173781065672; Wed, 16 Dec 2009 21:46:31 +0000 (UTC) (envelope-from pjd@garage.freebsd.pl) Received: from mail.garage.freebsd.pl (chello089077043238.chello.pl [89.77.43.238]) by mx1.freebsd.org (Postfix) with ESMTP id 4F09C8FC13; Wed, 16 Dec 2009 21:46:29 +0000 (UTC) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id B63BB45EAE; Wed, 16 Dec 2009 22:46:26 +0100 (CET) Received: from localhost (chello089077043238.chello.pl [89.77.43.238]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 372BC4569A; Wed, 16 Dec 2009 22:46:20 +0100 (CET) Date: Wed, 16 Dec 2009 22:46:21 +0100 From: Pawel Jakub Dawidek To: Martin Matuska Message-ID: <20091216214621.GA4217@garage.freebsd.pl> References: <20091029205121.GB3418@garage.freebsd.pl> <9AA2C968-F09D-473D-BD13-F13B3F94ED60@sarenet.es> <20091214154750.GF1666@garage.freebsd.pl> <495F94EF-8F57-440D-8810-F40E40DE69D5@sarenet.es> <4B26B08E.5000203@FreeBSD.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="3MwIy2ne0vdjdPXF" Content-Disposition: inline In-Reply-To: <4B26B08E.5000203@FreeBSD.org> User-Agent: Mutt/1.4.2.3i X-PGP-Key-URL: http://people.freebsd.org/~pjd/pjd.asc X-OS: FreeBSD 9.0-CURRENT i386 X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-0.6 required=4.5 tests=BAYES_00,RCVD_IN_SORBS_DUL autolearn=no version=3.0.4 Cc: freebsd-fs@freebsd.org, Ronald Klop Subject: Re: zfs receive gives: internal error: Argument list too long X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Dec 2009 21:46:31 -0000 
--3MwIy2ne0vdjdPXF Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Dec 14, 2009 at 10:39:26PM +0100, Martin Matuska wrote: > I was unable to reproduce the panic (with 8.0-RELEASE-p1 + Pawel's patch > or with my patch). >=20 > I can split my patch into two Opensolaris changesets - 8986, that is > exactly pjd's patch. The other changeset is 7994. > BUG ID 6764159: restore_object() makes a call that can block while > having a tx open but not yet committed. >=20 > So to make life easier, I have split this and use 2 patches (that make > together my old patch) > a) 6764159_restore_blocking.patch > b) zfs_recv_E2BIG.patch >=20 > I have also encountered a problem with recursive zfs snapshots of > previsously transferred datasets. > On many of my systems, zfs snapshot -r tank@xyz just did not work with > the following error: zfs snapshot -r failed because filesystem was busy >=20 > Patch links: > http://mfsbsd.vx.sk/patches/6764159_restore_blocking.patch > http://mfsbsd.vx.sk/patches/6462803_zfs_snapshot_busy.patch > http://people.freebsd.org/~pjd/patches/zfs_recv_E2BIG.patch >=20 > Related OpenSolaris links: > http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=3D6462803 (zfs > snapshot busy) > http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=3D6764159 > (restore_object blocking) > http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=3D6801979 (zfs > receive E2BIG) >=20 > I am running all three patches on about 30-40 servers with 8 CPU cores, > amd64 and intensive zfs snapshot -r, intense zfs send/receive operations > for several days. > No panics or other problems by now. Martin, please go ahead and commit your patch. Thank you for looking into this! --=20 Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! 
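For anyone who wants to test the patches before the merge lands, applying them is roughly the following (whether -p0 or -p1 is needed depends on how each diff was generated, so treat this as a sketch):

  cd /usr/src
  fetch http://mfsbsd.vx.sk/patches/6764159_restore_blocking.patch
  fetch http://mfsbsd.vx.sk/patches/6462803_zfs_snapshot_busy.patch
  fetch http://people.freebsd.org/~pjd/patches/zfs_recv_E2BIG.patch
  patch -p0 < 6764159_restore_blocking.patch
  patch -p0 < 6462803_zfs_snapshot_busy.patch
  patch -p0 < zfs_recv_E2BIG.patch
  # then rebuild the kernel, or at least the zfs and opensolaris modules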
--3MwIy2ne0vdjdPXF Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.4 (FreeBSD) iD8DBQFLKVUsForvXbEpPzQRAmyRAJ9yPodPvgok3czhFnH/9BEpMEOkuQCcD3l9 n6O7GZtTK8LT/OkiUcWskek= =OOXY -----END PGP SIGNATURE----- --3MwIy2ne0vdjdPXF-- From owner-freebsd-fs@FreeBSD.ORG Wed Dec 16 23:19:42 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8FF041065695 for ; Wed, 16 Dec 2009 23:19:42 +0000 (UTC) (envelope-from matt@corp.spry.com) Received: from fg-out-1718.google.com (fg-out-1718.google.com [72.14.220.156]) by mx1.freebsd.org (Postfix) with ESMTP id 1B5298FC0C for ; Wed, 16 Dec 2009 23:19:41 +0000 (UTC) Received: by fg-out-1718.google.com with SMTP id 16so971299fgg.13 for ; Wed, 16 Dec 2009 15:19:41 -0800 (PST) Received: by 10.102.214.22 with SMTP id m22mr818785mug.54.1261005580861; Wed, 16 Dec 2009 15:19:40 -0800 (PST) Received: from mattintosh.spry.com (isaid.donotdelete.com [64.79.222.10]) by mx.google.com with ESMTPS id 14sm3742880muo.34.2009.12.16.15.19.38 (version=TLSv1/SSLv3 cipher=RC4-MD5); Wed, 16 Dec 2009 15:19:39 -0800 (PST) Message-Id: From: Matt Simerson To: freebsd-fs@freebsd.org In-Reply-To: Mime-Version: 1.0 (Apple Message framework v936) Date: Wed, 16 Dec 2009 15:19:35 -0800 References: <568624531.20091215163420@pyro.de> <42952D86-6B4D-49A3-8E4F-7A1A53A954C2@spry.com> <957649379.20091216005253@pyro.de> <26F8D203-A923-47D3-9935-BE4BC6DA09B7@corp.spry.com> X-Mailer: Apple Mail (2.936) Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Subject: Re: ZFS RaidZ2 with 24 drives? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 16 Dec 2009 23:19:42 -0000 On Dec 16, 2009, at 1:20 PM, Thomas Burgess wrote: > On Wed, Dec 16, 2009 at 3:43 PM, Matt Simerson > wrote: > > On Dec 15, 2009, at 3:52 PM, Solon Lutz wrote: > > Why do you use JBOD? You can configure a passthrough for all drives, > explicitly degrading the Areca to a dumb sata controller... > > Why would I bother? Both ways present each disk to FreeBSD. Based > on my understanding (and an answer received from Areca support), the > only reason I'd bother manually configuring some disks for > passthrough is if I wanted to use some disks in a RAID array and > others as raw disks. Configuring JBOD mode configures ALL the disks > on the controller as passthrough devices. > > I think the main reason is that ZFS is better when it has raw drives. I've heard that numerous times. Perhaps it is true in some cases. Such as when using a RAID controller from 1999. Or a $30 RAID adapter. I've built several ZFS systems using on-board SATA/SAS controllers, a couple of 24-disk systems with the Marvell SATA controllers used in the Sun x4500, and three 24-disk systems using the Areca 1231ML. Using the Areca as a hardware RAID controller with RAID volumes has proven to perform better and be much more reliable than when using raw disks. > Some of the features of ZFS don't work as well without having access > to the drives in this way, and other features don't work at all. 
The last time I compared the performance of ZFS using dumb Marvell SATA controllers versus the Areca with RAID, them features you speak of weren't worth the bits used to say them. On the two systems I described in this thread, the one using RAID significantly outperforms the one configured as JBOD. And in the case of the Areca, JBOD = passthrough = raw disks. > In general, it's always best to let ZFS handle the raid stuff and > not use the hardware raid settings. Because you said so? I'd like to see some evidence to back that statement up. The only time I've seen better ZFS performance numbers than what I'm getting with FreeBSD 8 ZFS + Areca RAID6 is when I tested OpenSolaris with them Marvell SATA controllers. But that was in August of 2008, and ZFS on FreeBSD performs much better now. Some updated benchmarks would be welcome. Matt From owner-freebsd-fs@FreeBSD.ORG Thu Dec 17 00:05:31 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5808F1065694 for ; Thu, 17 Dec 2009 00:05:31 +0000 (UTC) (envelope-from wonslung@gmail.com) Received: from mail-ew0-f226.google.com (mail-ew0-f226.google.com [209.85.219.226]) by mx1.freebsd.org (Postfix) with ESMTP id D8AF18FC1A for ; Thu, 17 Dec 2009 00:05:30 +0000 (UTC) Received: by ewy26 with SMTP id 26so671801ewy.3 for ; Wed, 16 Dec 2009 16:05:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:cc:content-type; bh=Vftpmn7hRQwa/9n2iUL35gsXc2LQc5NkFipny+XsV1Y=; b=bZu1d9lOAL4zI6SYuk6R8PrRWGQS5BhdcDAY04XcG79xf9vwsxQYHuYnZvDi4DuzLz JEHsV31AcTqXrFV2Hq8jaaesbHDkXnUF8TxTxZK09cCm6+MqYpKlz4klMN3pd3bmfeNM QXnezR2nSvRZ/eYLwCHUMCdHclK3uJn3hww5s= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=ZSG6JiViO1/j8fN5q2/+Qx+oThG2Y9Vf5PGEj1lG0I9r4y9TcD4wvZ7A3kq7ACP7n2 PYkv4K2sVuHWXG+gczFxFMgyBkSq/6NHULXgGzUfMEEeJOP0Yu+XINQX8NVu7B5vNmFH O67Ff4NUjUXK79O0i9mk72mWSKpArro5iagJk= MIME-Version: 1.0 Received: by 10.216.88.212 with SMTP id a62mr648130wef.72.1261008329707; Wed, 16 Dec 2009 16:05:29 -0800 (PST) In-Reply-To: References: <568624531.20091215163420@pyro.de> <42952D86-6B4D-49A3-8E4F-7A1A53A954C2@spry.com> <957649379.20091216005253@pyro.de> <26F8D203-A923-47D3-9935-BE4BC6DA09B7@corp.spry.com> Date: Wed, 16 Dec 2009 19:05:29 -0500 Message-ID: From: Thomas Burgess To: Matt Simerson Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: ZFS RaidZ2 with 24 drives? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 17 Dec 2009 00:05:31 -0000 On Wed, Dec 16, 2009 at 6:19 PM, Matt Simerson wrote: > > On Dec 16, 2009, at 1:20 PM, Thomas Burgess wrote: > > On Wed, Dec 16, 2009 at 3:43 PM, Matt Simerson >> wrote: >> >> On Dec 15, 2009, at 3:52 PM, Solon Lutz wrote: >> >> Why do you use JBOD? You can configure a passthrough for all drives, >> explicitly degrading the Areca to a dumb sata controller... >> >> Why would I bother? Both ways present each disk to FreeBSD. 
Based on my >> understanding (and an answer received from Areca support), the only reason >> I'd bother manually configuring some disks for passthrough is if I wanted to >> use some disks in a RAID array and others as raw disks. Configuring JBOD >> mode configures ALL the disks on the controller as passthrough devices. >> >> I think the main reason is that ZFS is better when it has raw drives. >> > > I've heard that numerous times. Perhaps it is true in some cases. Such as > when using a RAID controller from 1999. Or a $30 RAID adapter. > > I've built several ZFS systems using on-board SATA/SAS controllers, a > couple of 24-disk systems with the Marvell SATA controllers used in the Sun > x4500, and three 24-disk systems using the Areca 1231ML. Using the Areca as > a hardware RAID controller with RAID volumes has proven to perform better > and be much more reliable than when using raw disks. > > This is true regardless. ZFS is, by design, a software raid system. The performance of ZFS comes from CPU and ram, not from expensive hardware raid cards. The entire POINT of using ZFS is to get great performance and data integrity with commodity hardware. I'm not saying ZFS doesn't work with hardware raid, it does. ZFS's redundancy and self healing features are designed for raw drives. If you trust your hardware, then by all means, use it. I'll stick to using ZFS the way it was intended by the developers. > > Some of the features of ZFS don't work as well without having access to >> the drives in this way, and other features don't work at all. >> > > The last time I compared the performance of ZFS using dumb Marvell SATA > controllers versus the Areca with RAID, them features you speak of weren't > worth the bits used to say them. On the two systems I described in this > thread, the one using RAID significantly outperforms the one configured as > JBOD. And in the case of the Areca, JBOD = passthrough = raw disks. > > So data integrity isnt' worth the bits? As far as your systems performance goes, that's great. I wasn't arguing with you about that. I was just pointing out that the guy who posted before me probably meant that ZFS performs better with raw devices. > > In general, it's always best to let ZFS handle the raid stuff and not use >> the hardware raid settings. >> > > Because you said so? > > I'd like to see some evidence to back that statement up. The only time I've > seen better ZFS performance numbers than what I'm getting with FreeBSD 8 ZFS > + Areca RAID6 is when I tested OpenSolaris with them Marvell SATA > controllers. But that was in August of 2008, and ZFS on FreeBSD performs > much better now. Some updated benchmarks would be welcome. > > > i'm not asking you to take my word for it, i'm just telling you what is common knowledge among ZFS users and developers. 
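A concrete way to see the difference on either setup is a scrub, since repairs can only happen where ZFS itself holds the redundancy (back02 below is the pool shown earlier in the thread; substitute your own):

  zpool scrub back02
  zpool status -v back02    # watch the CKSUM column and the final "errors:" line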
> Matt > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > From owner-freebsd-fs@FreeBSD.ORG Thu Dec 17 02:52:32 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4A6B4106566C for ; Thu, 17 Dec 2009 02:52:31 +0000 (UTC) (envelope-from james-freebsd-fs2@jrv.org) Received: from mail.jrv.org (adsl-70-243-84-13.dsl.austtx.swbell.net [70.243.84.13]) by mx1.freebsd.org (Postfix) with ESMTP id 983A48FC13 for ; Thu, 17 Dec 2009 02:52:30 +0000 (UTC) Received: from kremvax.housenet.jrv (kremvax.housenet.jrv [192.168.3.124]) by mail.jrv.org (8.14.3/8.14.3) with ESMTP id nBH2qQhZ049976; Wed, 16 Dec 2009 20:52:26 -0600 (CST) (envelope-from james-freebsd-fs2@jrv.org) Authentication-Results: mail.jrv.org; domainkeys=pass (testing) header.from=james-freebsd-fs2@jrv.org DomainKey-Signature: a=rsa-sha1; s=enigma; d=jrv.org; c=nofws; q=dns; h=message-id:date:from:user-agent:mime-version:to:cc:subject: references:in-reply-to:content-type:content-transfer-encoding; b=R/Ln/BPr+KRmc7YzCMOJZNPWevNxuuPNC5Ud3z+wEjXutveG+6p1JFEEiFEvzr8D4 8O+9WWes2tJrPWi/RIF9GxK9fmJgpXaupRGPJN+UAg3rXc6fRiaI5hOHHEOW+5lI2Yz uxH8cuKnN4Xm+bKxFSQ6l0LeVO8nt+qqrInIRTE= Message-ID: <4B299CEA.3070705@jrv.org> Date: Wed, 16 Dec 2009 20:52:26 -0600 From: "James R. Van Artsdalen" User-Agent: Thunderbird 2.0.0.23 (Macintosh/20090812) MIME-Version: 1.0 To: Matt Simerson References: <568624531.20091215163420@pyro.de> <42952D86-6B4D-49A3-8E4F-7A1A53A954C2@spry.com> <957649379.20091216005253@pyro.de> <26F8D203-A923-47D3-9935-BE4BC6DA09B7@corp.spry.com> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-2 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: ZFS RaidZ2 with 24 drives? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 17 Dec 2009 02:52:32 -0000 Matt Simerson wrote: >> In general, it's always best to let ZFS handle the raid stuff and not >> use the hardware raid settings. > > I'd like to see some evidence to back that statement up. The only time > I've seen better ZFS performance numbers than what I'm getting with > FreeBSD 8 ZFS + Areca RAID6 is when I tested OpenSolaris with them > Marvell SATA controllers. But that was in August of 2008, and ZFS on > FreeBSD performs much better now. Some updated benchmarks would be > welcome. You're fixating on throughput. Moreover, Marvell is the wrong controller for FreeBSD 8: use the SIIS driver with a controller based on the Silicon Image 3124. Doing redundancy in ZFS rather than the controller has two obvious advantages: 1. ZFS has access the filesystem checksums, signatures and txtags whereas RAID controllers do not. ZFS can always tell when data recovery is needed, which copy of the data is correct in a mirror, and which disk in a striped-parity set needs reconstruction. And ZFS can tell if no recovery was possible. A RAID controller can't always do this right. 2. More important, by letting ZFS handle RAID you can spread the risk across controllers and drivers as well as disks. For example, my home pool is an array of MIRRORs, each MIRROR carefully arranged so that if one component is a Seagate disk, the other is *not* a Seagate. 
For every mirror the two disks are in different enclosures (using different power supplies), each enclosure set connected to a different host controller. And I'm about to put half of the disks in SAS enclosures connected to LSI 3801e controllers using the MPT driver: once done I'll be fully redundant with respect to disk drives, enclosures & disk power supplies, cables, controllers and drivers. From owner-freebsd-fs@FreeBSD.ORG Thu Dec 17 03:07:08 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4DD8F1065672 for ; Thu, 17 Dec 2009 03:07:08 +0000 (UTC) (envelope-from areilly@bigpond.net.au) Received: from nschwmtas02p.mx.bigpond.com (nschwmtas02p.mx.bigpond.com [61.9.189.140]) by mx1.freebsd.org (Postfix) with ESMTP id CCAAE8FC0A for ; Thu, 17 Dec 2009 03:07:07 +0000 (UTC) Received: from nschwotgx03p.mx.bigpond.com ([124.188.161.100]) by nschwmtas02p.mx.bigpond.com with ESMTP id <20091217030705.VIGU2264.nschwmtas02p.mx.bigpond.com@nschwotgx03p.mx.bigpond.com>; Thu, 17 Dec 2009 03:07:05 +0000 Received: from duncan.reilly.home ([124.188.161.100]) by nschwotgx03p.mx.bigpond.com with ESMTP id <20091217030705.VNRF5111.nschwotgx03p.mx.bigpond.com@duncan.reilly.home>; Thu, 17 Dec 2009 03:07:05 +0000 Date: Thu, 17 Dec 2009 14:07:05 +1100 From: Andrew Reilly To: David N Message-ID: <20091217030705.GA20153@duncan.reilly.home> References: <20091208224710.GA97620@duncan.reilly.home> <228D9370-4967-4C47-9746-8475DCD4FA27@hmallett.co.uk> <20091215221727.GA8137@duncan.reilly.home> <4d7dd86f0912161725m6278c843xba275038c6a80d59@mail.gmail.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4d7dd86f0912161725m6278c843xba275038c6a80d59@mail.gmail.com> User-Agent: Mutt/1.4.2.3i X-Authentication-Info: Submitted using SMTP AUTH LOGIN at nschwotgx03p.mx.bigpond.com from [124.188.161.100] using ID areilly@bigpond.net.au at Thu, 17 Dec 2009 03:07:05 +0000 X-RPD-ScanID: Class unknown; VirusThreatLevel unknown, RefID str=0001.0A150202.4B29A059.00E1,ss=1,fgs=0 X-SIH-MSG-ID: rB4xGdb6TAD0zmQs0WyzOwJxyArnqyN48Z4QX81loRIGTUDCp8DeQ9rHK+ZRtdu1xD9LJhqGNGEnaazhTY3RstCK Cc: freebsd-fs@freebsd.org Subject: Re: On gjournal vs unexpected shutdown (-->fsck) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 17 Dec 2009 03:07:08 -0000 On Thu, Dec 17, 2009 at 12:25:00PM +1100, David N wrote: > 2009/12/16 Andrew Reilly : > > On Tue, Dec 15, 2009 at 09:49:56PM +0000, Hywel Mallett wrote: > >> > >> On 8 Dec 2009, at 22:47, Andrew Reilly wrote: > Do you have soft updates enabled? No. > can you show us a print out of > > tunefs -p /dev/.....journal Sure: (I hadn't seen the -p option before: neat!) duncan [202]$ tunefs -p /dev/ad10.journal tunefs: ACLs: (-a) disabled tunefs: MAC multilabel: (-l) disabled tunefs: soft updates: (-n) disabled tunefs: gjournal: (-J) enabled tunefs: maximum blocks per file in a cylinder group: (-e) 2048 tunefs: average file size: (-f) 16384 tunefs: average number of files in a directory: (-s) 64 tunefs: minimum percentage of free space: (-m) 8% tunefs: optimization preference: (-o) time tunefs: volume label: (-L) I have a suspicion that what happened was probably mostly a misunderstanding on my part about how and when the journal playback is initiated. 
Memory is getting dim at this point: the power outage that brought this issue up was a week or so ago. It seems plausible that since my other (non-journalled) drives were dirty too, I just ran fsck on everything. I expected fsck on the ad10.journal drive to just say "hey this is clean", but it went and did a full, slow, check. But that happens when you run fsck on clean non-journalled drives too, I think, so shouldn't have surprised me. I guess the surprise is that the system claimed that the journalled drive was dirty at all. Maybe it didn't: it's hard to remember now. I'll have to pay more attention the next time there's a power outage (something (not yet identified) is tickling the earth leakage circuit every so often: very annoying.) Cheers, -- Andrew From owner-freebsd-fs@FreeBSD.ORG Thu Dec 17 03:08:42 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 88B64106568D for ; Thu, 17 Dec 2009 03:08:42 +0000 (UTC) (envelope-from wonslung@gmail.com) Received: from mail-ew0-f226.google.com (mail-ew0-f226.google.com [209.85.219.226]) by mx1.freebsd.org (Postfix) with ESMTP id 12CBE8FC12 for ; Thu, 17 Dec 2009 03:08:41 +0000 (UTC) Received: by ewy26 with SMTP id 26so786176ewy.3 for ; Wed, 16 Dec 2009 19:08:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:cc:content-type; bh=SF1ee409GeSVVHY01xNL1bVAcBSdu7qK9ldz9/owVDc=; b=d3H+3loJZHmQ8l/bjXO4kVWGfRKv7XGumoNUyUGnKXug99qqQxZvxoRt+0pgs9wcGB ys4AgYxCYS3JtJDLbnBaTR47sPYLOSWxPncyQDpaAyTv3k12x/dxg79hob3sZghZjsvU PhZ2zC8fzan/7mov0OI7Uw7hZgCEi37h1aKD8= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=DCnrE5s94k5PpIUsMA8fdiUoLJH8162mBiAfRMYimU5DSd5cnH0rRJto+o+NHIjpH2 nlxDbzrK5teAOw+nzfqOT7P6Fdygw4L2lyu8HrcSGW6KXREHsjH+GB2CP5hSZg5Wops+ NimtzhXptEhyH3rmGAWY5uildjRSJX+z0TgnI= MIME-Version: 1.0 Received: by 10.216.86.144 with SMTP id w16mr699400wee.59.1261019321043; Wed, 16 Dec 2009 19:08:41 -0800 (PST) In-Reply-To: <4B299CEA.3070705@jrv.org> References: <568624531.20091215163420@pyro.de> <42952D86-6B4D-49A3-8E4F-7A1A53A954C2@spry.com> <957649379.20091216005253@pyro.de> <26F8D203-A923-47D3-9935-BE4BC6DA09B7@corp.spry.com> <4B299CEA.3070705@jrv.org> Date: Wed, 16 Dec 2009 22:08:41 -0500 Message-ID: From: Thomas Burgess To: "James R. Van Artsdalen" Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: ZFS RaidZ2 with 24 drives? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 17 Dec 2009 03:08:42 -0000 2009/12/16 James R. Van Artsdalen > Matt Simerson wrote: > > >> In general, it's always best to let ZFS handle the raid stuff and not > >> use the hardware raid settings. > > > > I'd like to see some evidence to back that statement up. The only time > > I've seen better ZFS performance numbers than what I'm getting with > > FreeBSD 8 ZFS + Areca RAID6 is when I tested OpenSolaris with them > > Marvell SATA controllers. But that was in August of 2008, and ZFS on > > FreeBSD performs much better now. Some updated benchmarks would be > > welcome. > > You're fixating on throughput. 
Moreover, Marvell is the wrong > controller for FreeBSD 8: use the SIIS driver with a controller based on > the Silicon Image 3124. > > Doing redundancy in ZFS rather than the controller has two obvious > advantages: > > 1. ZFS has access the filesystem checksums, signatures and txtags > whereas RAID controllers do not. ZFS can always tell when data recovery > is needed, which copy of the data is correct in a mirror, and which disk > in a striped-parity set needs reconstruction. And ZFS can tell if no > recovery was possible. A RAID controller can't always do this right. > > 2. More important, by letting ZFS handle RAID you can spread the risk > across controllers and drivers as well as disks. > > For example, my home pool is an array of MIRRORs, each MIRROR carefully > arranged so that if one component is a Seagate disk, the other is *not* > a Seagate. For every mirror the two disks are in different enclosures > (using different power supplies), each enclosure set connected to a > different host controller. And I'm about to put half of the disks in > SAS enclosures connected to LSI 3801e controllers using the MPT driver: > once done I'll be fully redundant with respect to disk drives, > enclosures & disk power supplies, cables, controllers and drivers. > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > > exactly. One thing most people don't know about hard drives in general is that sometimes up to 30% of the space is actually ECC. With software raid systems like ZFS, this will eventually be something that we can take advantage of. The ZFS checksums (only?) work best when you give it raw drives. Because of this, you can imagine a scenario where allowing ZFS to use this ECC space as raw storage, while leaving the data correction to ZFS, would be ideal. It's not only a matter of space; it will also lead to nice improvements in speed (more data can be read/written by the head as it passes). Also, ZFS can take advantage of the drive's write cache in ways that a hardware raid controller cannot. I'm not saying hardware raid doesn't perform well, it does. It's just a waste of money when you're using ZFS.
Your money is MUCH better spent on bigger drives, more ram and maybe an SSD or two From owner-freebsd-fs@FreeBSD.ORG Thu Dec 17 06:37:32 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 479B6106566C; Thu, 17 Dec 2009 06:37:32 +0000 (UTC) (envelope-from benschumacher@gmail.com) Received: from mail-pw0-f44.google.com (mail-pw0-f44.google.com [209.85.160.44]) by mx1.freebsd.org (Postfix) with ESMTP id 15B888FC13; Thu, 17 Dec 2009 06:37:32 +0000 (UTC) Received: by pwi15 with SMTP id 15so1222810pwi.3 for ; Wed, 16 Dec 2009 22:37:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:message-id:subject:from:to:cc :content-type:content-transfer-encoding; bh=1E0FpkDu5ihtI7o+fX4bGcKNHSrDs7doNQjgzYsD/YQ=; b=WR2hnPEMYvGdjTDFnFzkdn3UwIjXYamaH0GyNwD47GkycikUZHIrR6YG1JWZIXnGUM 0sdplNVcf+ZauVf8+4SaKuHUvCPEm2fVxyRO77wGu0vX78i1xXcH6khPrkH04JGrYKXp MKjDEEuE2Au0ZQbIvofEKVESLiSiKugsAzc/M= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=mHKPeBbqIIrNlJX9dxQFUHVR0nj3HYELaiUDgyYg+nI5BNiMnO4SkaXCM623lWu3MQ r7kpGKTRJBoRi4XEG/WwGOgK82463d7CNFro8MT2EUamOLzGXmFY8fD6vAjtp62GT4WY QNYpLhq/gD1mAsONTRN2q7UgdMEl6H3zPhGHw= MIME-Version: 1.0 Sender: benschumacher@gmail.com Received: by 10.142.1.24 with SMTP id 24mr1394430wfa.108.1261031851627; Wed, 16 Dec 2009 22:37:31 -0800 (PST) In-Reply-To: References: <9859143f0912142036k3dd0758fmc9cee9b6f2ce4698@mail.gmail.com> Date: Wed, 16 Dec 2009 23:37:31 -0700 X-Google-Sender-Auth: f9171b267f75b27a Message-ID: <9859143f0912162237q50fe147ej428905abf63c61b@mail.gmail.com> From: Ben Schumacher To: Emil Smolenski Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org, freebsd-questions@freebsd.org Subject: Re: SUIDDIR on ZFS? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 17 Dec 2009 06:37:32 -0000 On Tue, Dec 15, 2009 at 7:52 PM, Emil Smolenski wrote: > On Tue, 15 Dec 2009 05:36:55 +0100, Ben Schumacher > wrote: > >> At any rate, I've been considering switching this to a ZFS RAIDZ now >> that FreeBSD 8 is released and it seems that folks think it's stable, >> but I'm curious if it can provide the SUIDDIR functionality I'm >> currently using. > > =C2=A0Yes, it can. From my point of view it works the same way as on UFS. Emil- Thanks for your response... I don't know that that's quite right. SUIDDIR has to be enabled in the kernel as an option to enable the functionality on UFS and my tests on ZFS haven't proved fruitful: $ sudo zfs create zroot/shared $ sudo zfs umount zroot/shared $ sudo zfs mount -o suiddir zroot/shared $ sudo chown sats:office /zroot/shared $ sudo chmod 4770 /zroot/shared $ touch /zroot/shared/file $ ls -al /zroot/shared/ total 4 drwsrwx--- 2 sats office 3 Dec 16 23:26 ./ drwxr-xr-x 4 root wheel 9 Dec 16 23:26 ../ -rw-r--r-- 1 ben office 0 Dec 16 23:26 file With a drive mounted with the 'suiddir' option (and the kernel option enabled) the above works with UFS. I was curious if maybe there was an option in 'zfs set' that I was missing. 
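For comparison, the UFS setup being described needs both the kernel option and the mount flag, something along these lines (device and mount point are placeholders); whether ZFS honours the same mount option is exactly the open question:

  # UFS reference setup for SUIDDIR:
  #   kernel config:  options SUIDDIR
  #   /etc/fstab:     /dev/ad4s1e  /shared  ufs  rw,suiddir  2  2
  mount -o suiddir /dev/ad4s1e /shared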
Any clues would be appreciated. Ben From owner-freebsd-fs@FreeBSD.ORG Thu Dec 17 08:21:53 2009 Return-Path: Delivered-To: fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5D5AE1065676 for ; Thu, 17 Dec 2009 08:21:53 +0000 (UTC) (envelope-from alexander@leidinger.net) Received: from mail.ebusiness-leidinger.de (mail.ebusiness-leidinger.de [217.11.53.44]) by mx1.freebsd.org (Postfix) with ESMTP id 9B1D28FC1B for ; Thu, 17 Dec 2009 08:21:52 +0000 (UTC) Received: from outgoing.leidinger.net (pD954FBD2.dip.t-dialin.net [217.84.251.210]) by mail.ebusiness-leidinger.de (Postfix) with ESMTPSA id 1FD72884E; Thu, 17 Dec 2009 09:03:59 +0100 (CET) Received: from webmail.leidinger.net (webmail.leidinger.net [192.168.1.102]) by outgoing.leidinger.net (Postfix) with ESMTP id 9A51711EC24; Thu, 17 Dec 2009 09:03:55 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=Leidinger.net; s=outgoing-alex; t=1261037035; bh=6mjg9AXWbipvRGShCPYYImLgQ0M6UOhye7vF+wIwxUw=; h=Message-ID:Date:From:To:Cc:Subject:MIME-Version:Content-Type: Content-Transfer-Encoding; b=CQN5MyQP6fTas6dzX4GHyXM09P3/bxgtnfCmCLPkN1zy5DHYHew8HUc8cFzEy92YU 1CkBLIUqoEIlm0STjbO6hJitBp8A+k/ZoSWDcFvkwLVo894gSfcOXCREv14fmqoGxT X9rpXb2Urg6YCDImXeaLSw+DE8KCDEQhFnkN0EzhrBXNJEWKw4zXLGCpisUSX6cz/l cjx+XM/czbXVBWlsgLq7qCPhtpFBWDUWfSO5oitbA6vN+yuU26GC+esRUagEbR9Xrb +s+SxX/54QMSHNETjrKxK7r3yiasO0pUjQ9J9da2DhDKW6luzfa+4I4Ia9qsuIVNrc peycrxL5WP20w== Received: (from www@localhost) by webmail.leidinger.net (8.14.3/8.13.8/Submit) id nBH83sUY039599; Thu, 17 Dec 2009 09:03:55 +0100 (CET) (envelope-from Alexander@Leidinger.net) Received: from pslux.cec.eu.int (pslux.cec.eu.int [158.169.9.14]) by webmail.leidinger.net (Horde Framework) with HTTP; Thu, 17 Dec 2009 09:03:54 +0100 Message-ID: <20091217090354.98634ouizsftffk0@webmail.leidinger.net> X-Priority: 3 (Normal) Date: Thu, 17 Dec 2009 09:03:54 +0100 From: Alexander Leidinger To: fs@freebsd.org, stable@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: 7bit User-Agent: Internet Messaging Program (IMP) H3 (4.3.5) / FreeBSD-8.0 X-EBL-MailScanner-Information: Please contact the ISP for more information X-EBL-MailScanner-ID: 1FD72884E.25875 X-EBL-MailScanner: Found to be clean X-EBL-MailScanner-SpamCheck: not spam, spamhaus-ZEN, SpamAssassin (not cached, score=-0.686, required 6, autolearn=disabled, ALL_TRUSTED -1.44, DKIM_SIGNED 0.00, DKIM_VERIFIED -0.00, J_CHICKENPOX_32 0.60, TW_SK 0.08, TW_ZF 0.08) X-EBL-MailScanner-From: alexander@leidinger.net X-EBL-MailScanner-Watermark: 1261641841.04359@m+d8F2zJLCpQBCOxo5+bdw X-EBL-Spam-Status: No Cc: pjd@freebsd.org Subject: 57 ZFS patches not merged to RELENG_7 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 17 Dec 2009 08:21:53 -0000 Hi, I identified at least 57 patches which are in 8-stable, but not in 7-stable. Does someone know the status of those patches (listed below)? Maybe not all are applicable, but some of them should really get merged. 
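One way to produce such a list mechanically, assuming Subversion 1.5 or newer and the standard repository URL, is the mergeinfo query below; the paths are only an example, and the result depends on where the mergeinfo was actually recorded:

  # Revisions present in head but not yet merged to stable/7, limited to the ZFS bits.
  svn mergeinfo --show-revs eligible \
      svn://svn.freebsd.org/base/head/sys/cddl \
      svn://svn.freebsd.org/base/stable/7/sys/cddl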
I have also seen some things which I think are mismerges to 7-stable, but I do not have a list/diff of them at hand (I diffed 7-stable and 8-stable and those things were "in the noise" of the stuff which is not merged yet; one thing I found right away is that the same kmem_cache_create line for the zio_cache appears twice in cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c). If the people who normally handle ZFS stuff (ok, mostly pjd) do not have time to take care of this, are there people interested in helping me merge those things? Basically this means trying to apply the patches listed below to 7-stable and testing them (or patches from other people, in case you are not able to merge a patch yourself) on a scratch box (I do not have a scratch box with 7-stable around). It also means reviewing the patches for possible issues (I already identified some which need investigation, see below). Suggestions on where this should be coordinated (on fs@ or stable@ or in the FreeBSD Wiki)? Below is the list of patches which I identified. I have not looked at which one depends on which one, but there are certainly dependencies. The format is: the URL of the patch in head; the committer who committed it - my comment about the patch; the commit log. Bye, Alexander. http://svn.freebsd.org/viewvc/base?view=revision&revision=185310 ganbold - very easy merge ---snip--- Remove unused variable. Found with: Coverity Prevent(tm) CID: 3669,3671 ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=185319 pjd - applicable to RELENG_7? ---snip--- Fix locking (file descriptor table and Giant around VFS). ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=192689 trasz - very easy merge ---snip--- Fix comment. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=193110 kmacy - easy merge ---snip--- work around snapshot shutdown race reported by Henri Hennebert ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=193128 kmacy - first probably, second and 3rd to check ---snip--- fix xdrmem_control to be safe in an if statement fix zfs to depend on krpc remove xdr from zfs makefile ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=193440 ps - shared vnode locks available in RELENG_7? vn_lock same syntax? ---snip--- Support shared vnode locks for write operations when the offset is provided on filesystems that support it. This really improves mysql + innodb performance on ZFS. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=194043 kmacy - sysctl API change? ---snip--- pjd has requested that I keep the tunable as zfs_prefetch_disable to minimize gratuitous differences with Opensolaris' ZFS ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=195627 marcel - easy merge ---snip--- In nvpair_native_embedded_array(), meaningless pointers are zeroed. The programmer was aware that alignment was not guaranteed in the packed structure and used bzero() to NULL out the pointers. However, on ia64, the compiler is quite agressive in finding ILP and calls to bzero() are often replaced by simple assignments (i.e. stores). Especially when the width or size in question corresponds with a store instruction (i.e. st1, st2, st4 or st8). The problem here is not a compiler bug. The address of the memory to zero-out was given by '&packed->nvl_priv' and given the type of the 'packed' pointer the compiler could assume proper alignment for the replacement of bzero() with an 8-byte wide store to be valid.
The problem is with the programmer. The programmer knew that the address did not have the alignment guarantees needed for a regular assignment, but failed to inform the compiler of that fact. In fact, the programmer told the compiler the opposite: alignment is guaranteed. The fix is to avoid using a pointer of type "nvlist_t *" and instead use a "char *" pointer as the basis for calculating the address. This tells the compiler that only 1-byte alignment can be assumed and the compiler will either keep the bzero() call or instead replace it with a sequence of byte-wise stores. Both are valid. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=195785 trasz - extattr_check_cred same sytax in RELENG_7? Necessary? ---snip--- Fix permission handling for extended attributes in ZFS. Without this change, ZFS uses SunOS Alternate Data Streams semantics - each EA has its own permissions, which are set at EA creation time and - unlike SunOS - invisible to the user and impossible to change. From the user point of view, it's just broken: sometimes access is granted when it shouldn't be, sometimes it's denied when it shouldn't be. This patch makes it behave just like UFS, i.e. depend on current file permissions. Also, it fixes returned error codes (ENOATTR instead of ENOENT) and makes listextattr(2) return 0 instead of EPERM where there is no EA directory (i.e. the file never had any EA). ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=195822 trasz - easy merge ---snip--- Fix extattr_list_file(2) on ZFS in case the attribute directory doesn't exist and user doesn't have write access to the file. Without this fix, it returns bogus value instead of 0. For some reason this didn't manifest on my kernel compiled with -O0. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=195909 pjd - very easy merge ---snip--- We don't support ephemeral IDs in FreeBSD and without this fix ZFS can panic when in zfs_fuid_create_cred() when userid is negative. It is converted to unsigned value which makes IS_EPHEMERAL() macro to incorrectly report that this is ephemeral ID. The most reasonable solution for now is to always report that the given ID is not ephemeral. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196291 pjd - probably easy merge ---snip--- - Fix a race where /dev/zfs control device is created before ZFS is fully initialized. Also destroy /dev/zfs before doing other deinitializations. - Initialization through taskq is no longer needed and there is a race where one of the zpool/zfs command loads zfs.ko and tries to do some work immediately, but /dev/zfs is not there yet. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196269 marcel - easy merge ---snip--- Fix misalignment in nvpair_native_embedded() caused by the compiler replacing the bzero(). See also revision 195627, which fixed the misalignment in nvpair_native_embedded_array(). ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196295 pjd - the added stuff needs to be reviewed, taskqueue available & same syntax? ---snip--- Remove OpenSolaris taskq port (it performs very poorly in our kernel) and replace it with wrappers around our taskqueue(9). To make it possible implement taskqueue_member() function which returns 1 if the given thread was created by the given taskqueue. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196297 pjd - easy merge ---snip--- Fix panic in zfs recv code. 
The last vnode (mountpoint's vnode) can have 0 usecount. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196299 pjd - VI_UNLOCK same syntax in RELENG_7? ---snip--- - We need to recycle vnode instead of freeing znode. Submitted by: avg - Add missing vnode interlock unlock. - Remove redundant znode locking. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196301 pjd - probably easy merge ---snip--- If z_buf is NULL, we should free znode immediately. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196307 pjd - to be reviewed ---snip--- Manage asynchronous vnode release just like Solaris. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196309 pjd - vhold/VN_RELE/zfsctl_root_lookup/vop_vptocnp same syntax in RELENG_7? ---snip--- getcwd() (when __getcwd() fails) works by stating current directory, going up (..), calling readdir and looking for previous directory inode. In case of .zfs/ directory this doesn't work, because .zfs/ is hidden by default, so it won't be visible in readdir output. Fix this by implementing VPTOCNP for snapshot directories, so __getcwd() doesn't fail and getcwd() doesn't have to use readdir method. This fixes /bin/pwd from within .zfs/snapshot//. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196395 pjd - applicable to RELENG_7 (same XDR code?)? ---snip--- Our libc doesn't implement control method for XDR (only kernel does) and it will always return failure. Fix this by bringing userland implementation of xdrmem_control() back. This allow 'zpool import' to work again. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196456 pjd - kproc_create the same syntax on RELENG_7? ---snip--- - Give minclsyspri and maxclsyspri real values (consulted with kmacy). - Honour 'pri' argument for thread_create(). ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196457 pjd - easy merge, probably depends upon 196457 ---snip--- Set priority of vdev_geom threads and zvol threads to PRIBIO. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196458 pjd - kproc_kthread_add available in RELENG_7? same syntax? ---snip--- - Hide ZFS kernel threads under zfskern process. - Use better (shorter) threads names: 'zvol:worker zvol/tank/vol00' -> 'zvol tank/vol00' 'vdev:worker da0' -> 'vdev da0' ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196662 pjd - vn_lock/VOP_UNLOCK syntax the same in RELENG_7 (add curthread?)? ---snip--- Add missing mountpoint vnode locking. This fixes panic on assertion with DEBUG_VFS_LOCKS and vfs.usermount=1 when regular user tries to mount dataset owned by him. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196703 pjd - probably easy merge, KBI change (dnode)? ---snip--- Backport the 'dirtying dbuf' panic fix from newer ZFS version. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196919 pjd - very easy merge ---snip--- bzero() on-stack argument, so mutex_init() won't misinterpret that the lock is already initialized if we have some garbage on the stack. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196927 pjd - easy merge ---snip--- Changing provider size is not really supported by GEOM, but doing so when provider is closed should be ok. When administrator requests to change ZVOL size do it immediately if ZVOL is closed or do it on last ZVOL close. 
---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196943 pjd - MNT_* same syntax in RELENG_7? ---snip--- - Avoid holding mutex around M_WAITOK allocations. - Add locking for mnt_opt field. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196944 pjd - easy merge ---snip--- Don't recheck ownership on update mount. This will eliminate LOR between vfs_busy() and mount mutex. We check ownership in vfs_domount() anyway. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196954 pjd - easy merge ---snip--- If we have to use avl_find(), optimize a bit and use avl_insert() instead of avl_add() (the latter is actually a wrapper around avl_find() + avl_insert()). Fix similar case in the code that is currently commented out. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196965 pjd - easy merge ---snip--- Fix reference count leak for a case where snapshot's mount point is updated. Such situation is not supported. This problem was triggered by something like this: # zpool create tank da0 # zfs snapshot tank@snap # cd /tank/.zfs/snapshot/snap (this will mount the snapshot) # cd # mount -u nosuid /tank/.zfs/snapshot/snap (refcount leak) # zpool export tank cannot export 'tank': pool is busy ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196980 pjd - VFS_ROOT/VN_RELE same syntax in RELENG_7? ---snip--- When we automatically mount snapshot we want to return vnode of the mount point from the lookup and not covered vnode. This is one of the fixes for using .zfs/ over NFS. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196982 pjd - vfs_checkexp/vfs_stdcheckexp same syntax in RELENG_7? ---snip--- We don't export individual snapshots, so mnt_export field in snapshot's mount point is NULL. That's why when we try to access snapshots over NFS use mnt_export field from the parent file system. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=196985 pjd - easy merge ---snip--- Only log successful commands! Without this fix we log even unsuccessful commands executed by unprivileged users. Action is not really taken, but it is logged to pool history, which might be confusing. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197133 pjd - do we have rw-locks in RELENG_7? ---snip--- - Protect reclaim with z_teardown_inactive_lock. - Be prepared for dbuf to disappear in zfs_reclaim_complete() and check if z_dbuf field is NULL - this might happen in case of rollback or forced unmount between zfs_freebsd_reclaim() and zfs_reclaim_complete(). - On forced unmount wait for all znodes to be destroyed - destruction can be done asynchronously via zfs_reclaim_complete(). ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197151 pjd - easy merge ---snip--- Be sure not to overflow struct fid. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197152 pjd - easy merge ---snip--- Extend scope of the z_teardown_lock lock for consistency and "just in case". ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197153 pjd - easy merge ---snip--- When zfs.ko is compiled with debug, make sure that znode and vnode point at each other. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197167 pjd - easy merge ---snip--- Work-around READDIRPLUS problem with .zfs/ and .zfs/snapshot/ directories by just returning EOPNOTSUPP. This will allow NFS server to fall back to regular READDIR. 
Note that converting inode number to snapshot's vnode is expensive operation. Snapshots are stored in AVL tree, but based on their names, not inode numbers, so to convert inode to snapshot vnode we have to interate over all snalshots. This is not a problem in OpenSolaris, because in their READDIRPLUS implementation they use VOP_LOOKUP() on d_name, instead of VFS_VGET() on d_fileno as we do. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197177 pjd - easy merge ---snip--- Support both case: when snapshot is already mounted and when it is not yet mounted. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197201 pjd - VOP_UNLOCK/VI_LOCK/VI_UNLOCK same syntax in RELENG_7 (add curthread?)? ---snip--- - Mount ZFS snapshots with MNT_IGNORE flag, so they are not visible in regular df(1) and mount(8) output. This is a bit smilar to OpenSolaris and follows ZFS route of not listing snapshots by default with 'zfs list' command. - Add UPDATING entry to note that ZFS snapshots are no longer visible in mount(8) and df(1) output by default. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197351 pjd - easy merge ---snip--- Purge namecache in the same place OpenSolaris does. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197458 pjd - probably easy merge ---snip--- Close race in zfs_zget(). We have to increase usecount first and then check for VI_DOOMED flag. Before this change vnode could be reclaimed between checking for the flag and increasing usecount. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197459 pjd - probably easy merge ---snip--- Before calling vflush(FORCECLOSE) mark file system as unmounted so the following vnops will fail. This is very important, because without this change vnode could be reclaimed at any point, even if we increased usecount. The only way to ensure that vnode won't be reclaimed was to lock it, which would be very hard to do in ZFS without changing a lot of code. With this change simply increasing usecount is enough to be sure vnode won't be reclaimed from under us. To be precise it can still be reclaimed but we won't be able to see it, because every try to enter ZFS through VFS will result in EIO. The only function that cannot return EIO, because it is needed for vflush() is zfs_root(). Introduce ZFS_ENTER_NOERROR() macro that only locks z_teardown_lock and never returns EIO. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197497 pjd - easy merge, implications for existing pools/data? ---snip--- Switch to fletcher4 as the default checksum algorithm. Fletcher2 was proven to be a bit weak and OpenSolaris also switched to fletcher4. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197512 pjd - VI_UNLOCK same syntax in RELENG_8? ---snip--- - Don't depend on value returned by gfs_*_inactive(), it doesn't work well with forced unmounts when GFS vnodes are referenced. - Make other preparations to GFS for forced unmounts. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197513 pjd - traverse/VN_RELE same syntax in RELENG_7? ---snip--- Use traverse() function to find and return mount point's vnode instead of covered vnode when snapshot is already mounted. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197515 pjd - probably easy merge ---snip--- Handle cases where virtual (GFS) vnodes are referenced when doing forced unmount. 
In that case we cannot depend on the proper order of invalidating vnodes, so we have to free resources when we have a chance. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197683 delphij - easy merge ---snip--- Return EOPNOTSUPP instead of EINVAL when doing chflags(2) over an old format ZFS, as defined in the manual page. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197831 pjd - does the added variable cause a KBI change? ---snip--- Fix situation where Mac OS X NFS client creates a file and when it tries to set ownership and mode in the same setattr operation, the mode was overwritten by secpolicy_vnode_setattr(). ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=197861 pjd - priv_check available / VOP_ACCESS same syntax in RELENG_7? ---snip--- Allow file system owner to modify system flags if securelevel permits. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=198703 pjd - vaccess same syntax on RELENG_7? ---snip--- - zfs_zaccess() can handle VAPPEND too, so map V_APPEND to VAPPEND and call zfs_access() instead of vaccess() in this case as well. - If VADMIN is specified with another V* flag (unlikely) call both zfs_access() and vaccess() after spliting V* flags. This fixes "dirtying snapshot!" panic. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=199156 pjd - probably easy merge (struct change, KBI implications?) ---snip--- Avoid passing invalid mountpoint to getnewvnode(). ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=199157 pjd - important, maybe security, probably easy merge ---snip--- Be careful which vattr fields are set during setattr replay. Without this fix strange things can appear after unclean shutdown like files with mode set to 07777. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=200124 pjd - very easy merge ---snip--- Avoid using additional variable for storing an error if we are not going to do anything with it. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=200126 pjd - easy merge, prevents ZFS pools on ZVOLs ---snip--- Fix deadlock when ZVOLs are present and we are replacing dead component or calling scrub when pool is in a degraded state. It will try to taste ZVOLs, which will lead to deadlock, as ZVOL will try to acquire the same locks as replace/scrub is holding already. We can't simply skip provider based on their GEOM class, because ZVOL can have providers build on top of it and we need to skip those as well. We do it by asking for ZFS::iszvol attribute. Any ZVOL-based provider will give us positive answer and we have to skip those providers. This way we remove possibility to create ZFS pools on top of ZVOLs, but it is not very useful anyway. I believe deadlock is still possible in some very complex situations like when we have MD provider on top of UFS file on top of ZVOL. When we try to replace dead component in the pool mentioned ZVOL is based on, there might be a deadlock when ZFS will try to taste MD provider. There is no easy way to detect that, but it isn't very common. ---snip--- http://svn.freebsd.org/viewvc/base?view=revision&revision=200158 pjd - easy merge ---snip--- We have to eventually look for provider without checking guid as this is need for attaching when there is no metadata yet. Before r200125 the order of looking for providers was wrong. It was: 1. Find provider by name. 2. Find provider by guid. 3. Find provider by name and guid. Where it should have been: 1. 
Find provider by name and guid. 2. Find provider by guid. 3. Find provider by name. ---snip--- -- http://www.Leidinger.net Alexander @ Leidinger.net: PGP ID = B0063FE7 http://www.FreeBSD.org netchild @ FreeBSD.org : PGP ID = 72077137 From owner-freebsd-fs@FreeBSD.ORG Thu Dec 17 16:57:24 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 537331065672; Thu, 17 Dec 2009 16:57:24 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 2A76B8FC08; Thu, 17 Dec 2009 16:57:24 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id nBHGvOut069494; Thu, 17 Dec 2009 16:57:24 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id nBHGvOwX069490; Thu, 17 Dec 2009 16:57:24 GMT (envelope-from linimon) Date: Thu, 17 Dec 2009 16:57:24 GMT Message-Id: <200912171657.nBHGvOwX069490@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/141718: [zfs] [panic] kernel panic when 'zfs rename' is used on mounted snapshot X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 17 Dec 2009 16:57:24 -0000 Old Synopsis: kernel panic when 'zfs rename' is used on mounted snapshot New Synopsis: [zfs] [panic] kernel panic when 'zfs rename' is used on mounted snapshot Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Thu Dec 17 16:57:01 UTC 2009 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=141718 From owner-freebsd-fs@FreeBSD.ORG Thu Dec 17 17:38:55 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 46C20106568D; Thu, 17 Dec 2009 17:38:55 +0000 (UTC) (envelope-from sarawgi.aditya@gmail.com) Received: from mail-pz0-f185.google.com (mail-pz0-f185.google.com [209.85.222.185]) by mx1.freebsd.org (Postfix) with ESMTP id 0597B8FC0C; Thu, 17 Dec 2009 17:38:54 +0000 (UTC) Received: by pzk15 with SMTP id 15so1557235pzk.3 for ; Thu, 17 Dec 2009 09:38:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:date:from:to:cc:subject :message-id:references:mime-version:content-type:content-disposition :in-reply-to:user-agent; bh=QETOKo4JbU4vkpbThfGi0GgVUh7wURW8y4dvMw4VxL0=; b=hhZgyDlYyMjJsXuLaXNGJozbCcSrZBAexJeptQExDMSpqJTUtHKzCn2EuaEosn6Am5 05fDF0U8FUrUgVgQNLG33yn5BfOrIu7QHwiZ+wFxGepfEA0va+ROpwF11TJc/hMumjvl Wo7Px3BBUupCwdq/Y2TyeA/M8xfyutzabq8mM= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=date:from:to:cc:subject:message-id:references:mime-version :content-type:content-disposition:in-reply-to:user-agent; b=Ivll3ovTFrCs3hjeSuEXsYAB6V9Rjx07ci6V4EKQZm6ar+SXCXTnOIFSXaBIrj0jkj CWiqtZWyoIrRT3U835x0qDAi7nbe3y3kMsyNDD1ZXwAuUimTlML1QNV8r3a1tZnLyfUI XUBefEnVkAv6p/XtyhaTpgU4dsMFSbEvEHcKA= Received: by 10.141.4.8 with SMTP id g8mr1964028rvi.163.1261071534322; Thu, 17 Dec 2009 09:38:54 -0800 (PST) Received: from ([183.87.28.193]) by mx.google.com with ESMTPS id 23sm1683657pzk.12.2009.12.17.09.38.52 (version=TLSv1/SSLv3 cipher=RC4-MD5); Thu, 17 Dec 2009 09:38:53 -0800 (PST) Date: Thu, 17 Dec 2009 18:09:44 +0000 From: Aditya Sarawgi To: Mikle Krutov Message-ID: <4b2a6cad.9713f30a.573f.ffffad80@mx.google.com> References: <20091209203213.GB2281@aditya> <884554e60912090842x1bf1e8e4u842cbce4647aa63@mail.gmail.com> <20091209225000.GD2281@aditya> <884554e60912101333j1698e4c4o5d5903eb9c97211c@mail.gmail.com> <884554e60912101351o2d592c6fn99f64bb2fd8c33be@mail.gmail.com> <4b21a7b9.5744f10a.3378.6447@mx.google.com> <884554e60912110938n79439fc1p4a3d41f81bb88991@mail.gmail.com> <5da0588e0912112203v52653fb2ic71d97970b615547@mail.gmail.com> <884554e60912120149t6cd0df5me349b1fdf485b0c@mail.gmail.com> <884554e60912120547i1ccca7acob8714fd5121a5508@mail.gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <884554e60912120547i1ccca7acob8714fd5121a5508@mail.gmail.com> User-Agent: Mutt/1.5.20 (2009-06-14) Cc: jh@freebsd.org, freebsd-fs@freebsd.org Subject: Re: [8.0-RELEASE] ext2fs mount fails X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 17 Dec 2009 17:38:55 -0000 On Sat, Dec 12, 2009 at 04:47:02PM +0300, Mikle Krutov wrote: > Yes, it's 100% ext2fs. > I could not create dump with fbsd - 'wrong magic number', so i used > linux livecd. > Mikle, file a pr. jh@, stas@ any ideas about this. > 2009/12/12 Mikle Krutov : > > Yes, it's 100% ext2fs. > > I could not create dumps with fbsd - 'wrong magic number', so i used > > linux livecd. > > > > > > > > 2009/12/12 Rich : > >> Are you sure it's ext2/3, and not ext4? 
> >> > >> ext4 isn't mountable as ext2 if you have ever had certain [default] > >> flags on in Linux [the extents feature, in particular, breaks backward > >> compatibility]. > >> > >> dumpe2fs with a list of the filesystem's feature flags is relevant here. > >> > >> - Rich > >> > >> On Fri, Dec 11, 2009 at 12:38 PM, Mikle Krutov wrote: > >>> I do not understand anything, because after recompiling > >>> a) only ilbstand > >>> b) libstand + mount > >>> c) whole kernel+world > >>> mount -t ext2fs /dev/ad8p1 still returns the same message. I've grep'd > >>> '0xef53' in /usr/src & /usr/share - only 2 files in /usr/src, that > >>> i've already changed already before recompile. > >>> > >>> Also, i'm really sorry for writing only private to Aditya's mail, > >>> didn't mention that gmail has changed the recipient address. > >>> > >>> The previous mail summary: > >>> 1) i've already tried to update src (still with RELENG_8) && rebuild the kernel > >>> 2) (just done) have tried to edit files in /usr/src where i've grep'd > >>> '0xef53' - somehow it didn't help. > >>> _______________________________________________ > >>> freebsd-fs@freebsd.org mailing list > >>> http://lists.freebsd.org/mailman/listinfo/freebsd-fs > >>> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > >>> > >> > >> > >> > >> -- > >> > >> La??o bacana para panaca bo??al. -- pal??ndromo > >> > > > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" -- Aditya Sarawgi From owner-freebsd-fs@FreeBSD.ORG Thu Dec 17 18:21:16 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2FD02106568B; Thu, 17 Dec 2009 18:21:16 +0000 (UTC) (envelope-from jh@FreeBSD.org) Received: from gw01.mail.saunalahti.fi (gw01.mail.saunalahti.fi [195.197.172.115]) by mx1.freebsd.org (Postfix) with ESMTP id DD8728FC0C; Thu, 17 Dec 2009 18:21:15 +0000 (UTC) Received: from a91-153-117-195.elisa-laajakaista.fi (a91-153-117-195.elisa-laajakaista.fi [91.153.117.195]) by gw01.mail.saunalahti.fi (Postfix) with SMTP id 29F3E151545; Thu, 17 Dec 2009 20:04:46 +0200 (EET) Date: Thu, 17 Dec 2009 20:04:45 +0200 From: Jaakko Heinonen To: Aditya Sarawgi Message-ID: <20091217180445.GA4115@a91-153-117-195.elisa-laajakaista.fi> References: <884554e60912090842x1bf1e8e4u842cbce4647aa63@mail.gmail.com> <20091209225000.GD2281@aditya> <884554e60912101333j1698e4c4o5d5903eb9c97211c@mail.gmail.com> <884554e60912101351o2d592c6fn99f64bb2fd8c33be@mail.gmail.com> <4b21a7b9.5744f10a.3378.6447@mx.google.com> <884554e60912110938n79439fc1p4a3d41f81bb88991@mail.gmail.com> <5da0588e0912112203v52653fb2ic71d97970b615547@mail.gmail.com> <884554e60912120149t6cd0df5me349b1fdf485b0c@mail.gmail.com> <884554e60912120547i1ccca7acob8714fd5121a5508@mail.gmail.com> <4b2a6cad.9713f30a.573f.ffffad80@mx.google.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4b2a6cad.9713f30a.573f.ffffad80@mx.google.com> User-Agent: Mutt/1.5.20 (2009-06-14) Cc: freebsd-fs@freebsd.org Subject: Re: [8.0-RELEASE] ext2fs mount fails X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 17 Dec 2009 18:21:16 -0000 On 2009-12-17, Aditya 
Sarawgi wrote: > > Yes, it's 100% ext2fs. I could not create dump with fbsd - 'wrong > > magic number', so i used linux livecd. > Mikle, file a pr. jh@, stas@ any ideas about this. How did you verify that /dev/ad8p1 contains a valid ext2fs (on FreeBSD)? AFAIK, recent dumpe2fs should work in any case if the partition contains a valid file system. My guess is that there is something wrong with the partition or how FreeBSD sees the partition. "gpart show" output might be useful for starters and maybe similar output from Linux. -- Jaakko From owner-freebsd-fs@FreeBSD.ORG Fri Dec 18 08:38:35 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7BBEC106568B; Fri, 18 Dec 2009 08:38:35 +0000 (UTC) (envelope-from nekoexmachina@gmail.com) Received: from mail-bw0-f213.google.com (mail-bw0-f213.google.com [209.85.218.213]) by mx1.freebsd.org (Postfix) with ESMTP id 9D2438FC12; Fri, 18 Dec 2009 08:38:34 +0000 (UTC) Received: by bwz5 with SMTP id 5so1974543bwz.3 for ; Fri, 18 Dec 2009 00:38:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:content-type :content-transfer-encoding; bh=oZEEcLDJvMHjgdgDav9FrDCMLULt1U3hSpr4qM/3BEM=; b=l0qINGi6+ixoEwC49eDUcug6X4qJFAwSR6JWdbOSgefSDpRpzdFDKUlFJsiBXQQTHD EMYzs95XWJYGiHmPesyFb5NGAYYiovZMwpGO2tjSviGhKQ3t3DvlBj7R9Hc1EAto895I 1/Zsuzy8Y875/vZy2LurjinF2AwZ903x1BsIs= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type:content-transfer-encoding; b=jv2rn4dzKRV9ocBSv+t4UHI3IH0npHboaOJhgBOei9pJEpUI4sSk6SsYi6KJEFgYgL IwitL8cXWywXcbA7/kXvYnOTjVhqyWdEDQLeOlsG0Rv70FiE1s6tuIDw11AwB2SaxqSv /s/7xwyOi/XxTVEb3xH+ddrapQ0sXCdkFgY98= MIME-Version: 1.0 Received: by 10.204.154.142 with SMTP id o14mr2190472bkw.125.1261125513662; Fri, 18 Dec 2009 00:38:33 -0800 (PST) In-Reply-To: <20091217180445.GA4115@a91-153-117-195.elisa-laajakaista.fi> References: <884554e60912090842x1bf1e8e4u842cbce4647aa63@mail.gmail.com> <884554e60912101333j1698e4c4o5d5903eb9c97211c@mail.gmail.com> <884554e60912101351o2d592c6fn99f64bb2fd8c33be@mail.gmail.com> <4b21a7b9.5744f10a.3378.6447@mx.google.com> <884554e60912110938n79439fc1p4a3d41f81bb88991@mail.gmail.com> <5da0588e0912112203v52653fb2ic71d97970b615547@mail.gmail.com> <884554e60912120149t6cd0df5me349b1fdf485b0c@mail.gmail.com> <884554e60912120547i1ccca7acob8714fd5121a5508@mail.gmail.com> <4b2a6cad.9713f30a.573f.ffffad80@mx.google.com> <20091217180445.GA4115@a91-153-117-195.elisa-laajakaista.fi> Date: Fri, 18 Dec 2009 11:38:33 +0300 Message-ID: <884554e60912180038w3442dc67of9a1ed755f399df0@mail.gmail.com> From: Mikle Krutov To: freebsd-fs@freebsd.org, jh@freebsd.org, stas@freebsd.org Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: Subject: Re: [8.0-RELEASE] ext2fs mount fails X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 18 Dec 2009 08:38:35 -0000 Well, i could not verify that under freebsd. Every fs-tool (except testdisk - tried to use it to restore superblock after message 'wrong magic number' appeared first time) tells me that i've got wrong magic number for this filesystem. 
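One low-level check that sidesteps the filesystem tools entirely (a sketch, using the device name from this thread): the ext2/ext3 superblock begins 1024 bytes into the partition and its s_magic field sits 56 bytes into the superblock, stored little-endian, so a valid filesystem should show the bytes 53 ef at byte offset 1080:

# read the two magic bytes directly from the partition
dd if=/dev/ad8p1 bs=1 skip=1080 count=2 2>/dev/null | hexdump -C
# expected on a valid ext2/ext3 filesystem (0xEF53 little-endian):
# 00000000  53 ef                                             |S.|

If the bytes are not there, the problem is most likely the partition itself (wrong offset or partition entry) rather than the ext2fs code; if they are there, the mount is failing somewhere later.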
I can not give the gpart show output right now; will send it in couple of d= ays. 2009/12/17 Jaakko Heinonen : > On 2009-12-17, Aditya Sarawgi wrote: >> > Yes, it's 100% ext2fs. =C2=A0I could not create dump with fbsd - 'wron= g >> > magic number', so i used linux livecd. > >> Mikle, file a pr. jh@, stas@ any ideas about this. > > How did you verify that /dev/ad8p1 contains a valid ext2fs (on FreeBSD)? > AFAIK, recent dumpe2fs should work in any case if the partition contains > a valid file system. My guess is that there is something wrong with the > partition or how FreeBSD sees the partition. > > "gpart show" output might be useful for starters and maybe similar > output from Linux. > > -- > Jaakko > From owner-freebsd-fs@FreeBSD.ORG Fri Dec 18 15:28:04 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9AA691065676 for ; Fri, 18 Dec 2009 15:28:04 +0000 (UTC) (envelope-from james-freebsd-fs2@jrv.org) Received: from mail.jrv.org (adsl-70-243-84-13.dsl.austtx.swbell.net [70.243.84.13]) by mx1.freebsd.org (Postfix) with ESMTP id 3A5AE8FC15 for ; Fri, 18 Dec 2009 15:28:03 +0000 (UTC) Received: from kremvax.housenet.jrv (kremvax.housenet.jrv [192.168.3.124]) by mail.jrv.org (8.14.3/8.14.3) with ESMTP id nBIFS2FR088795; Fri, 18 Dec 2009 09:28:02 -0600 (CST) (envelope-from james-freebsd-fs2@jrv.org) Authentication-Results: mail.jrv.org; domainkeys=pass (testing) header.from=james-freebsd-fs2@jrv.org DomainKey-Signature: a=rsa-sha1; s=enigma; d=jrv.org; c=nofws; q=dns; h=message-id:date:from:user-agent:mime-version:cc:subject: references:in-reply-to:content-type:content-transfer-encoding; b=oGOYt5wTMadBqyEGdWvsV3IInRLrgNtNM/BYxKqF8s7dImarFQi9bh24wjePwti0T uOzLWiHQa2m4G4IVeRvvyU4cN0PQ7G7T+MdiXKffJeS9tQ+iIZ9yMF1o49wMJ0PfCYD sw/6z7ArbRIZgK/Djh9qrcCKJQu0itwE2/rGmD0= Message-ID: <4B2B9F82.4020909@jrv.org> Date: Fri, 18 Dec 2009 09:28:02 -0600 From: "James R. Van Artsdalen" User-Agent: Thunderbird 2.0.0.23 (Macintosh/20090812) MIME-Version: 1.0 References: <568624531.20091215163420@pyro.de> <42952D86-6B4D-49A3-8E4F-7A1A53A954C2@spry.com> <957649379.20091216005253@pyro.de> <26F8D203-A923-47D3-9935-BE4BC6DA09B7@corp.spry.com> <4B299CEA.3070705@jrv.org> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs Subject: Re: ZFS RaidZ2 with 24 drives? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 18 Dec 2009 15:28:04 -0000 Thomas Burgess wrote: > One thing most people don't know about hard drives in general is that > sometimes up to 30% of the space is actually ECC. With software raid > systems like ZFS, this will eventually be somethign that we can take > advantage of. ECC is less than 10% of the space. The inter-sector gap and gap between a sector's address and data fields, etc, are larger and more problematic as rotation speeds increase. > Because of this, you can imagine a scenario where allowing ZFS to > use this ECC space as raw storage, while leaving the data corrections > to ZFS would be ideal. It's not only a matter of space, it will also > lead to nice improvements in speed. (more data can be read/written by > the head as it passes) The disk drive industry's solution to this is 4K sector sizes. 
See http://www.anandtech.com/storage/showdoc.aspx?i=3691 Even ZFS would need major changes to use drives without ECC without an increased hard error rate. I don't see this happening since no filesystems exist yet for this environment, and since transitions to new filesystems are so slow (99.9%+ of systems today are running filesystems architectures at least two decades old). From owner-freebsd-fs@FreeBSD.ORG Fri Dec 18 15:42:43 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2A454106568B for ; Fri, 18 Dec 2009 15:42:43 +0000 (UTC) (envelope-from wonslung@gmail.com) Received: from mail-ew0-f226.google.com (mail-ew0-f226.google.com [209.85.219.226]) by mx1.freebsd.org (Postfix) with ESMTP id A92B58FC1B for ; Fri, 18 Dec 2009 15:42:42 +0000 (UTC) Received: by ewy26 with SMTP id 26so2507817ewy.3 for ; Fri, 18 Dec 2009 07:42:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:cc:content-type; bh=aAC0maFZZ1lu26BFKQoc529VsQvEMExyx72pDPpcNIk=; b=c7plhl2UM3SUR4lGSiv7Ht708Vjg+50zPTq+HYzbJdGRe/waEHRxbsHna4/IaLQRHl bNLPFawb4ZgSMzz97/nM7OD5/jsE6GvA7w+o3ErE1HWNXO8Vsm/s77B4JCX2oNKfOIr1 D2Be3WZkFjBesC65h5hE1uxvA060pSwr+H7qE= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=KsTx9PbRj5RmVAMEQ/3pdtGdXu/TStGffl+753A2ThadL2JM0/gpCW/TPVVb+Zph6p P+Dnx676eSTn//wcZ5k9KFv2UH+zzUidpSZwugPg2fEPJRuiz/f49EI21kYrr2deDxeq CTjzQt6+poPXHKpywgCo6br3aGiWAyNiIcROM= MIME-Version: 1.0 Received: by 10.216.88.212 with SMTP id a62mr1388284wef.72.1261150961384; Fri, 18 Dec 2009 07:42:41 -0800 (PST) In-Reply-To: <4B2B9F82.4020909@jrv.org> References: <568624531.20091215163420@pyro.de> <42952D86-6B4D-49A3-8E4F-7A1A53A954C2@spry.com> <957649379.20091216005253@pyro.de> <26F8D203-A923-47D3-9935-BE4BC6DA09B7@corp.spry.com> <4B299CEA.3070705@jrv.org> <4B2B9F82.4020909@jrv.org> Date: Fri, 18 Dec 2009 10:42:41 -0500 Message-ID: From: Thomas Burgess To: "James R. Van Artsdalen" Content-Type: text/plain; charset=ISO-8859-1 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs Subject: Re: ZFS RaidZ2 with 24 drives? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 18 Dec 2009 15:42:43 -0000 On Fri, Dec 18, 2009 at 10:28 AM, James R. Van Artsdalen < james-freebsd-fs2@jrv.org> wrote: > Thomas Burgess wrote: > > One thing most people don't know about hard drives in general is that > > sometimes up to 30% of the space is actually ECC. With software raid > > systems like ZFS, this will eventually be somethign that we can take > > advantage of. > > i was basing this information on a talk Jeff Bonwick gave. Google JeffBonwick_*zfs*-What_*Next*-SDC09.pdfand it should show the information i'm talking about. > ECC is less than 10% of the space. The inter-sector gap and gap between > a sector's address and data fields, etc, are larger and more problematic > as rotation speeds increase. > > > Because of this, you can imagine a scenario where allowing ZFS to > > use this ECC space as raw storage, while leaving the data corrections > > to ZFS would be ideal. It's not only a matter of space, it will also > > lead to nice improvements in speed. 
(more data can be read/written by > > the head as it passes) > > The disk drive industry's solution to this is 4K sector sizes. See > http://www.anandtech.com/storage/showdoc.aspx?i=3691 > > Even ZFS would need major changes to use drives without ECC without an > increased hard error rate. I don't see this happening since no > filesystems exist yet for this environment, and since transitions to new > filesystems are so slow (99.9%+ of systems today are running filesystems > architectures at least two decades old). > again, i got my information from the lead zfs developer. I also spent a lot of time on google reading up on this after hearing about it because i found it to be so interesting. I am a layman though, so perhaps i'm wrong. From owner-freebsd-fs@FreeBSD.ORG Fri Dec 18 19:10:51 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 224BA106566C for ; Fri, 18 Dec 2009 19:10:51 +0000 (UTC) (envelope-from ticso@cicely7.cicely.de) Received: from raven.bwct.de (raven.bwct.de [85.159.14.73]) by mx1.freebsd.org (Postfix) with ESMTP id A3A4B8FC08 for ; Fri, 18 Dec 2009 19:10:50 +0000 (UTC) Received: from cicely5.cicely.de ([10.1.1.7]) by raven.bwct.de (8.13.4/8.13.4) with ESMTP id nBIItYZG071615 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Fri, 18 Dec 2009 19:55:34 +0100 (CET) (envelope-from ticso@cicely7.cicely.de) Received: from cicely7.cicely.de (cicely7.cicely.de [10.1.1.9]) by cicely5.cicely.de (8.14.2/8.14.2) with ESMTP id nBIItVoa016397 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 18 Dec 2009 19:55:31 +0100 (CET) (envelope-from ticso@cicely7.cicely.de) Received: from cicely7.cicely.de (localhost [127.0.0.1]) by cicely7.cicely.de (8.14.2/8.14.2) with ESMTP id nBIItU8h002929; Fri, 18 Dec 2009 19:55:30 +0100 (CET) (envelope-from ticso@cicely7.cicely.de) Received: (from ticso@localhost) by cicely7.cicely.de (8.14.2/8.14.2/Submit) id nBIItUGD002928; Fri, 18 Dec 2009 19:55:30 +0100 (CET) (envelope-from ticso) Date: Fri, 18 Dec 2009 19:55:30 +0100 From: Bernd Walter To: "James R. Van Artsdalen" Message-ID: <20091218185529.GC1531@cicely7.cicely.de> References: <568624531.20091215163420@pyro.de> <42952D86-6B4D-49A3-8E4F-7A1A53A954C2@spry.com> <957649379.20091216005253@pyro.de> <26F8D203-A923-47D3-9935-BE4BC6DA09B7@corp.spry.com> <4B299CEA.3070705@jrv.org> <4B2B9F82.4020909@jrv.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4B2B9F82.4020909@jrv.org> X-Operating-System: FreeBSD cicely7.cicely.de 7.0-STABLE i386 User-Agent: Mutt/1.5.11 X-Spam-Status: No, score=-4.4 required=5.0 tests=ALL_TRUSTED=-1.8, AWL=0.019, BAYES_00=-2.599 autolearn=ham version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on spamd.cicely.de Cc: freebsd-fs Subject: Re: ZFS RaidZ2 with 24 drives? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: ticso@cicely.de List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 18 Dec 2009 19:10:51 -0000 On Fri, Dec 18, 2009 at 09:28:02AM -0600, James R. Van Artsdalen wrote: > Thomas Burgess wrote: > > One thing most people don't know about hard drives in general is that > > sometimes up to 30% of the space is actually ECC. With software raid > > systems like ZFS, this will eventually be somethign that we can take > > advantage of. 
> > ECC is less than 10% of the space. The inter-sector gap and gap between > a sector's address and data fields, etc, are larger and more problematic > as rotation speeds increase. > > > Because of this, you can imagine a scenario where allowing ZFS to > > use this ECC space as raw storage, while leaving the data corrections > > to ZFS would be ideal. It's not only a matter of space, it will also > > lead to nice improvements in speed. (more data can be read/written by > > the head as it passes) I can imagine this might work, but do not see it to be very realistic. And I'm not sure this is a very good idea as well. > The disk drive industry's solution to this is 4K sector sizes. See > http://www.anandtech.com/storage/showdoc.aspx?i=3691 This is an quite interesting article, but for me not very surprising. It is a bit more surprising that this is not already standard. The whole thing is not new - the Commodore 1581 floppy drive did a track at once logic with caching and supplied logical 256 Byte sectors with underlying 512 MFM blocks. With flash cards 4k physical sectors are also very common and writing smaller/unaligned transfers leads to slow read-modify-write cycles. You need to be very carefull when partitioning a flash based memory card - a lot of us are already using flash based as an alternative to an HDD. People also used MO media with 1k and 2k hard sectoring - they usually don't have an emulation for 512 Bytes. -- B.Walter http://www.bwct.de Modbus/TCP Ethernet I/O Baugruppen, ARM basierte FreeBSD Rechner uvm. From owner-freebsd-fs@FreeBSD.ORG Fri Dec 18 19:58:06 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BFA0B1065676; Fri, 18 Dec 2009 19:58:06 +0000 (UTC) (envelope-from jhb@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 9680F8FC19; Fri, 18 Dec 2009 19:58:06 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id nBIJw61w079544; Fri, 18 Dec 2009 19:58:06 GMT (envelope-from jhb@freefall.freebsd.org) Received: (from jhb@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id nBIJw6ts079540; Fri, 18 Dec 2009 19:58:06 GMT (envelope-from jhb) Date: Fri, 18 Dec 2009 19:58:06 GMT Message-Id: <200912181958.nBIJw6ts079540@freefall.freebsd.org> To: faber@isi.edu, jhb@FreeBSD.org, freebsd-fs@FreeBSD.org From: jhb@FreeBSD.org Cc: Subject: Re: kern/140853: [nfs] [patch] NFSv2 remove calls fail to send error replies (memory leak!) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 18 Dec 2009 19:58:06 -0000 Synopsis: [nfs] [patch] NFSv2 remove calls fail to send error replies (memory leak!) State-Changed-From-To: open->closed State-Changed-By: jhb State-Changed-When: Fri Dec 18 19:57:23 UTC 2009 State-Changed-Why: Fix applied to 6.x and later, thanks! 
http://www.freebsd.org/cgi/query-pr.cgi?pr=140853 From owner-freebsd-fs@FreeBSD.ORG Sat Dec 19 12:00:05 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 920D6106566B; Sat, 19 Dec 2009 12:00:05 +0000 (UTC) (envelope-from delphij@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 56F5D8FC12; Sat, 19 Dec 2009 12:00:05 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id nBJC05lj002466; Sat, 19 Dec 2009 12:00:05 GMT (envelope-from delphij@freefall.freebsd.org) Received: (from delphij@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id nBJC05f1002462; Sat, 19 Dec 2009 12:00:05 GMT (envelope-from delphij) Date: Sat, 19 Dec 2009 12:00:05 GMT Message-Id: <200912191200.nBJC05f1002462@freefall.freebsd.org> To: mm@FreeBSD.org, delphij@FreeBSD.org, freebsd-fs@FreeBSD.org, delphij@FreeBSD.org From: delphij@FreeBSD.org Cc: Subject: Re: kern/141387: [zfs] [patch] zfs snapshot -r failed because filesystem was busy X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 19 Dec 2009 12:00:05 -0000 Synopsis: [zfs] [patch] zfs snapshot -r failed because filesystem was busy State-Changed-From-To: open->patched State-Changed-By: delphij State-Changed-When: Sat Dec 19 11:58:46 UTC 2009 State-Changed-Why: A patch has been applied as revision 200727. Responsible-Changed-From-To: freebsd-fs->delphij Responsible-Changed-By: delphij Responsible-Changed-When: Sat Dec 19 11:58:46 UTC 2009 Responsible-Changed-Why: Grab. http://www.freebsd.org/cgi/query-pr.cgi?pr=141387 From owner-freebsd-fs@FreeBSD.ORG Sat Dec 19 19:09:05 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 92CFB1065676 for ; Sat, 19 Dec 2009 19:09:05 +0000 (UTC) (envelope-from bfriesen@simple.dallas.tx.us) Received: from blade.simplesystems.org (blade.simplesystems.org [65.66.246.74]) by mx1.freebsd.org (Postfix) with ESMTP id 325D68FC13 for ; Sat, 19 Dec 2009 19:09:04 +0000 (UTC) Received: from freddy.simplesystems.org (freddy.simplesystems.org [65.66.246.65]) by blade.simplesystems.org (8.13.8+Sun/8.13.8) with ESMTP id nBJIlm7m028023 for ; Sat, 19 Dec 2009 12:47:48 -0600 (CST) Date: Sat, 19 Dec 2009 12:47:48 -0600 (CST) From: Bob Friesenhahn X-X-Sender: bfriesen@freddy.simplesystems.org To: freebsd-fs@freebsd.org Message-ID: User-Agent: Alpine 2.01 (GSO 1266 2009-07-14) MIME-Version: 1.0 Content-Type: TEXT/PLAIN; format=flowed; charset=US-ASCII X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.2.2 (blade.simplesystems.org [65.66.246.90]); Sat, 19 Dec 2009 12:47:48 -0600 (CST) Subject: am-utils/NFS mount lockups in 8.0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 19 Dec 2009 19:09:05 -0000 After upgrading my FreeBSD system from FreeBSD 7.2 to 8.0, the am-utils automounter is experiencing difficulty with managing the NFS client mounts to my Solaris 10U8 system. There have never been any difficulties before. 
This is when using the NFSv3/TCP client in the default kernel and not the new NFSv4 implementation. If it matters, this is for NFS exports from a ZFS pool, with an exported filesystem per user. The problem I see is that the initial mount is instantaneous and works great. After the mount times out, the re-mount produces several "NFS timeout" messages in the window where the accessing program is running. Sometimes this remount succeeds, but if it fails, then the program is left locked up forever waiting for the NFS mount. Meanwhile connectivity between the FreeBSD system and the Solaris system is fine, as illustrated by excellent connectivity with SSH and no lost packets via 'ping'. The amd log file (/var/log/amd.log) shows no sign of any difficulties. The only complaint is due to an occasional attempt to mount /home/svn even though there is no 'svn' user in /etc/passwd. Perhaps this is due to some action of portupgrade, since during this time all of the ports are being rebuilt. It is indeed quite curious that the last entry I see in amd.log (prior to entries caused by me killing the amd daemon) is an attempt to mount /home/svn: Dec 18 20:06:36 shaggy amd[651]/map: Trying mount of freddy:/home/svn on /.amd_mnt/freddy/home/svn fstype nfs mount_type non-autofs Dec 18 20:14:20 shaggy amd[651]/warn: WARNING: automounter going down on signal 15 Notice that this "Trying mount" attempt did not obtain a response from the OS and perhaps amd is essentially locked up after that point. From checking the log file, I see that there are 47 attempts to mount /home/svn since the upgrade to 8.0 and no such attempts with FreeBSD 7.2. It is difficult for me to tell if this is some issue with am-utils, or kernel code. It seems that am-utils was updated for 8.0 and there have been changes in the area of NFS as well in order to accommodate the new NFSv4 implementation. Any ideas? Bob -- Bob Friesenhahn bfriesen@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/ GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
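A few data points would help narrow down where the hang sits (a diagnostic sketch; the hostname and paths follow the example above, and <pid> stands for whichever process is stuck):

# a process wedged in the kernel shows state "D" in ps; dump its kernel stack
ps -axlww
procstat -k <pid>                 # also worth running against the amd process
# NFS client RPC counters: watch whether the retry/timeout counters grow while a process is hung
nfsstat -c
# is amd itself still responding?
amq -m
# bypass amd to see whether a plain NFSv3/TCP mount of the same export also wedges
mount -t nfs -o nfsv3,tcp freddy:/home/username /mnt

If the manual mount hangs the same way, the problem is below amd (NFS client or network stack); if it behaves, the am-utils update in 8.0 becomes the more likely suspect.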