From owner-freebsd-fs@FreeBSD.ORG Mon May 16 11:07:03 2011
Date: Mon, 16 May 2011 11:07:02 GMT
Message-Id: <201105161107.p4GB7253071197@freefall.freebsd.org>
From: FreeBSD bugmaster
To: freebsd-fs@FreeBSD.org
Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker     Resp. Description
--------------------------------------------------------------------------------
o kern/156933 fs [zfs] ZFS receive after read on readonly=on filesystem
o kern/156797 fs [zfs] [panic] Double panic with FreeBSD 9-CURRENT and
o kern/156781 fs [zfs] zfs is losing the snapshot directory,
p kern/156545 fs [ufs] mv could break UFS on SMP systems
o kern/156193 fs [ufs] [hang] UFS snapshot hangs && deadlocks processes
o kern/156168 fs [nfs] [panic] Kernel panic under concurrent access ove
o kern/156039 fs [nullfs] [unionfs] nullfs + unionfs do not compose, re
o kern/155615 fs [zfs] zfs v28 broken on sparc64 -current
o kern/155587 fs [zfs] [panic] kernel panic with zfs
o kern/155484 fs [ufs] GPT + UFS boot don't work well together
o kern/155411 fs [regression] [8.2-release] [tmpfs]: mount: tmpfs : No
o kern/155199 fs [ext2fs] ext3fs mounted as ext2fs gives I/O errors
o bin/155104 fs [zfs][patch] use /dev prefix by default when importing
o kern/154930 fs [zfs] cannot delete/unlink file from full volume -> EN
o kern/154828 fs [msdosfs] Unable to create directories on external USB
o kern/154491 fs [smbfs] smb_co_lock: recursive lock for object 1
o kern/154447 fs [zfs] [panic] Occasional panics - solaris assert somew
p kern/154228 fs [md] md getting stuck in wdrain state
o kern/153996 fs [zfs] zfs root mount error while kernel is not located
o kern/153847 fs [nfs] [panic] Kernel panic from incorrect m_free in nf
o kern/153753 fs [zfs] ZFS v15 - grammatical error when attempting to u
o kern/153716 fs [zfs] zpool scrub time remaining is incorrect
o kern/153695 fs [patch] [zfs] Booting from zpool created on 4k-sector
o kern/153680 fs [xfs] 8.1 failing to mount XFS partitions
o kern/153520 fs [zfs] Boot from GPT ZFS root on HP BL460c G1 unstable
o kern/153418 fs [zfs] [panic] Kernel Panic occurred writing to zfs vol
o kern/153351 fs [zfs] locking directories/files in ZFS
o bin/153258 fs [patch][zfs] creating ZVOLs requires `refreservation'
s kern/153173 fs [zfs] booting from a gzip-compressed dataset doesn't w
o kern/153126 fs [zfs] vdev failure, zpool=peegel type=vdev.too_small
p kern/152488 fs [tmpfs] [patch] mtime of file updated when only inode
o kern/152022 fs [nfs] nfs service hangs with linux client [regression]
o kern/151942 fs [zfs] panic during ls(1) zfs snapshot directory
o kern/151905 fs [zfs] page fault under load in /sbin/zfs
o kern/151845 fs [smbfs] [patch] smbfs should be upgraded to support Un
o bin/151713 fs [patch] Bug in growfs(8) with respect to 32-bit overfl
o kern/151648 fs [zfs] disk wait bug
o kern/151629 fs [fs] [patch] Skip empty directory entries during name
o kern/151330 fs [zfs] will unshare all zfs filesystem after execute a
o kern/151326 fs [nfs] nfs exports fail if netgroups contain duplicate
o kern/151251 fs [ufs] Can not create files on filesystem with heavy us
o kern/151226 fs [zfs] can't delete zfs snapshot
o kern/151111 fs [zfs] vnodes leakage during zfs unmount
o kern/150503 fs [zfs] ZFS disks are UNAVAIL and corrupted after reboot
o kern/150501 fs [zfs] ZFS vdev failure vdev.bad_label on amd64
o kern/150390 fs [zfs] zfs deadlock when arcmsr reports drive faulted
o kern/150336 fs [nfs] mountd/nfsd became confused; refused to reload n
o kern/150207 fs zpool(1): zpool import -d /dev tries to open weird dev
o kern/149208 fs mksnap_ffs(8) hang/deadlock
o kern/149173 fs [patch] [zfs] make OpenSolaris installa
o kern/149015 fs [zfs] [patch] misc fixes for ZFS code to build on Glib
o kern/149014 fs [zfs] [patch] declarations in ZFS libraries/utilities
o kern/149013 fs [zfs] [patch] make ZFS makefiles use the libraries fro
o kern/148504 fs [zfs] ZFS' zpool does not allow replacing drives to be
o kern/148490 fs [zfs]: zpool attach - resilver bidirectionally, and re
o kern/148368 fs [zfs] ZFS hanging forever on 8.1-PRERELEASE
o bin/148296 fs [zfs] [loader] [patch] Very slow probe in /usr/src/sys
o kern/148204 fs [nfs] UDP NFS causes overload
o kern/148138 fs [zfs] zfs raidz pool commands freeze
o kern/147903 fs [zfs] [panic] Kernel panics on faulty zfs device
o kern/147881 fs [zfs] [patch] ZFS "sharenfs" doesn't allow different "
o kern/147790 fs [zfs] zfs set acl(mode|inherit) fails on existing zfs
o kern/147560 fs [zfs] [boot] Booting 8.1-PRERELEASE raidz system take
o kern/147420 fs [ufs] [panic] ufs_dirbad, nullfs, jail panic (corrupt
o kern/146941 fs [zfs] [panic] Kernel Double Fault - Happens constantly
o kern/146786 fs [zfs] zpool import hangs with checksum errors
o kern/146708 fs [ufs] [panic] Kernel panic in softdep_disk_write_compl
o kern/146528 fs [zfs] Severe memory leak in ZFS on i386
o kern/146502 fs [nfs] FreeBSD 8 NFS Client Connection to Server
s kern/145712 fs [zfs] cannot offline two drives in a raidz2 configurat
o kern/145411 fs [xfs] [panic] Kernel panics shortly after mounting an
o bin/145309 fs bsdlabel: Editing disk label invalidates the whole dev
o kern/145272 fs [zfs] [panic] Panic during boot when accessing zfs on
o kern/145246 fs [ufs] dirhash in 7.3 gratuitously frees hashes when it
o kern/145238 fs [zfs] [panic] kernel panic on zpool clear tank
o kern/145229 fs [zfs] Vast differences in ZFS ARC behavior between 8.0
o kern/145189 fs [nfs] nfsd performs abysmally under load
o kern/144929 fs [ufs] [lor] vfs_bio.c + ufs_dirhash.c
p kern/144447 fs [zfs] sharenfs fsunshare() & fsshare_main() non functi
o kern/144416 fs [panic] Kernel panic on online filesystem optimization
s kern/144415 fs [zfs] [panic] kernel panics on boot after zfs crash
o kern/144234 fs [zfs] Cannot boot machine with recent gptzfsboot code
o kern/143825 fs [nfs] [panic] Kernel panic on NFS client
o bin/143572 fs [zfs] zpool(1): [patch] The verbose output from iostat
o kern/143212 fs [nfs] NFSv4 client strange work ...
o kern/143184 fs [zfs] [lor] zfs/bufwait LOR
o kern/142914 fs [zfs] ZFS performance degradation over time
o kern/142878 fs [zfs] [vfs] lock order reversal
o kern/142597 fs [ext2fs] ext2fs does not work on filesystems with real
o kern/142489 fs [zfs] [lor] allproc/zfs LOR
o kern/142466 fs Update 7.2 -> 8.0 on Raid 1 ends with screwed raid [re
o kern/142306 fs [zfs] [panic] ZFS drive (from OSX Leopard) causes two
o kern/142068 fs [ufs] BSD labels are got deleted spontaneously
o kern/141897 fs [msdosfs] [panic] Kernel panic. msdofs: file name leng
o kern/141463 fs [nfs] [panic] Frequent kernel panics after upgrade fro
o kern/141305 fs [zfs] FreeBSD ZFS+sendfile severe performance issues (
o kern/141091 fs [patch] [nullfs] fix panics with DIAGNOSTIC enabled
o kern/141086 fs [nfs] [panic] panic("nfs: bioread, not dir") on FreeBS
o kern/141010 fs [zfs] "zfs scrub" fails when backed by files in UFS2
o kern/140888 fs [zfs] boot fail from zfs root while the pool resilveri
o kern/140661 fs [zfs] [patch] /boot/loader fails to work on a GPT/ZFS-
o kern/140640 fs [zfs] snapshot crash
o kern/140134 fs [msdosfs] write and fsck destroy filesystem integrity
o kern/140068 fs [smbfs] [patch] smbfs does not allow semicolon in file
o kern/139725 fs [zfs] zdb(1) dumps core on i386 when examining zpool c
o kern/139715 fs [zfs] vfs.numvnodes leak on busy zfs
p bin/139651 fs [nfs] mount(8): read-only remount of NFS volume does n
o kern/139597 fs [patch] [tmpfs] tmpfs initializes va_gen but doesn't u
o kern/139564 fs [zfs] [panic] 8.0-RC1 - Fatal trap 12 at end of shutdo
o kern/139407 fs [smbfs] [panic] smb mount causes system crash if remot
o kern/138662 fs [panic] ffs_blkfree: freeing free block
o kern/138421 fs [ufs] [patch] remove UFS label limitations
o kern/138202 fs mount_msdosfs(1) see only 2Gb
o kern/136968 fs [ufs] [lor] ufs/bufwait/ufs (open)
o kern/136945 fs [ufs] [lor] filedesc structure/ufs (poll)
o kern/136944 fs [ffs] [lor] bufwait/snaplk (fsync)
o kern/136873 fs [ntfs] Missing directories/files on NTFS volume
o kern/136865 fs [nfs] [patch] NFS exports atomic and on-the-fly atomic
p kern/136470 fs [nfs] Cannot mount / in read-only, over NFS
o kern/135546 fs [zfs] zfs.ko module doesn't ignore zpool.cache filenam
o kern/135469 fs [ufs] [panic] kernel crash on md operation in ufs_dirb
o kern/135050 fs [zfs] ZFS clears/hides disk errors on reboot
o kern/134491 fs [zfs] Hot spares are rather cold...
o kern/133676 fs [smbfs] [panic] umount -f'ing a vnode-based memory dis
o kern/133174 fs [msdosfs] [patch] msdosfs must support utf-encoded int
o kern/132960 fs [ufs] [panic] panic:ffs_blkfree: freeing free frag
o kern/132397 fs reboot causes filesystem corruption (failure to sync b
o kern/132331 fs [ufs] [lor] LOR ufs and syncer
o kern/132237 fs [msdosfs] msdosfs has problems to read MSDOS Floppy
o kern/132145 fs [panic] File System Hard Crashes
o kern/131441 fs [unionfs] [nullfs] unionfs and/or nullfs not combineab
o kern/131360 fs [nfs] poor scaling behavior of the NFS server under lo
o kern/131342 fs [nfs] mounting/unmounting of disks causes NFS to fail
o bin/131341 fs makefs: error "Bad file descriptor" on the mount poin
o kern/130920 fs [msdosfs] cp(1) takes 100% CPU time while copying file
o kern/130210 fs [nullfs] Error by check nullfs
o kern/129760 fs [nfs] after 'umount -f' of a stale NFS share FreeBSD l
o kern/129488 fs [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c:
o kern/129231 fs [ufs] [patch] New UFS mount (norandom) option - mostly
o kern/129152 fs [panic] non-userfriendly panic when trying to mount(8)
o kern/127787 fs [lor] [ufs] Three LORs: vfslock/devfs/vfslock, ufs/vfs
o bin/127270 fs fsck_msdosfs(8) may crash if BytesPerSec is zero
o kern/127029 fs [panic] mount(8): trying to mount a write protected zi
o kern/126287 fs [ufs] [panic] Kernel panics while mounting an UFS file
o kern/125895 fs [ffs] [panic] kernel: panic: ffs_blkfree: freeing free
s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS
o kern/123939 fs [msdosfs] corrupts new files
o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash
o bin/122172 fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386,
o bin/121898 fs [nullfs] pwd(1)/getcwd(2) fails with Permission denied
o bin/121366 fs [zfs] [patch] Automatic disk scrubbing from periodic(8
o bin/121072 fs [smbfs] mount_smbfs(8) cannot normally convert the cha
o kern/120483 fs [ntfs] [patch] NTFS filesystem locking changes
o kern/120482 fs [ntfs] [patch] Sync style changes between NetBSD and F
o kern/118912 fs [2tb] disk sizing/geometry problem with large array
o kern/118713 fs [minidump] [patch] Display media size required for a k
o bin/118249 fs [ufs] mv(1): moving a directory changes its mtime
o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N
o kern/117954 fs [ufs] dirhash on very large directories blocks the mac
o bin/117315 fs [smbfs] mount_smbfs(8) and related options can't mount
o kern/117314 fs [ntfs] Long-filename only NTFS fs'es cause kernel pani
o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on
o bin/116980 fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f
o conf/116931 fs lack of fsck_cd9660 prevents mounting iso images with
o kern/116583 fs [ffs] [hang] System freezes for short time when using
f kern/116170 fs [panic] Kernel panic when mounting /tmp
o bin/115361 fs [zfs] mount(8) gets into a state where it won't set/un
o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui
o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala
o kern/114676 fs [ufs] snapshot creation panics: snapacct_ufs2: bad blo
o bin/114468 fs [patch] [request] add -d option to umount(8) to detach
o kern/113852 fs [smbfs] smbfs does not properly implement DFS referral
o bin/113838 fs [patch] [request] mount(8): add support for relative p
o bin/113049 fs [patch] [request] make quot(8) use getopt(3) and show
o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b
o kern/111843 fs [msdosfs] Long Names of files are incorrectly created
o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems
s bin/111146 fs [2tb] fsck(8) fails on 6T filesystem
o kern/109024 fs [msdosfs] [iconv] mount_msdosfs: msdosfs_iconv: Operat
o kern/109010 fs [msdosfs] can't mv directory within fat32 file system
o bin/107829 fs [2TB] fdisk(8): invalid boundary checking in fdisk / w
o kern/106107 fs [ufs] left-over fsck_snapshot after unfinished backgro
f kern/106030 fs [ufs] [panic] panic in ufs from geom when a dead disk
o kern/104406 fs [ufs] Processes get stuck in "ufs" state under persist
o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems
o kern/103035 fs [ntfs] Directories in NTFS mounted disc images appear
o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s
o kern/99290 fs [ntfs] mount_ntfs ignorant of cluster sizes
s bin/97498 fs [request] newfs(8) has no option to clear the first 12
o kern/97377 fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c
o kern/95222 fs [cd9660] File sections on ISO9660 level 3 CDs ignored
o kern/94849 fs [ufs] rename on UFS filesystem is not atomic
o bin/94810 fs fsck(8) incorrectly reports 'file system marked clean'
o kern/94769 fs [ufs] Multiple file deletions on multi-snapshotted fil
o kern/94733 fs [smbfs] smbfs may cause double unlock
o kern/93942 fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D
o kern/92272 fs [ffs] [hang] Filling a filesystem while creating a sna
o kern/91134 fs [smbfs] [patch] Preserve access and modification time
a kern/90815 fs [smbfs] [patch] SMBFS with character conversions somet
o kern/88657 fs [smbfs] windows client hang when browsing a samba shar
o kern/88555 fs [panic] ffs_blkfree: freeing free frag on AMD 64
o kern/88266 fs [smbfs] smbfs does not implement UIO_NOCOPY and sendfi
o bin/87966 fs [patch] newfs(8): introduce -A flag for newfs to enabl
o kern/87859 fs [smbfs] System reboot while umount smbfs.
o kern/86587 fs [msdosfs] rm -r /PATH fails with lots of small files
o bin/85494 fs fsck_ffs: unchecked use of cg_inosused macro etc.
o kern/80088 fs [smbfs] Incorrect file time setting on NTFS mounted vi
o bin/74779 fs Background-fsck checks one filesystem twice and omits
o kern/73484 fs [ntfs] Kernel panic when doing `ls` from the client si
o bin/73019 fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino
o kern/71774 fs [ntfs] NTFS cannot "see" files on a WinXP filesystem
o bin/70600 fs fsck(8) throws files away when it can't grow lost+foun
o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po
o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange
o kern/65901 fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr
o kern/61503 fs [smbfs] mount_smbfs does not work as non-root
o kern/55617 fs [smbfs] Accessing an nsmb-mounted drive via a smb expo
o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc
o kern/51583 fs [nullfs] [patch] allow to work with devices and socket
o kern/36566 fs [smbfs] System reboot with dead smb mount and umount
o kern/33464 fs [ufs] soft update inconsistencies after system crash
o bin/27687 fs fsck(8) wrapper is not properly passing options to fsc
o kern/18874 fs [2TB] 32bit NFS servers export wrong negative values t

223 problems total.
From owner-freebsd-fs@FreeBSD.ORG Mon May 16 14:18:30 2011
Date: Mon, 16 May 2011 17:18:13 +0300
From: "Vladislav V. Prodan" <universite@ukr.net>
To: freebsd-fs@freebsd.org
Message-ID: <4DD13225.6090802@ukr.net>
Subject: The problem with backing up ZFS snapshots

I use a script that backs up snapshots from the working ZFS pool to a
reserve pool.
https://gist.github.com/971271

zroot/$fs -->> tank/backup/zroot/$fs

# zfs list | grep mysql
tank/backup/zroot/mysql 2,21G  843G  612M  /backup/zroot/mysql
zroot/mysql             2,12G  438G 2,07G  /var/db/mysql
zroot/mysql/ibdata      10,3M  438G 10,0M  /var/db/mysql/ibdata
zroot/mysql/iblogs      11,2M  438G 10,0M  /var/db/mysql/iblogs

When I copy the /mysql dataset without the nested zroot/mysql/ibdata and
zroot/mysql/iblogs, those child filesystems fall off.

[23:09]mary-teresa:root->db/mysql# ll | more
total 2134129
drwx------ 2 mysql mysql  12 May  3 00:12 auth
drwx------ 2 mysql mysql 147 May  3 00:12 cacti
drwxr-xr-x 2 root  wheel   2 Apr 20 00:19 ibdata
drwxr-xr-x 2 root  wheel   2 Apr 20 00:19 iblogs

The only thing that helps is manually removing the empty ibdata and iblogs
directories, then unmounting and remounting these filesystems:

zfs umount zroot/mysql/ibdata
zfs umount zroot/mysql/iblogs
zfs mount -a

# FreeBSD 8.2-STABLE #0: Wed Apr 20 03:20:47 EEST 2011 amd64

--
Vladislav V. Prodan
VVP24-UANIC
+380[67]4584408
+380[99]4060508
vlad11@jabber.ru

From owner-freebsd-fs@FreeBSD.ORG Mon May 16 23:58:36 2011
Date: Mon, 16 May 2011 19:58:35 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: FreeBSD FS
Message-ID: <256284561.428250.1305590315172.JavaMail.root@erie.cs.uoguelph.ca>
Subject: RFC: adding a lock flags argument to VFS_FHTOVP() for FreeBSD9

Hi,

Down the road, I would like the NFS server to be able to do a
    VFS_FHTOVP(mp, &fhp->fh_fid, LK_SHARED, vpp);
similar to what is already supported for VFS_VGET(). The reason is that,
currently, when a client does read-aheads, these reads are basically
serialized because VFS_FHTOVP() gets an LK_EXCLUSIVE locked vnode for
each RPC on the server.

Like VFS_VGET(), the underlying file system can still choose to return an
LK_EXCLUSIVE locked vnode even when LK_SHARED is specified. (Some file
systems, such as FFS, just call VFS_VGET() in VFS_FHTOVP(), so for those
the flag is simply passed through to VFS_VGET().)

To minimize the risk of the patch breaking something, I have it setting
LK_EXCLUSIVE for all VFS_FHTOVP() calls so that the semantics don't
actually change. (Changing the NFS server to use LK_SHARED is a trivial
patch, but will need extensive testing, so I'm not planning on that
change for 9.0.)

If you are interested, my current patch is at:
    http://people.freebsd.org/~rmacklem/fhtovp.patch

So, does this sound like a reasonable thing to commit, once the patch is
reviewed?
rick

From owner-freebsd-fs@FreeBSD.ORG Tue May 17 09:20:20 2011
Date: Tue, 17 May 2011 12:20:11 +0300
From: Kostik Belousov <kostikbel@gmail.com>
To: Rick Macklem
Cc: FreeBSD FS
Message-ID: <20110517092011.GK48734@deviant.kiev.zoral.com.ua>
In-Reply-To: <256284561.428250.1305590315172.JavaMail.root@erie.cs.uoguelph.ca>
Subject: Re: RFC: adding a lock flags argument to VFS_FHTOVP() for FreeBSD9

On Mon, May 16, 2011 at 07:58:35PM -0400, Rick Macklem wrote:
> Hi,
>
> Down the road, I would like the NFS server to be able to do a
>     VFS_FHTOVP(mp, &fhp->fh_fid, LK_SHARED, vpp);
> similar to what is already supported for VFS_VGET(). The reason
> is that, currently, when a client does read-aheads, these reads are
> basically serialized because the VFS_FHTOVP() gets an LK_EXCLUSIVE
> locked vnode for each RPC on the server.
>
> Like VFS_VGET(), the underlying file system can still choose to
> return a LK_EXCLUSIVE locked vnode even when LK_SHARED is specified.
> (Some file systems, such as FFS, just call VFS_VGET() in VFS_FHTOVP(),
> so all that happens is that the flag is passed through to VFS_VGET()
> for those ones.)

Yes, the flag specifying the locking mode only states the minimal locking
requirement, and the filesystem is allowed to upgrade it to a stricter
lock type. E.g. UFS would only return a shared lock if the vnode was found
in the hash, AFAIR. If not told otherwise, getnewvnode(9) forces lockmgr
to convert all lock requests into exclusive ones.

> To minimize the risk of the patch breaking something, I have it setting
> LK_EXCLUSIVE for all VFS_FHTOVP() calls so that the semantics don't
> actually change. (Changing the NFS server to use LK_SHARED is a trivial
> patch, but will need extensive testing, so I'm not planning on that
> change for 9.0.)
>
> If you are interested, my current patch is at:
>     http://people.freebsd.org/~rmacklem/fhtovp.patch
>
> So, does this sound like a reasonable thing to commit, once the patch
> is reviewed?

Sure, please do it before the code slush.

From owner-freebsd-fs@FreeBSD.ORG Tue May 17 09:36:45 2011
Date: Tue, 17 May 2011 13:36:43 +0400
From: Sergey Kandaurov <pluknet@gmail.com>
To: Rick Macklem
Cc: freebsd-fs@freebsd.org
Subject: [old nfsclient] different nmount() args passed from mount vs. mount_nfs

Hi. First, sorry for the long mail; I have tried to describe the problem
in full detail.

When mounting NFS with some options, I found that /sbin/mount and
/sbin/mount_nfs pass options to nmount(2) differently, which results in
bad things (TM). I traced the options and here they are:

From mount(8) -> mount_nfs(8):
  "rw" -> ""
  "addr" -> { something valid }
  "fh" -> 5
  "sec" -> "sys"
  "nfsv3" -> 0x0 => NFSMNT_NFSV3
  "hostname" -> "dev2.mail:/home/svn/freebsd/head"
  "fstype" -> "oldnfs"
  "fspath" -> "/usr/src"
  "errmsg" -> "" (nil)

From pre-r221124 mount(8):
= "fstype" -> "oldnfs"
  "hostname" -> "dev2.mail"
= "fspath" -> "/usr/src"
  "from" -> "dev2.mail:/home/svn/freebsd/head"
= "errmsg" -> "" (nil)

Note that pre-r221124 mount(8) knows nothing about oldnfs.

1. The "hostname" option is passed differently by mount(8) and
mount_nfs(8). When I force mount(8) to mount an oldnfs file system
directly (so that the nmount(2) call is not handed off to mount_nfs(8)),
I get this error:

  ./mount -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src
  mount: dev2.mail:/home/svn/freebsd/head Invalid hostname: Invalid argument

Hmm.. this may be because mount(8) passes the value in $hostname:$path
format (see the traces above). It might be due to the different way the
old nfsclient parses args, but I am not sure; I can be wrong. Anyway, it
does not matter now. The actual problem manifests when running the command
with a pre-r221124 mount(8) binary. It knows nothing about "oldnfs" and (attention!)
calls nmount(2) directly instead of bypassing the call to the mount_nfs(8) binary as usually done, and this is the place where the "unsanitized nmount(2) args" problem is hidden. [New mount knows about "oldnfs" and passes the call to mount_oldnfs(8) that prepares all the nmount(2) args to correctly hide the problem.] To prove it, that is how old and new mount(8) work differently: 1) new mount(8) as of current mount -d -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src exec: mount_oldnfs dev2.mail:/home/svn/freebsd/head /usr/src 2) old mount(8) as of pre-r221124 ./mount -d -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src mount -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src Ok, back to the first paragraph: a different "hostname" mount option. When I first faced with this, I tried to specify value for "hostname" explicitly. Here it comes: ./mount -t oldnfs -o hostname=dev2.mail dev2.mail:/home/svn/freebsd/head /usr/src [CABOOM!] It just crashed. Do not do this :) Fatal trap 12: page fault while in kernel mode cpuid = 0; apic id = 00 fault virtual address = 0x1 fault code = supervisor read data, page not present instruction pointer = 0x20:0xffffffff805da299 stack pointer = 0x28:0xffffff807bef6240 frame pointer = 0x28:0xffffff807bef62a0 code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 2541 (mount) db> bt Tracing pid 2541 tid 100076 td 0xfffffe0001ace460 nfs_connect() at 0xffffffff805da299 = nfs_connect+0x79 nfs_request() at 0xffffffff805da978 = nfs_request+0x398 nfs_getattr() at 0xffffffff805e2a6c = nfs_getattr+0x2bc VOP_GETATTR_APV() at 0xffffffff806f4283 = VOP_GETATTR_APV+0xd3 mountnfs() at 0xffffffff805de739 = mountnfs+0x329 nfs_mount() at 0xffffffff805dffc7 = nfs_mount+0xcf7 vfs_donmount() at 0xffffffff804d46ff = vfs_donmount+0x82f nmount() at 0xffffffff804d54f3 = nmount+0x63 syscallenter() at 0xffffffff804861cb = syscallenter+0x1cb syscall() 
at 0xffffffff806ae710 = syscall+0x60 Xfast_syscall() at 0xffffffff8069922d = Xfast_syscall+0xdd --- syscall (378, FreeBSD ELF64, nmount), rip = 0x800ab444c, rsp = 0x7fffffffca48, rbp = 0x801009058 --- As you can see from the nmount(2) args traces above, mount(8) itself doesn't pass the "addr" option to the nmount(2) syscall while nfs_mount() expects to receive it, which is the problem. Later, deep in nmount(2), /sys/nfsclient/nfs_krpc.c tries to dereference the addr value and page faults here in nfs_connect(): vers = NFS_VER3; else if (nmp->nm_flag & NFSMNT_NFSV4) vers = NFS_VER4; XXX saddr is NULL, the next line will crash if (saddr->sa_family == AF_INET) if (nmp->nm_sotype == SOCK_DGRAM) nconf = getnetconfigent("udp"); I think that nfsclient, probably in sys/nfsclient/nfs_vfsops.c:mount_nfs(), should handle a missing value for the "addr" and/or "fh" mount options. It doesn't check this currently: % static int % nfs_mount(struct mount *mp) % { % struct nfs_args args = { % [...] % .addr = NULL, % }; % int error, ret, has_nfs_args_opt; % int has_addr_opt, has_fh_opt, has_hostname_opt; % struct sockaddr *nam; addr is initialized to NULL. nam is used later as a pointer to the args.addr value. % if ((mp->mnt_flag & (MNT_ROOTFS | MNT_UPDATE)) == MNT_ROOTFS) { % error = nfs_mountroot(mp); % goto out; % } We do not try to mount root; this is not ours. % if (vfs_getopt(mp->mnt_optnew, "nfs_args", NULL, NULL) == 0) { [...] % has_nfs_args_opt = 1; % } We do not use the old mount(2) interface; not ours. % if (vfs_getopt(mp->mnt_optnew, "nfsv3", NULL, NULL) == 0) % args.flags |= NFSMNT_NFSV3; mount(8) doesn't pass the nfsv3 option, so NFSMNT_NFSV3 isn't set. 
% if (vfs_getopt(mp->mnt_optnew, "addr", (void **)&args.addr, % &args.addrlen) == 0) { % has_addr_opt = 1; % if (args.addrlen > SOCK_MAXADDRLEN) { % error = ENAMETOOLONG; % goto out; % } % nam = malloc(args.addrlen, M_SONAME, % M_WAITOK); % bcopy(args.addr, nam, args.addrlen); % nam->sa_len = args.addrlen; % } mount(8) doesn't pass the addr option, so args.addr isn't set, hence struct sockaddr *nam is also NULL, and has_addr_opt is 0. % if (vfs_getopt(mp->mnt_optnew, "hostname", (void **)&args.hostname, % NULL) == 0) { % has_hostname_opt = 1; % } % if (args.hostname == NULL) { % vfs_mount_error(mp, "Invalid hostname"); % error = EINVAL; % goto out; % } I don't know why I got the error here. I didn't analyze it deeply, though. "mount: dev2.mail:/home/svn/freebsd/head Invalid hostname: Invalid argument" % if (mp->mnt_flag & MNT_UPDATE) { [...] That's not the update case; it's not ours. % if (has_nfs_args_opt) { has_nfs_args_opt is 0, as we don't use the legacy mount(2) interface, see above. So, the whole block is ignored. Though, see below. % /* % * In the 'nfs_args' case, the pointers in the args % * structure are in userland - we copy them in here. % */ % if (!has_fh_opt) { % error = copyin((caddr_t)args.fh, (caddr_t)nfh, % args.fhsize); % if (error) { % goto out; % } % args.fh = nfh; % } has_fh_opt is 0, as mount(8) didn't pass "fh" to nmount(2), though this part is not executed anyway. % if (!has_hostname_opt) { % error = copyinstr(args.hostname, hst, MNAMELEN-1, &len); % if (error) { % goto out; % } % bzero(&hst[len], MNAMELEN - len); % args.hostname = hst; has_hostname_opt is 1, as mount(8) passes "hostname" to nmount(2), though this part is not executed anyway. 
% } % if (!has_addr_opt) { % /* sockargs() call must be after above copyin() calls */ % printf("args.addr: %p\n", args.addr); % error = getsockaddr(&nam, (caddr_t)args.addr, % args.addrlen); % printf("error: %d\n", error); % if (error) { % goto out; % } % } has_addr_opt is 0, as mount(8) didn't pass "addr" to nmount(2), though this part is not executed anyway. % } % error = mountnfs(&args, mp, nam, args.hostname, &vp, % curthread->td_ucred, negnametimeo); mountnfs() is called with nam == NULL, and then it crashes deep in /sys/nfsclient/nfs_krpc.c:nfs_connect(). Also compare the ddb backtrace with one from the new mount(8), which hands the call off to mount_nfs(8). I got it by adding kdb_enter() just before the NULL pointer dereference. db> bt Tracing pid 2143 tid 100117 td 0xfffffe0001c58000 kdb_enter() at 0xffffffff80477d1b = kdb_enter+0x3b nfs_connect() at 0xffffffff805da7e8 = nfs_connect+0x88 nfs_request() at 0xffffffff805daec8 = nfs_request+0x398 nfs_fsinfo() at 0xffffffff805ddec0 = nfs_fsinfo+0xd0 mountnfs() at 0xffffffff805ded44 = mountnfs+0x3e4 nfs_mount() at 0xffffffff805e051f = nfs_mount+0xcff vfs_donmount() at 0xffffffff804d5092 = vfs_donmount+0xc92 nmount() at 0xffffffff804d5a33 = nmount+0x63 syscallenter() at 0xffffffff804866eb = syscallenter+0x1cb syscall() at 0xffffffff806aec90 = syscall+0x60 Xfast_syscall() at 0xffffffff806997ad = Xfast_syscall+0xdd --- syscall (378, FreeBSD ELF64, nmount), rip = 0x8008a544c, rsp = 0x7fffffffd258, rbp = 0x7fffffffd30c --- The two backtraces differ slightly because NFSMNT_NFSV3 is not set in the old mount(8) case. 
From sys/nfsclient/nfs_vfsops.c:mountnfs() if (argp->flags & NFSMNT_NFSV3) nfs_fsinfo(nmp, *vpp, curthread->td_ucred, curthread); else VOP_GETATTR(*vpp, &attrs, curthread->td_ucred); -- wbr, pluknet From owner-freebsd-fs@FreeBSD.ORG Tue May 17 19:33:54 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 328A11065674 for ; Tue, 17 May 2011 19:33:54 +0000 (UTC) (envelope-from a.smith@ukgrid.net) Received: from mx1.ukgrid.net (mx1.ukgrid.net [89.107.22.36]) by mx1.freebsd.org (Postfix) with ESMTP id F1D628FC21 for ; Tue, 17 May 2011 19:33:53 +0000 (UTC) Received: from [89.21.28.38] (port=39435 helo=omicron.ukgrid.net) by mx1.ukgrid.net with esmtp (Exim 4.74; FreeBSD) envelope-from a.smith@ukgrid.net envelope-to freebsd-fs@freebsd.org id 1QMPe8-000Kbw-5p; Tue, 17 May 2011 20:09:32 +0100 Received: from 81.60.137.91.dyn.user.ono.com (81.60.137.91.dyn.user.ono.com [81.60.137.91]) by webmail2.ukgrid.net (Horde Framework) with HTTP; Tue, 17 May 2011 20:09:32 +0100 Message-ID: <20110517200932.33075laonl99lx4w@webmail2.ukgrid.net> Date: Tue, 17 May 2011 20:09:32 +0100 From: a.smith@ukgrid.net To: freebsd-fs@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: 7bit User-Agent: Internet Messaging Program (IMP) H3 (4.3.9) / FreeBSD-8.1 Subject: zfs get all command hung X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 17 May 2011 19:33:54 -0000 Hi, I have a script that runs every hour, one of the commands it runs is "zfs get all mypool". The process has hung and cannot be killed. Is there anything I can do to work out what happened? This has happened before, but on older OS releases. 
The system is FreeBSD 8.2-RELEASE amd64. A truss of the process just shows nothing, thanks Andy. From owner-freebsd-fs@FreeBSD.ORG Tue May 17 19:35:27 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4F011106566B for ; Tue, 17 May 2011 19:35:27 +0000 (UTC) (envelope-from a.smith@ukgrid.net) Received: from mx0.ukgrid.net (mx0.ukgrid.net [89.21.28.41]) by mx1.freebsd.org (Postfix) with ESMTP id 13AC28FC18 for ; Tue, 17 May 2011 19:35:26 +0000 (UTC) Received: from [89.21.28.38] (port=11959 helo=omicron.ukgrid.net) by mx0.ukgrid.net with esmtp (Exim 4.74; FreeBSD) envelope-from a.smith@ukgrid.net envelope-to freebsd-fs@freebsd.org id 1QMPga-000C4y-Cz; Tue, 17 May 2011 20:12:04 +0100 Received: from 81.60.137.91.dyn.user.ono.com (81.60.137.91.dyn.user.ono.com [81.60.137.91]) by webmail2.ukgrid.net (Horde Framework) with HTTP; Tue, 17 May 2011 20:12:03 +0100 Message-ID: <20110517201203.1813683kuqivzwws@webmail2.ukgrid.net> Date: Tue, 17 May 2011 20:12:03 +0100 From: a.smith@ukgrid.net To: freebsd-fs@freebsd.org References: <20110517200932.33075laonl99lx4w@webmail2.ukgrid.net> In-Reply-To: <20110517200932.33075laonl99lx4w@webmail2.ukgrid.net> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: 7bit User-Agent: Internet Messaging Program (IMP) H3 (4.3.9) / FreeBSD-8.1 Subject: Re: zfs get all command hung X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 17 May 2011 19:35:27 -0000 PS the pool is live and up and running read/write, just any zfs get command is hanging... Quoting a.smith@ukgrid.net: > Hi, > > I have a script that runs every hour, one of the commands it runs > is "zfs get all mypool". 
The process has hung and cannot be killed. > Is there anything I can do to work out what happened? This has > happened before, but on older OS releases. The system is FreeBSD > 8.2-RELEASE amd64. A truss of the process just shows nothing, > > thanks Andy. > > > > > From owner-freebsd-fs@FreeBSD.ORG Tue May 17 20:41:54 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2E03F106564A for ; Tue, 17 May 2011 20:41:54 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 5A4A58FC20 for ; Tue, 17 May 2011 20:41:53 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id XAA03690; Tue, 17 May 2011 23:23:31 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1QMQnj-0007eK-CS; Tue, 17 May 2011 23:23:31 +0300 Message-ID: <4DD2D942.9030600@FreeBSD.org> Date: Tue, 17 May 2011 23:23:30 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.17) Gecko/20110503 Lightning/1.0b2 Thunderbird/3.1.10 MIME-Version: 1.0 To: a.smith@ukgrid.net References: <20110517200932.33075laonl99lx4w@webmail2.ukgrid.net> In-Reply-To: <20110517200932.33075laonl99lx4w@webmail2.ukgrid.net> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org Subject: Re: zfs get all command hung X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 17 May 2011 20:41:54 -0000 on 17/05/2011 22:09 a.smith@ukgrid.net said the following: > Hi, > > I 
have a script that runs every hour, one of the commands it runs is "zfs get > all mypool". The process has hung and cannot be killed. Is there anything I can > do to work out what happened? This has happened before, but on older OS > releases. The system is FreeBSD 8.2-RELEASE amd64. A truss of the process just > shows nothing, procstat -kk -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Tue May 17 20:54:13 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D0E731065674 for ; Tue, 17 May 2011 20:54:13 +0000 (UTC) (envelope-from a.smith@ukgrid.net) Received: from mx0.ukgrid.net (mx0.ukgrid.net [89.21.28.41]) by mx1.freebsd.org (Postfix) with ESMTP id 8F2618FC0C for ; Tue, 17 May 2011 20:54:13 +0000 (UTC) Received: from [89.21.28.38] (port=46293 helo=omicron.ukgrid.net) by mx0.ukgrid.net with esmtp (Exim 4.74; FreeBSD) envelope-from a.smith@ukgrid.net id 1QMRHQ-00030o-EL; Tue, 17 May 2011 21:54:12 +0100 Received: from 81.60.137.91.dyn.user.ono.com (81.60.137.91.dyn.user.ono.com [81.60.137.91]) by webmail2.ukgrid.net (Horde Framework) with HTTP; Tue, 17 May 2011 21:54:12 +0100 Message-ID: <20110517215412.879621won3gxj4v4@webmail2.ukgrid.net> Date: Tue, 17 May 2011 21:54:12 +0100 From: a.smith@ukgrid.net To: Andriy Gapon References: <20110517200932.33075laonl99lx4w@webmail2.ukgrid.net> <4DD2D942.9030600@FreeBSD.org> In-Reply-To: <4DD2D942.9030600@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: 7bit User-Agent: Internet Messaging Program (IMP) H3 (4.3.9) / FreeBSD-8.1 Cc: freebsd-fs@FreeBSD.org Subject: Re: zfs get all command hung X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 17 May 
2011 20:54:13 -0000 Quoting Andriy Gapon : > > procstat -kk > # procstat -kk 37975 PID TID COMM TDNAME KSTACK 37975 100669 zfs - mi_switch+0x176 sleepq_catch_signals+0x29e sleepq_wait_sig+0x16 _sleep+0x269 clnt_vc_create+0x153 clnt_reconnect_call+0x64d nfs_request+0x215 nfs_statfs+0x194 __vfs_statfs+0x28 kern_getfsstat+0x3fc syscallenter+0x1e5 syscall+0x4b Xfast_syscall+0xe2 And actually I was thinking: as all the zfs get commands are hanging, I can run others and truss them, of course. Here is the tail of a truss: NAME PROPERTY VALUE SOURCE write(1,"NAME PROPERTY VALU"...,58) = 58 (0x3a) mx1 type filesystem - write(1,"mx1 type file"...,53) = 53 (0x35) mx1 creation Mon Jan 17 12:08 2011 - write(1,"mx1 creation Mon "...,53) = 53 (0x35) mx1 used 78.2G - write(1,"mx1 used 78.2"...,53) = 53 (0x35) mx1 available 195G - write(1,"mx1 available 195G"...,53) = 53 (0x35) mx1 referenced 22K - write(1,"mx1 referenced 22K "...,53) = 53 (0x35) mx1 compressratio 1.27x - write(1,"mx1 compressratio 1.27"...,53) = 53 (0x35) fstat(4,{ mode=crw-rw-rw- ,inode=32,size=0,blksize=4096 }) = 0 (0x0) ioctl(4,TIOCGETA,0xffffc8c0) ERR#19 'Operation not supported by device' lseek(4,0x0,SEEK_SET) = 0 (0x0) lseek(4,0x0,SEEK_CUR) = 0 (0x0) getfsstat(0x0,0x0,0x1,0x0,0x80,0xa008) = 443 (0x1bb) Andy. 
From owner-freebsd-fs@FreeBSD.ORG Tue May 17 21:17:17 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4FC1F1065672; Tue, 17 May 2011 21:17:17 +0000 (UTC) (envelope-from gpalmer@freebsd.org) Received: from noop.in-addr.com (mail.in-addr.com [IPv6:2001:470:8:162::1]) by mx1.freebsd.org (Postfix) with ESMTP id 1FFD48FC1A; Tue, 17 May 2011 21:17:17 +0000 (UTC) Received: from gjp by noop.in-addr.com with local (Exim 4.76 (FreeBSD)) (envelope-from ) id 1QMRdk-000K3p-4E; Tue, 17 May 2011 17:17:16 -0400 Date: Tue, 17 May 2011 17:17:16 -0400 From: Gary Palmer To: a.smith@ukgrid.net Message-ID: <20110517211716.GD37035@in-addr.com> References: <20110517200932.33075laonl99lx4w@webmail2.ukgrid.net> <4DD2D942.9030600@FreeBSD.org> <20110517215412.879621won3gxj4v4@webmail2.ukgrid.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20110517215412.879621won3gxj4v4@webmail2.ukgrid.net> X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: gpalmer@freebsd.org X-SA-Exim-Scanned: No (on noop.in-addr.com); SAEximRunCond expanded to false Cc: freebsd-fs@FreeBSD.org, Andriy Gapon Subject: Re: zfs get all command hung X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 17 May 2011 21:17:17 -0000 On Tue, May 17, 2011 at 09:54:12PM +0100, a.smith@ukgrid.net wrote: > Quoting Andriy Gapon : > > > >procstat -kk > > > > # procstat -kk 37975 > PID TID COMM TDNAME KSTACK > 37975 100669 zfs - mi_switch+0x176 > sleepq_catch_signals+0x29e sleepq_wait_sig+0x16 _sleep+0x269 > clnt_vc_create+0x153 clnt_reconnect_call+0x64d nfs_request+0x215 > nfs_statfs+0x194 __vfs_statfs+0x28 kern_getfsstat+0x3fc > syscallenter+0x1e5 syscall+0x4b Xfast_syscall+0xe2 > > And actually I was thinking, as the all 
zfs get commands are hanging, > I can run others and truss them of course. Here is the tail of a truss: > > > NAME PROPERTY VALUE SOURCE > write(1,"NAME PROPERTY VALU"...,58) = 58 (0x3a) > mx1 type filesystem - > write(1,"mx1 type file"...,53) = 53 (0x35) > mx1 creation Mon Jan 17 12:08 2011 - > write(1,"mx1 creation Mon "...,53) = 53 (0x35) > mx1 used 78.2G - > write(1,"mx1 used 78.2"...,53) = 53 (0x35) > mx1 available 195G - > write(1,"mx1 available 195G"...,53) = 53 (0x35) > mx1 referenced 22K - > write(1,"mx1 referenced 22K "...,53) = 53 (0x35) > mx1 compressratio 1.27x - > write(1,"mx1 compressratio 1.27"...,53) = 53 (0x35) > fstat(4,{ mode=crw-rw-rw- ,inode=32,size=0,blksize=4096 }) = 0 (0x0) > ioctl(4,TIOCGETA,0xffffc8c0) ERR#19 'Operation not > supported by device' > lseek(4,0x0,SEEK_SET) = 0 (0x0) > lseek(4,0x0,SEEK_CUR) = 0 (0x0) > getfsstat(0x0,0x0,0x1,0x0,0x80,0xa008) = 443 (0x1bb) I'm no expert, but it looks more like you have a NFS filesystem mounted on the system and for some reason system calls to list the mounted filesystems are hanging due to the NFS mount. Is there a NFS filesystem mounted on that box and is the NFS server available and responding to NFS requests? 
Gary From owner-freebsd-fs@FreeBSD.ORG Tue May 17 21:33:06 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 471D1106564A; Tue, 17 May 2011 21:33:06 +0000 (UTC) (envelope-from a.smith@ukgrid.net) Received: from mx1.ukgrid.net (mx1.ukgrid.net [89.107.22.36]) by mx1.freebsd.org (Postfix) with ESMTP id 0A7F48FC12; Tue, 17 May 2011 21:33:05 +0000 (UTC) Received: from [89.21.28.38] (port=51540 helo=omicron.ukgrid.net) by mx1.ukgrid.net with esmtp (Exim 4.74; FreeBSD) envelope-from a.smith@ukgrid.net id 1QMRt3-000GnQ-3Y; Tue, 17 May 2011 22:33:05 +0100 Received: from 81.60.137.91.dyn.user.ono.com (81.60.137.91.dyn.user.ono.com [81.60.137.91]) by webmail2.ukgrid.net (Horde Framework) with HTTP; Tue, 17 May 2011 22:33:04 +0100 Message-ID: <20110517223304.10337hhl7w2hz4g8@webmail2.ukgrid.net> Date: Tue, 17 May 2011 22:33:04 +0100 From: a.smith@ukgrid.net To: Gary Palmer References: <20110517200932.33075laonl99lx4w@webmail2.ukgrid.net> <4DD2D942.9030600@FreeBSD.org> <20110517215412.879621won3gxj4v4@webmail2.ukgrid.net> <20110517211716.GD37035@in-addr.com> In-Reply-To: <20110517211716.GD37035@in-addr.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: 7bit User-Agent: Internet Messaging Program (IMP) H3 (4.3.9) / FreeBSD-8.1 Cc: freebsd-fs@FreeBSD.org, Andriy Gapon Subject: Re: zfs get all command hung X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 17 May 2011 21:33:06 -0000 Quoting Gary Palmer : > > I'm no expert, but it looks more like you have a NFS filesystem mounted > on the system and for some reason system calls to list the mounted > filesystems are hanging due to the NFS mount. 
Is there a NFS filesystem > mounted on that box and is the NFS server available and responding to > NFS requests? > Hi Gary, yeah think you're spot on there! There is an NFS mount used for some backups, looks like our network guys have broken something today though, seems to be blocked on the firewall! thanks for the comment, Andy. From owner-freebsd-fs@FreeBSD.ORG Wed May 18 00:37:18 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F2F8C106564A for ; Wed, 18 May 2011 00:37:18 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 8E80D8FC12 for ; Wed, 18 May 2011 00:37:18 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApwEAJET002DaFvO/2dsb2JhbACEWaI0iHCtWpB/hRKBBwSQEYcrh2Y X-IronPort-AV: E=Sophos;i="4.65,228,1304308800"; d="scan'208";a="125034784" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 17 May 2011 20:37:17 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 74164B3F28; Tue, 17 May 2011 20:37:17 -0400 (EDT) Date: Tue, 17 May 2011 20:37:17 -0400 (EDT) From: Rick Macklem To: Sergey Kandaurov Message-ID: <713535812.490291.1305679037413.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_Part_490290_2136409836.1305679037410" X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - IE7 (Win)/6.0.10_GA_2692) Cc: freebsd-fs@freebsd.org Subject: Re: [old nfsclient] different nmount() args passed from mount vs. 
mount_nfs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 18 May 2011 00:37:19 -0000 ------=_Part_490290_2136409836.1305679037410 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit > Hi. > > First, sorry for the long mail. I just tried to describe in full > details. > > When mounting nfs with some options, I found that /sbin/mount and > /sbin/mount_nfs pass options to nmount() differently, which results > in bad things (TM). I traced the options and here they are: > > From mount(8) -> mount_nfs(8): > "rw" -> "" > "addr" -> {something valid } > "fh" -> 5 > "sec" -> "sys" > "nfsv3" -> 0x0 => NFSMNT_NFSV3 > "hostname" -> "dev2.mail:/home/svn/freebsd/head" > "fstype" -> "oldnfs" > "fspath" -> "/usr/src" > "errmsg" -> "" > (nil) > > From pre-r221124 mount(8): > = "fstype" -> "oldnfs" > "hostname" -> "dev2.mail" > = "fspath" -> "/usr/src" > "from" -> "dev2.mail:/home/svn/freebsd/head" > = "errmsg" -> "" > (nil) > > Note, that pre-r221124 mount(8) knows nothing about oldnfs. > > 1. "hostname" option is passed differently from mount(8) and > mount_nfs(8). > When I force to mount oldnfs file system with mount(8) directly (to > not > bypass the nmount(2) call to mount_nfs(8)), I get this error: > ./mount -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src > mount: dev2.mail:/home/svn/freebsd/head Invalid hostname: Invalid > argument > > Hmm.. this may be because mount(8) passes value in $hostname:$path > format > (see the traces above). It might be due to different old nfsclient way > to parse > args, but I am not sure, I can be wrong. Anyway, it does not matter > now. > > The actual problem manifests when running the command with pre-r221124 > mount(8) binary. It knows nothing about "oldnfs" and (attention!) 
> calls nmount(2) > directly instead of bypassing the call to the mount_nfs(8) binary as > usually done, > and this is the place where the "unsanitized nmount(2) args" problem > is hidden. > [New mount knows about "oldnfs" and passes the call to mount_oldnfs(8) > that > prepares all the nmount(2) args to correctly hide the problem.] > > To prove it, that is how old and new mount(8) work differently: > 1) new mount(8) as of current > mount -d -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src > exec: mount_oldnfs dev2.mail:/home/svn/freebsd/head /usr/src > 2) old mount(8) as of pre-r221124 > ./mount -d -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src > mount -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src > > > Ok, back to the first paragraph: a different "hostname" mount option. > When I first faced with this, I tried to specify value for "hostname" > explicitly. Here it comes: > ./mount -t oldnfs -o hostname=dev2.mail > dev2.mail:/home/svn/freebsd/head /usr/src > [CABOOM!] > It just crashed. 
Do not do this :) > > Fatal trap 12: page fault while in kernel mode > cpuid = 0; apic id = 00 > fault virtual address = 0x1 > fault code = supervisor read data, page not present > instruction pointer = 0x20:0xffffffff805da299 > stack pointer = 0x28:0xffffff807bef6240 > frame pointer = 0x28:0xffffff807bef62a0 > code segment = base 0x0, limit 0xfffff, type 0x1b > = DPL 0, pres 1, long 1, def32 0, gran 1 > processor eflags = interrupt enabled, resume, IOPL = 0 > current process = 2541 (mount) > db> bt > Tracing pid 2541 tid 100076 td 0xfffffe0001ace460 > nfs_connect() at 0xffffffff805da299 = nfs_connect+0x79 > nfs_request() at 0xffffffff805da978 = nfs_request+0x398 > nfs_getattr() at 0xffffffff805e2a6c = nfs_getattr+0x2bc > VOP_GETATTR_APV() at 0xffffffff806f4283 = VOP_GETATTR_APV+0xd3 > mountnfs() at 0xffffffff805de739 = mountnfs+0x329 > nfs_mount() at 0xffffffff805dffc7 = nfs_mount+0xcf7 > vfs_donmount() at 0xffffffff804d46ff = vfs_donmount+0x82f > nmount() at 0xffffffff804d54f3 = nmount+0x63 > syscallenter() at 0xffffffff804861cb = syscallenter+0x1cb > syscall() at 0xffffffff806ae710 = syscall+0x60 > Xfast_syscall() at 0xffffffff8069922d = Xfast_syscall+0xdd > --- syscall (378, FreeBSD ELF64, nmount), rip = 0x800ab444c, rsp = > 0x7fffffffca48, rbp = 0x801009058 --- > > > As you might see from above nmount(2) args traces, mount(8) itself > doesn't > pass the "addr" option to the nmount(2) syscall while nfs_mount() > expects to > receive it, which is the problem. 
> Later deep in nmount(2) in /sys/nfsclient/nfs_krpc.c it tries to > dereference > addr value and page faults here in nfs_connect() : > > vers = NFS_VER3; > else if (nmp->nm_flag & NFSMNT_NFSV4) > vers = NFS_VER4; > XXX saddr is NULL, the next line will crash > if (saddr->sa_family == AF_INET) > if (nmp->nm_sotype == SOCK_DGRAM) > nconf = getnetconfigent("udp"); > > I think that nfsclient, probably in > sys/nfsclient/nfs_vfsops.c:mount_nfs(), > should handle a missing value for "addr" and/or "fh" mount options. > It doesn't check it currently: > Yes, at least for the case of "addr". I'm not sure if a zero length fh is considered ok for the old client or not. (It is valid for the new one.) I've attached a patch that does the check for the "addr=" option for both clients. You can test that if you'd like. It should avoid the crash. Since "oldnfs" didn't exist as a file system type pre-r221124, I don't think you can expect a pre-r221124 mount(8) to be able to mount it. (It will work for the default "nfs", it will just use the new NFS client.) > % static int > % nfs_mount(struct mount *mp) > % { > % struct nfs_args args = { > % [...] > % .addr = NULL, > % }; > % int error, ret, has_nfs_args_opt; > % int has_addr_opt, has_fh_opt, has_hostname_opt; > % struct sockaddr *nam; > > addr is initialized with NULL. num used later as a pointer to > args.addr value. > > % if ((mp->mnt_flag & (MNT_ROOTFS | MNT_UPDATE)) == MNT_ROOTFS) { > % error = nfs_mountroot(mp); > % goto out; > % } > > We do not try to mount root, this is not ours. > > % if (vfs_getopt(mp->mnt_optnew, "nfs_args", NULL, NULL) == 0) { > [...] > % has_nfs_args_opt = 1; > % } > > We do not use old mount(2) interface, not ours. > > % if (vfs_getopt(mp->mnt_optnew, "nfsv3", NULL, NULL) == 0) > % args.flags |= NFSMNT_NFSV3; > > mount(8) doesn't pass nfsv3 option, so NFSMNT_NFSV3 isn't set. 
> > % if (vfs_getopt(mp->mnt_optnew, "addr", (void **)&args.addr, > % &args.addrlen) == 0) { > % has_addr_opt = 1; > % if (args.addrlen > SOCK_MAXADDRLEN) { > % error = ENAMETOOLONG; > % goto out; > % } > % nam = malloc(args.addrlen, M_SONAME, > % M_WAITOK); > % bcopy(args.addr, nam, args.addrlen); > % nam->sa_len = args.addrlen; > % } > > mount(8) doesn't pass addr option, so args.addr isn't set, hence > struct sockaddr *nam is also NULL, has_addr_opt is 0. > > % if (vfs_getopt(mp->mnt_optnew, "hostname", (void **)&args.hostname, > % NULL) == 0) { > % has_hostname_opt = 1; > % } > % if (args.hostname == NULL) { > % vfs_mount_error(mp, "Invalid hostname"); > % error = EINVAL; > % goto out; > % } > > I don't know why I got here the error. I didn't analyze it deep > though. > "mount: dev2.mail:/home/svn/freebsd/head Invalid hostname: Invalid > argument" You'll get this if there is no hostname="xxx" argument specified, which I believe is correct. ------=_Part_490290_2136409836.1305679037410 Content-Type: text/x-patch; name=nfsmnt.patch Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename=nfsmnt.patch LS0tIG5mc2NsaWVudC9uZnNfdmZzb3BzLmMuc2F2CTIwMTEtMDUtMTcgMTk6NDg6MTUuMDAwMDAw MDAwIC0wNDAwCisrKyBuZnNjbGllbnQvbmZzX3Zmc29wcy5jCTIwMTEtMDUtMTcgMjA6MDA6NDYu MDAwMDAwMDAwIC0wNDAwCkBAIC0xMTQ5LDYgKzExNDksMTAgQEAgbmZzX21vdW50KHN0cnVjdCBt b3VudCAqbXApCiAJCQkJZ290byBvdXQ7CiAJCQl9CiAJCX0KKwl9IGVsc2UgaWYgKGhhc19hZGRy X29wdCA9PSAwKSB7CisJCXZmc19tb3VudF9lcnJvcihtcCwgIk5vIHNlcnZlciBhZGRyZXNzIik7 CisJCWVycm9yID0gRUlOVkFMOworCQlnb3RvIG91dDsKIAl9CiAJZXJyb3IgPSBtb3VudG5mcygm YXJncywgbXAsIG5hbSwgYXJncy5ob3N0bmFtZSwgJnZwLAogCSAgICBjdXJ0aHJlYWQtPnRkX3Vj cmVkLCBuZWduYW1ldGltZW8pOwotLS0gZnMvbmZzY2xpZW50L25mc19jbHZmc29wcy5jLnNhdgky MDExLTA1LTE3IDE4OjU2OjQ3LjAwMDAwMDAwMCAtMDQwMAorKysgZnMvbmZzY2xpZW50L25mc19j bHZmc29wcy5jCTIwMTEtMDUtMTcgMjA6MTA6NDcuMDAwMDAwMDAwIC0wNDAwCkBAIC0xMDc5LDE1 ICsxMDc5LDIxIEBAIG5mc19tb3VudChzdHJ1Y3QgbW91bnQgKm1wKQogCQlkaXJwYXRoWzBdID0g 
J1wwJzsKIAlkaXJsZW4gPSBzdHJsZW4oZGlycGF0aCk7CiAKLQlpZiAoaGFzX25mc19hcmdzX29w dCA9PSAwICYmIHZmc19nZXRvcHQobXAtPm1udF9vcHRuZXcsICJhZGRyIiwKLQkgICAgKHZvaWQg KiopJmFyZ3MuYWRkciwgJmFyZ3MuYWRkcmxlbikgPT0gMCkgewotCQlpZiAoYXJncy5hZGRybGVu ID4gU09DS19NQVhBRERSTEVOKSB7Ci0JCQllcnJvciA9IEVOQU1FVE9PTE9ORzsKKwlpZiAoaGFz X25mc19hcmdzX29wdCA9PSAwKSB7CisJCWlmICh2ZnNfZ2V0b3B0KG1wLT5tbnRfb3B0bmV3LCAi YWRkciIsCisJCSAgICAodm9pZCAqKikmYXJncy5hZGRyLCAmYXJncy5hZGRybGVuKSA9PSAwKSB7 CisJCQlpZiAoYXJncy5hZGRybGVuID4gU09DS19NQVhBRERSTEVOKSB7CisJCQkJZXJyb3IgPSBF TkFNRVRPT0xPTkc7CisJCQkJZ290byBvdXQ7CisJCQl9CisJCQluYW0gPSBtYWxsb2MoYXJncy5h ZGRybGVuLCBNX1NPTkFNRSwgTV9XQUlUT0spOworCQkJYmNvcHkoYXJncy5hZGRyLCBuYW0sIGFy Z3MuYWRkcmxlbik7CisJCQluYW0tPnNhX2xlbiA9IGFyZ3MuYWRkcmxlbjsKKwkJfSBlbHNlIHsK KwkJCXZmc19tb3VudF9lcnJvcihtcCwgIk5vIHNlcnZlciBhZGRyZXNzIik7CisJCQllcnJvciA9 IEVJTlZBTDsKIAkJCWdvdG8gb3V0OwogCQl9Ci0JCW5hbSA9IG1hbGxvYyhhcmdzLmFkZHJsZW4s IE1fU09OQU1FLCBNX1dBSVRPSyk7Ci0JCWJjb3B5KGFyZ3MuYWRkciwgbmFtLCBhcmdzLmFkZHJs ZW4pOwotCQluYW0tPnNhX2xlbiA9IGFyZ3MuYWRkcmxlbjsKIAl9CiAKIAlhcmdzLmZoID0gbmZo Owo= ------=_Part_490290_2136409836.1305679037410-- From owner-freebsd-fs@FreeBSD.ORG Wed May 18 06:29:51 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0DD0D106564A for ; Wed, 18 May 2011 06:29:50 +0000 (UTC) (envelope-from pvz@itassistans.se) Received: from zcs1.itassistans.net (zcs1.itassistans.net [212.112.191.37]) by mx1.freebsd.org (Postfix) with ESMTP id 5F2928FC0A for ; Wed, 18 May 2011 06:29:50 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by zcs1.itassistans.net (Postfix) with ESMTP id 12B8BC01C5 for ; Wed, 18 May 2011 08:13:15 +0200 (CEST) X-Virus-Scanned: amavisd-new at zcs1.itassistans.net Received: from zcs1.itassistans.net ([127.0.0.1]) by localhost (zcs1.itassistans.net [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ell1upkySAWh; Wed, 18 May 2011 
08:13:14 +0200 (CEST) Received: from [192.168.1.239] (c213-89-160-61.bredband.comhem.se [213.89.160.61]) by zcs1.itassistans.net (Postfix) with ESMTPSA id 5033DC01B4; Wed, 18 May 2011 08:13:14 +0200 (CEST) From: Per von Zweigbergk Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Date: Wed, 18 May 2011 08:13:13 +0200 To: freebsd-fs@freebsd.org Message-Id: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> Mime-Version: 1.0 (Apple Message framework v1082) X-Mailer: Apple Mail (2.1082) Subject: HAST + ZFS self healing? Hot spares? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 18 May 2011 06:29:51 -0000 I've been investigating HAST as a possibility in adding synchronous replication and failover to a set of two NFS servers backed by ZFS. The servers themselves contain quite a few disks. 20 of them (7200 RPM SAS disks), to be exact. (If I didn't lose count again...) Plus two quick but small SSDs for ZIL and two not-as-quick but larger SSDs for L2ARC. These machines weren't originally designed with synchronous replication in mind - they were designed to be NFS file servers (used as VMware data stores) backed by ZFS. They contain LSI MegaRaid 9260 controllers (as an aside, these were perhaps not the best choice for ZFS since they lack a true JBOD mode; I have worked around this by making single-disk RAID-0 arrays, and then using those single-disk arrays to make up the zpool). Now, I've been considering making an active/passive (or, possibly, active/passive + passive/active) synchronously replicated pair of servers out of these, and my eyes fall on HAST. Initially, my thoughts land on simply creating HAST resources for the corresponding pairs of disks and SSDs in servers A and B, and then using these HAST resources to make up the ZFS pool. 
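[Editor's note] For reference, the "one HAST resource per disk pair" layout described above would look roughly like the following in hast.conf(5) terms; the hostnames (nfs-a, nfs-b) and device path are made up for illustration:

```
resource disk0 {
	on nfs-a {
		local /dev/mfid0
		remote nfs-b
	}
	on nfs-b {
		local /dev/mfid0
		remote nfs-a
	}
}
# ...one such resource block per disk/SSD pair; the zpool is then
# built from the provider devices /dev/hast/disk0, /dev/hast/disk1, ...
```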
But this raises two questions:

---

1. Hardware failure management. In case of a hardware failure, I'm not exactly sure what will happen, but I suspect the single-disk RAID-0 array containing the failed disk will simply fail. I assume it will still exist, but refuse to be read or written. In this situation I understand HAST will handle this by routing all I/O to the secondary server, in case the disk on the primary side dies, or simply by cutting off replication if the disk on the secondary server fails.

I have not seen any "hot spare" mechanism in HAST, but I would think that I could edit the cluster configuration file to manually configure a hot spare in case I receive an alert. Would I have to restart all of hastd to do this, though? Or is it sufficient to bring the resource into init and back into secondary using hastctl?

Of course it may be infinitely simpler just to configure spares on the ZFS level, keep entire spare HAST resources, and just do a zfs replace, replacing an entire array of two disks whenever one of the disks in an array fails. Still, it would be good to know what I can reconfigure on-the-fly with HAST itself.

---

2. ZFS self-healing. As far as I understand it, ZFS does self-healing, in that all data is checksummed, and if one disk in a mirror happens to contain corrupted data, ZFS will re-read the same data from the other disk in the ZFS mirror. I don't see any way this could work in a configuration where ZFS is not mirroring itself, but rather, running on top of HAST, currently. Am I wrong about this? Or is there any way to achieve this same self-healing effect except with HAST?

---

So, what is it, do I have to give up ZFS's self-healing (one of the really neat features in ZFS) if I go for HAST? Of course, I could mirror the drives first with HAST, and then mirror the two HAST mirrors using a ZFS mirror, but that would be wasteful and a little silly.
I might even be able to get away with using "copies=2" in this scenario. Or I could use raid-z on top of the mirrors, wasting less disk, but causing a performance hit.

I mean, ideally, ZFS would have a really neat synchronous replication feature built into it. Or ZFS could be HAST-aware, and know how to ask HAST to bring it a copy of a block of data on the remote block device in a HAST mirror in case the checksum on the local block device doesn't match. Or HAST would itself have some kind of block-level checksums, and do self-healing itself. (This would probably be the easiest to implement. The secondary site could even continually be reading the same data as the primary site is, merely to check the checksums on disk, not to send it over the wire. It's not like it's doing anything else useful with that untapped read performance.)

So, what's the current state of solving this problem? Is there any work being done in this area? Have I overlooked some technology I might use to achieve this goal?

From owner-freebsd-fs@FreeBSD.ORG Wed May 18 07:59:48 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D8DCC106564A for ; Wed, 18 May 2011 07:59:48 +0000 (UTC) (envelope-from daniel@digsys.bg) Received: from smtp-sofia.digsys.bg (smtp-sofia.digsys.bg [193.68.3.230]) by mx1.freebsd.org (Postfix) with ESMTP id 77E498FC12 for ; Wed, 18 May 2011 07:59:48 +0000 (UTC) Received: from dcave.digsys.bg (dcave.digsys.bg [192.92.129.5]) (authenticated bits=0) by smtp-sofia.digsys.bg (8.14.4/8.14.4) with ESMTP id p4I7xbB7038788 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Wed, 18 May 2011 10:59:42 +0300 (EEST) (envelope-from daniel@digsys.bg) Message-ID: <4DD37C69.5020005@digsys.bg> Date: Wed, 18 May 2011 10:59:37 +0300 From: Daniel Kalchev User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.15)
Gecko/20110307 Thunderbird/3.1.9 MIME-Version: 1.0 To: freebsd-fs@freebsd.org References: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> In-Reply-To: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: HAST + ZFS self healing? Hot spares? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 18 May 2011 07:59:48 -0000

On 18.05.11 09:13, Per von Zweigbergk wrote:
> I've been investigating HAST as a possibility for adding synchronous replication and failover to a set of two NFS servers backed by ZFS. The servers themselves contain quite a few disks. 20 of them (7200 RPM SAS disks), to be exact. (If I didn't lose count again...) Plus two quick but small SSDs for ZIL and two not-as-quick but larger SSDs for L2ARC.

Your idea is to have a hot standby server to replace the primary, should the primary fail (hardware-wise)? You will probably need CARP in addition to HAST in order to maintain the same shared IP address.

> Initially, my thoughts land on simply creating HAST resources for the corresponding pairs of disks and SSDs in servers A and B, and then using these HAST resources to make up the ZFS pool.

This would be the most natural decision, especially if you have identical hardware on both servers. Let's call this variant 1.

Variant 2 would be to create local ZFS pools (as you already have) and then create ZVOLs there that are managed by HAST. Then you will use the HAST provider for whatever storage needs you have. Performance might not be what you expect, and you need to trust HAST for the checksumming.

> 1. Hardware failure management. In case of a hardware failure, I'm not exactly sure what will happen, but I suspect the single-disk RAID-0 array containing the failed disk will simply fail.
> I assume it will still exist, but refuse to be read or written. In this situation I understand HAST will handle this by routing all I/O to the secondary server, in case the disk on the primary side dies, or simply by cutting off replication if the disk on the secondary server fails.

Having local ZFS makes hardware management easier, as ZFS is designed for this. This is variant 2. In your case, with variant 1 you will have several issues:

- You have to handle disk failure and array management at the controller level. You need to check whether this will work - you may end up with a new array name and thus have to edit config files.
- There is no hot spare mechanism in HAST, and I do not believe you can switch to secondary easily. Switching to secondary will certainly make the HAST device node disappear on the primary server and reappear on the secondary server. Maybe someone can suggest a proper way to handle this.

> 2. ZFS self-healing. As far as I understand it, ZFS does self-healing, in that all data is checksummed, and if one disk in a mirror happens to contain corrupted data, ZFS will re-read the same data from the other disk in the ZFS mirror. I don't see any way this could work in a configuration where ZFS is not mirroring itself, but rather, running on top of HAST, currently. Am I wrong about this? Or is there any way to achieve this same self-healing effect except with HAST?

HAST is a simple mirror. It only makes sure the blocks on the local and remote drives contain the same data. I do not believe it has strong enough checksumming to compare with ZFS. Therefore, your best bet is to use ZFS on top of HAST for the best data protection.

In your example, you will need to create 20 HAST resources, one out of each disk. Then create ZFS on top of these HAST resources. ZFS will then be able to heal itself in case there are inconsistencies in the data on the HAST resources (for whatever reason).

Some reported they used HAST for the SLOG as well.
I do not know if using HAST for the L2ARC makes any sense. On failure you will import the pool on the slave node, and this will wipe the L2ARC anyway.

> I mean, ideally, ZFS would have a really neat synchronous replication feature built into it. Or ZFS could be HAST-aware, and know how to ask HAST to bring it a copy of a block of data on the remote block device in a HAST mirror in case the checksum on the local block device doesn't match. Or HAST would itself have some kind of block-level checksums, and do self-healing itself. (This would probably be the easiest to implement. The secondary site could even continually be reading the same data as the primary site is, merely to check the checksums on disk, not to send it over the wire. It's not like it's doing anything else useful with that untapped read performance.)

With HAST, no (hast) storage providers exist on the secondary node. Therefore, you cannot do any I/O on the secondary node until it becomes primary.

I, too, would be interested in the failure management scenario with HAST+ZFS, as I am currently experimenting with a similar system.
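The layout discussed in this thread - one HAST resource per underlying disk, with ZFS doing the mirroring on top - can be sketched in /etc/hast.conf. Everything below is illustrative: the host names (filera/filerb), IP addresses, and da* device names are placeholders invented for the sketch, not taken from the thread.

```
# Hypothetical hast.conf fragment: one resource per physical disk.
# Host names, addresses and device paths are placeholders.
resource disk0 {
	on filera {
		local /dev/da0
		remote 10.0.0.2
	}
	on filerb {
		local /dev/da0
		remote 10.0.0.1
	}
}

resource disk1 {
	on filera {
		local /dev/da1
		remote 10.0.0.2
	}
	on filerb {
		local /dev/da1
		remote 10.0.0.1
	}
}

# ...and so on for the remaining disks. On whichever node is primary,
# the resources appear as /dev/hast/disk0, /dev/hast/disk1, ..., and
# the pool would be built from those, e.g.:
#   zpool create tank mirror hast/disk0 hast/disk1 ...
```

On the secondary node the /dev/hast/* device nodes do not exist at all, which matches the point that no I/O is possible there until the node is promoted to primary.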
Daniel From owner-freebsd-fs@FreeBSD.ORG Wed May 18 08:37:58 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E3E63106564A for ; Wed, 18 May 2011 08:37:58 +0000 (UTC) (envelope-from pvz@itassistans.se) Received: from zcs1.itassistans.net (zcs1.itassistans.net [212.112.191.37]) by mx1.freebsd.org (Postfix) with ESMTP id 7D82F8FC14 for ; Wed, 18 May 2011 08:37:58 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by zcs1.itassistans.net (Postfix) with ESMTP id D48A5C01C6 for ; Wed, 18 May 2011 10:37:56 +0200 (CEST) X-Virus-Scanned: amavisd-new at zcs1.itassistans.net Received: from zcs1.itassistans.net ([127.0.0.1]) by localhost (zcs1.itassistans.net [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Y2dn0jPVpUAT for ; Wed, 18 May 2011 10:37:56 +0200 (CEST) Received: from [10.0.10.11] (unknown [212.112.191.49]) by zcs1.itassistans.net (Postfix) with ESMTPSA id 36805C01C5 for ; Wed, 18 May 2011 10:37:56 +0200 (CEST) Message-ID: <4DD3855E.8020802@itassistans.se> Date: Wed, 18 May 2011 10:37:50 +0200 From: Per von Zweigbergk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.17) Gecko/20110414 Lightning/1.0b2 Thunderbird/3.1.10 MIME-Version: 1.0 To: freebsd-fs@freebsd.org References: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> <4DD37C69.5020005@digsys.bg> In-Reply-To: <4DD37C69.5020005@digsys.bg> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: HAST + ZFS self healing? Hot spares? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 18 May 2011 08:37:59 -0000 On 2011-05-18 09:59, Daniel Kalchev wrote: > Your idea is to have hot standby server, to replace the primary, > should the primary fail (hardware-wise)? 
> You will probably need CARP in addition to HAST in order to maintain the same shared IP address.

Yes, CARP would be required to handle the actual failover.

>> Initially, my thoughts land on simply creating HAST resources for the corresponding pairs of disks and SSDs in servers A and B, and then using these HAST resources to make up the ZFS pool.

> This would be the most natural decision, especially if you have identical hardware on both servers. Let's call this variant 1.
>
> Variant 2 would be to create local ZFS pools (as you already have) and then create ZVOLs there that are managed by HAST. Then you will use the HAST provider for whatever storage needs you have. Performance might not be what you expect, and you need to trust HAST for the checksumming.

This is a really neat idea, and it is going to be a ton easier to configure than anything else. This would mean that you'd be running a stack looking like:

- ZFS, on top of:
  - One HAST resource, on top of:
    - Two ZVOLs, each on top of:
      - ZFS, on top of:
        - Local storage (mirrored by ZFS)

This still means data will be mirrored twice - stored on 4 HDDs - but the configuration will be a ton cleaner than managing a 20-resource HAST configuration monstrosity.

It would be an option to run VMFS on top, exporting it over iSCSI, rather than running ZFS on top, exporting it over NFS. I have a feeling that might be less overhead in the end, although it's less convenient from a management point of view (unless FreeBSD has gained the ability to mount VMFS while I wasn't looking).

>> 2. ZFS self-healing. As far as I understand it, ZFS does self-healing, in that all data is checksummed, and if one disk in a mirror happens to contain corrupted data, ZFS will re-read the same data from the other disk in the ZFS mirror. I don't see any way this could work in a configuration where ZFS is not mirroring itself, but rather, running on top of HAST, currently. Am I wrong about this?
>> Or is there any way to achieve this same self-healing effect except with HAST?

> HAST is a simple mirror. It only makes sure the blocks on the local and remote drives contain the same data. I do not believe it has strong enough checksumming to compare with ZFS. Therefore, your best bet is to use ZFS on top of HAST for the best data protection.

Does it actually make sure the blocks on the local and remote drives contain the same data, though? I don't remember reading anything about a cross-check between the two drives in case of data corruption, like ZFS does. Although in your described "variant 2" this won't be a problem.

> In your example, you will need to create 20 HAST resources, one out of each disk. Then create ZFS on top of these HAST resources. ZFS will then be able to heal itself in case there are inconsistencies in the data on the HAST resources (for whatever reason).
>
> Some reported they used HAST for the SLOG as well. I do not know if using HAST for the L2ARC makes any sense. On failure you will import the pool on the slave node, and this will wipe the L2ARC anyway.

Yes, running HAST on the L2ARC doesn't make much sense. I'd have to run HAST on the ZIL, though, if I opted for variant 1 (which I don't think I will).

> I mean, ideally, ZFS would have a really neat synchronous replication feature built into it. Or ZFS could be HAST-aware, and know how to ask HAST to bring it a copy of a block of data on the remote block device in a HAST mirror in case the checksum on the local block device doesn't match. Or HAST would itself have some kind of block-level checksums, and do self-healing itself. (This would probably be the easiest to implement. The secondary site could even continually be reading the same data as the primary site is, merely to check the checksums on disk, not to send it over the wire. It's not like it's doing anything else useful with that untapped read performance.)
> With HAST, no (hast) storage providers exist on the secondary node. > Therefore, you cannot do any I/O on the secondary node, until it > becomes primary. I did not mean accessing any of the storage on the secondary node itself, I meant accessing the blocks *as stored on the secondary node* on the primary node. HAST will already do this in case of a read error on the primary node. From owner-freebsd-fs@FreeBSD.ORG Wed May 18 08:53:13 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5E25E106566B for ; Wed, 18 May 2011 08:53:13 +0000 (UTC) (envelope-from pluknet@gmail.com) Received: from mail-qw0-f54.google.com (mail-qw0-f54.google.com [209.85.216.54]) by mx1.freebsd.org (Postfix) with ESMTP id 157EB8FC08 for ; Wed, 18 May 2011 08:53:12 +0000 (UTC) Received: by qwc9 with SMTP id 9so909422qwc.13 for ; Wed, 18 May 2011 01:53:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=EHXkU5MwRxgzAsiJfoNRUuzn5b0xlVphlzMGPJoqd1M=; b=vfuITiamL5oRe7jF94bkDeleoqelSabVAGnEQZY6uAVzcus8B8lXSp2LfdwOLoGVDi N0ku/nUOECmKxRo8M2hMNA7ZnMPjWPQUSSiWgvyjyQ1su2IfIWLINYWj+pGY2mx/t5Kq r+2rju2Iom0g8A6GNfZLavuYxX9JHaLzPoVRw= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=mxmLYf4hiR1OpP7Ui09EYJQgJM9Q2wEQLYDtXxoCb8RXYELs+A1RudA6G3kWxwmJfp fTnpgzFa51d8S56hCUR3+bAXO76rVDFNVR2RWsJdPK0AJVYynBTA1rRBrDTSnowPd9Rg dUH8TfPjILkBH0yiq0jVBcFSw2FFCRqRZb2o8= MIME-Version: 1.0 Received: by 10.229.67.142 with SMTP id r14mr1205257qci.209.1305708792220; Wed, 18 May 2011 01:53:12 -0700 (PDT) Received: by 10.229.111.218 with HTTP; Wed, 18 May 2011 01:53:12 -0700 (PDT) In-Reply-To: <713535812.490291.1305679037413.JavaMail.root@erie.cs.uoguelph.ca> 
References: <713535812.490291.1305679037413.JavaMail.root@erie.cs.uoguelph.ca> Date: Wed, 18 May 2011 12:53:12 +0400 Message-ID: From: Sergey Kandaurov To: Rick Macklem Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs@freebsd.org Subject: Re: [old nfsclient] different nmount() args passed from mount vs. mount_nfs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 18 May 2011 08:53:13 -0000

On 18 May 2011 04:37, Rick Macklem wrote:
>> Hi.
>>
>> First, sorry for the long mail. I just tried to describe it in full detail.
>>
>> When mounting nfs with some options, I found that /sbin/mount and /sbin/mount_nfs pass options to nmount() differently, which results in bad things (TM). I traced the options and here they are:
>>
>> From mount(8) -> mount_nfs(8):
>> "rw" -> ""
>> "addr" -> {something valid }
>> "fh" -> 5
>> "sec" -> "sys"
>> "nfsv3" -> 0x0 => NFSMNT_NFSV3
>> "hostname" -> "dev2.mail:/home/svn/freebsd/head"
>> "fstype" -> "oldnfs"
>> "fspath" -> "/usr/src"
>> "errmsg" -> ""
>> (nil)
>>
>> From pre-r221124 mount(8):
>> = "fstype" -> "oldnfs"
>> "hostname" -> "dev2.mail"
>> = "fspath" -> "/usr/src"
>> "from" -> "dev2.mail:/home/svn/freebsd/head"
>> = "errmsg" -> ""
>> (nil)
>>
>> Note that pre-r221124 mount(8) knows nothing about oldnfs.
>>
>> 1. The "hostname" option is passed differently from mount(8) and mount_nfs(8). When I force a mount of an oldnfs file system with mount(8) directly (to not bypass the nmount(2) call to mount_nfs(8)), I get this error:
>> ./mount -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src
>> mount: dev2.mail:/home/svn/freebsd/head Invalid hostname: Invalid argument
>>
>> Hmm.. this may be because mount(8) passes the value in $hostname:$path format (see the traces above).
>> It might be due to the different way the old nfsclient parses args, but I am not sure; I could be wrong. Anyway, it does not matter now.
>>
>> The actual problem manifests when running the command with a pre-r221124 mount(8) binary. It knows nothing about "oldnfs" and (attention!) calls nmount(2) directly instead of bypassing the call to the mount_nfs(8) binary as usually done, and this is where the "unsanitized nmount(2) args" problem is hidden. [New mount knows about "oldnfs" and passes the call to mount_oldnfs(8), which prepares all the nmount(2) args and so correctly hides the problem.]
>>
>> To prove it, this is how old and new mount(8) work differently:
>> 1) new mount(8) as of current
>> mount -d -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src
>> exec: mount_oldnfs dev2.mail:/home/svn/freebsd/head /usr/src
>> 2) old mount(8) as of pre-r221124
>> ./mount -d -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src
>> mount -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src
>>
>> Ok, back to the first paragraph: a different "hostname" mount option. When I first faced this, I tried to specify the value for "hostname" explicitly. Here it comes:
>> ./mount -t oldnfs -o hostname=dev2.mail dev2.mail:/home/svn/freebsd/head /usr/src
>> [CABOOM!]
>> It just crashed.
>> Do not do this :)
>>
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 0; apic id = 00
>> fault virtual address = 0x1
>> fault code = supervisor read data, page not present
>> instruction pointer = 0x20:0xffffffff805da299
>> stack pointer = 0x28:0xffffff807bef6240
>> frame pointer = 0x28:0xffffff807bef62a0
>> code segment = base 0x0, limit 0xfffff, type 0x1b
>> = DPL 0, pres 1, long 1, def32 0, gran 1
>> processor eflags = interrupt enabled, resume, IOPL = 0
>> current process = 2541 (mount)
>> db> bt
>> Tracing pid 2541 tid 100076 td 0xfffffe0001ace460
>> nfs_connect() at 0xffffffff805da299 = nfs_connect+0x79
>> nfs_request() at 0xffffffff805da978 = nfs_request+0x398
>> nfs_getattr() at 0xffffffff805e2a6c = nfs_getattr+0x2bc
>> VOP_GETATTR_APV() at 0xffffffff806f4283 = VOP_GETATTR_APV+0xd3
>> mountnfs() at 0xffffffff805de739 = mountnfs+0x329
>> nfs_mount() at 0xffffffff805dffc7 = nfs_mount+0xcf7
>> vfs_donmount() at 0xffffffff804d46ff = vfs_donmount+0x82f
>> nmount() at 0xffffffff804d54f3 = nmount+0x63
>> syscallenter() at 0xffffffff804861cb = syscallenter+0x1cb
>> syscall() at 0xffffffff806ae710 = syscall+0x60
>> Xfast_syscall() at 0xffffffff8069922d = Xfast_syscall+0xdd
>> --- syscall (378, FreeBSD ELF64, nmount), rip = 0x800ab444c, rsp = 0x7fffffffca48, rbp = 0x801009058 ---
>>
>> As you might see from the nmount(2) args traces above, mount(8) itself doesn't pass the "addr" option to the nmount(2) syscall, while nfs_mount() expects to receive it, which is the problem.
>> Later, deep in nmount(2) in /sys/nfsclient/nfs_krpc.c, it tries to dereference the addr value and page faults here in nfs_connect():
>>
>> 		vers = NFS_VER3;
>> 	else if (nmp->nm_flag & NFSMNT_NFSV4)
>> 		vers = NFS_VER4;
>> 	/* XXX saddr is NULL, the next line will crash */
>> 	if (saddr->sa_family == AF_INET)
>> 		if (nmp->nm_sotype == SOCK_DGRAM)
>> 			nconf = getnetconfigent("udp");
>>
>> I think that nfsclient, probably in sys/nfsclient/nfs_vfsops.c:mount_nfs(), should handle a missing value for the "addr" and/or "fh" mount options. It doesn't check it currently:
>>
> Yes, at least for the case of "addr". I'm not sure if a zero length fh is considered ok for the old client or not. (It is valid for the new one.)
>
> I've attached a patch that does the check for the "addr=" option for both clients. You can test that if you'd like. It should avoid the crash.

Thank you very much. After the patch is applied, at least the old nfsclient works as expected. (I didn't test the new nfsclient.)
./mount -t oldnfs -o hostname=dev2.mail dev2.mail:/home/svn/freebsd/head /usr/src
mount: dev2.mail:/home/svn/freebsd/head No server address: Invalid argument

Can you commit the patch?

> Since "oldnfs" didn't exist as a file system type pre-r221124, I don't think you can expect a pre-r221124 mount to be able to be done for it.

I see. My only concern was a crash.

> (It will work for the default "nfs", it will just use the new NFS client.)

>> % static int
>> % nfs_mount(struct mount *mp)
>> % {
>> % 	struct nfs_args args = {
>> % 	[...]
>> % 	    .addr = NULL,
>> % 	};
>> % 	int error, ret, has_nfs_args_opt;
>> % 	int has_addr_opt, has_fh_opt, has_hostname_opt;
>> % 	struct sockaddr *nam;
>>
>> addr is initialized with NULL. nam is used later as a pointer to the args.addr value.
>>
>> % 	if ((mp->mnt_flag & (MNT_ROOTFS | MNT_UPDATE)) == MNT_ROOTFS) {
>> % 		error = nfs_mountroot(mp);
>> % 		goto out;
>> % 	}
>>
>> We do not try to mount root, so this is not ours.
>>
>> % 	if (vfs_getopt(mp->mnt_optnew, "nfs_args", NULL, NULL) == 0) {
>> % 	[...]
>> % 		has_nfs_args_opt = 1;
>> % 	}
>>
>> We do not use the old mount(2) interface, so this is not ours.
>>
>> % 	if (vfs_getopt(mp->mnt_optnew, "nfsv3", NULL, NULL) == 0)
>> % 		args.flags |= NFSMNT_NFSV3;
>>
>> mount(8) doesn't pass the nfsv3 option, so NFSMNT_NFSV3 isn't set.
>>
>> % 	if (vfs_getopt(mp->mnt_optnew, "addr", (void **)&args.addr,
>> % 	    &args.addrlen) == 0) {
>> % 		has_addr_opt = 1;
>> % 		if (args.addrlen > SOCK_MAXADDRLEN) {
>> % 			error = ENAMETOOLONG;
>> % 			goto out;
>> % 		}
>> % 		nam = malloc(args.addrlen, M_SONAME,
>> % 		    M_WAITOK);
>> % 		bcopy(args.addr, nam, args.addrlen);
>> % 		nam->sa_len = args.addrlen;
>> % 	}
>>
>> mount(8) doesn't pass the addr option, so args.addr isn't set; hence struct sockaddr *nam is also NULL, and has_addr_opt is 0.
>>
>> % 	if (vfs_getopt(mp->mnt_optnew, "hostname", (void **)&args.hostname,
>> % 	    NULL) == 0) {
>> % 		has_hostname_opt = 1;
>> % 	}
>> % 	if (args.hostname == NULL) {
>> % 		vfs_mount_error(mp, "Invalid hostname");
>> % 		error = EINVAL;
>> % 		goto out;
>> % 	}
>>
>> I don't know why I got the error here. I didn't analyze it deeply, though.
>> "mount: dev2.mail:/home/svn/freebsd/head Invalid hostname: Invalid argument"
>
> You'll get this if there is no hostname="xxx" argument specified, which I believe is correct.

Yes, that's true. mount(8) doesn't specify a "hostname" option itself.
-- wbr, pluknet From owner-freebsd-fs@FreeBSD.ORG Wed May 18 20:37:40 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CEF02106566B for ; Wed, 18 May 2011 20:37:40 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 911C98FC14 for ; Wed, 18 May 2011 20:37:40 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApwEAM8s1E2DaFvO/2dsb2JhbACEWaI6iHCtB5B9gSuBbIF7gQcEkBGHK4dm X-IronPort-AV: E=Sophos;i="4.65,233,1304308800"; d="scan'208";a="121148675" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-annu-pri.mail.uoguelph.ca with ESMTP; 18 May 2011 16:37:39 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 79361B3F53; Wed, 18 May 2011 16:37:39 -0400 (EDT) Date: Wed, 18 May 2011 16:37:39 -0400 (EDT) From: Rick Macklem To: FreeBSD FS Message-ID: <5718691.545130.1305751059426.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <20110517092011.GK48734@deviant.kiev.zoral.com.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: Subject: Re: RFC: adding a lock flags argument to VFS_FHTOVP() for FreeBSD9 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 18 May 2011 20:37:40 -0000 > Yes, the flag to specify the locking mode does only specify the > minimal > locking requirements, and filesystem is allowed to upgrade it to the > more strict lock type. E.g. 
UFS would only return a shared lock if the > vnode was found in the hash, AFAIR. If not told otherwise, getnewvnode(9) > forces lockmgr to convert all lock requests into exclusive.

That's exactly what UFS does, but I did notice some inconsistencies w.r.t. the various file systems.

For VFS_VGET(), ffs/cd9660/udf do basically the following:

1	error = vfs_hash_get(mp, ino, flags, curthread, vpp, NULL, NULL);
	...
2	if ((flags & LK_TYPE_MASK) == LK_SHARED) {
		flags &= ~LK_TYPE_MASK;
		flags |= LK_EXCLUSIVE;
	}
	...
3	lockmgr(vp->v_vnlock, LK_EXCLUSIVE, NULL);
	...
4	error = vfs_hash_insert(vp, ino, flags, curthread, vpp, NULL, NULL);

but hpfs/ext2fs do something similar to the above, except they omit step #2. (i.e., they would do #4 with LK_SHARED, if that was what was passed in as flags.)

Looking at vfs_hash_insert(), the "flags" argument is just used for vget(), so it isn't obvious to me whether it needs to be LK_EXCLUSIVE or not.

So, does anyone know if this depends on the file system, or are hpfs/ext2fs broken?

Thanks in advance for any help with this, rick
ps: Fortunately, for my patch, I can just ignore the "flags" argument for VFS_FHTOVP() for the file systems I'm not sure about, so they'll just return LK_EXCLUSIVE locked vnodes.
From owner-freebsd-fs@FreeBSD.ORG Wed May 18 23:24:30 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 93F00106566B for ; Wed, 18 May 2011 23:24:30 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id E5F228FC14 for ; Wed, 18 May 2011 23:24:29 +0000 (UTC) Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id p4INOQ19046011 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 19 May 2011 02:24:26 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4) with ESMTP id p4INOPhW004185; Thu, 19 May 2011 02:24:25 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4/Submit) id p4INOPUw004184; Thu, 19 May 2011 02:24:25 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 19 May 2011 02:24:25 +0300 From: Kostik Belousov To: Rick Macklem Message-ID: <20110518232425.GX48734@deviant.kiev.zoral.com.ua> References: <20110517092011.GK48734@deviant.kiev.zoral.com.ua> <5718691.545130.1305751059426.JavaMail.root@erie.cs.uoguelph.ca> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="P33LUqzLXAslwFyJ" Content-Disposition: inline In-Reply-To: <5718691.545130.1305751059426.JavaMail.root@erie.cs.uoguelph.ca> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-3.4 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00, 
DNS_FROM_OPENWHOIS autolearn=no version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: FreeBSD FS Subject: Re: RFC: adding a lock flags argument to VFS_FHTOVP() for FreeBSD9 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 18 May 2011 23:24:30 -0000 --P33LUqzLXAslwFyJ Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Wed, May 18, 2011 at 04:37:39PM -0400, Rick Macklem wrote:
> > Yes, the flag to specify the locking mode does only specify the minimal locking requirements, and filesystem is allowed to upgrade it to the more strict lock type. E.g. UFS would only return shared lock if the vnode was found in hash, AFAIR. If not told otherwise, getnewvnode(9) forces lockmgr to convert all lock requests into exclusive.
>
> That's exactly what UFS does, but I did notice some inconsistencies w.r.t. the various file systems.
>
> For VFS_VGET(), ffs/cd9660/udf do basically the following:
> 1	error = vfs_hash_get(mp, ino, flags, curthread, vpp, NULL, NULL);
> 	...
> 2	if ((flags & LK_TYPE_MASK) == LK_SHARED) {
> 		flags &= ~LK_TYPE_MASK;
> 		flags |= LK_EXCLUSIVE;
> 	}
> 	...
> 3	lockmgr(vp->v_vnlock, LK_EXCLUSIVE, NULL);
> 	...
> 4	error = vfs_hash_insert(vp, ino, flags, curthread, vpp, NULL, NULL);
>
> but hpfs/ext2fs do something similar to the above, except they omit step #2. (ie. They would do #4 with LK_SHARED, if that was what flags is passed in as.)
>
> Looking at vfs_hash_insert(), the "flags" argument is just used for vget(), so it isn't obvious to me if it needs to be LK_EXCLUSIVE or not.

I would say that what ext2fs and hpfs are trying to do is legitimate, since the caller expects to get only the lock specified in the flags.
But, in fact, all locks for ext2fs and hpfs are exclusive since, as I said in the previous message, getnewvnode() initializes the vnode lock for automatic shared->exclusive conversion, and ext2fs/hpfs do not override this.

> So, does anyone know if this depends on the file system, or are
> hpfs/ext2fs broken?
>
> Thanks in advance for any help with this, rick
> ps: Fortunately, for my patch, I can just ignore the "flags"
>     argument for VFS_FHTOVP() for the file systems I'm not
>     sure about, so they'll just return LK_EXCLUSIVE locked
>     vnodes.

--P33LUqzLXAslwFyJ Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (FreeBSD) iEYEARECAAYFAk3UVSkACgkQC3+MBN1Mb4j9PgCgpdZeYsOjTmCr7j9Bj87nTtKl /aAAoOdggCkJAm/feMdoMhOwIfifOefi =yn0L -----END PGP SIGNATURE----- --P33LUqzLXAslwFyJ-- From owner-freebsd-fs@FreeBSD.ORG Thu May 19 01:09:52 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F20F8106566B for ; Thu, 19 May 2011 01:09:52 +0000 (UTC) (envelope-from zack.kirsch@isilon.com) Received: from seaxch10.isilon.com (seaxch10.isilon.com [74.85.160.26]) by mx1.freebsd.org (Postfix) with ESMTP id D67D08FC12 for ; Thu, 19 May 2011 01:09:52 +0000 (UTC) X-MimeOLE: Produced By Microsoft Exchange V6.5 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Date: Wed, 18 May 2011 18:09:50 -0700 Message-ID: <476FC2247D6C7843A4814ED64344560C03EC9A5E@seaxch10.desktop.isilon.com> In-Reply-To: <256284561.428250.1305590315172.JavaMail.root@erie.cs.uoguelph.ca> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: adding a lock flags argument to VFS_FHTOVP() for FreeBSD9 Thread-Index: AcwUJVR+y7oTlZiSQ+2b21kpzuDndwBm+p/g References: <256284561.428250.1305590315172.JavaMail.root@erie.cs.uoguelph.ca> From: "Zack Kirsch"
To: "Rick Macklem" , "FreeBSD FS" Cc: Subject: RE: adding a lock flags argument to VFS_FHTOVP() for FreeBSD9 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 01:09:53 -0000 QnR3LCB3ZSd2ZSBpbXBsZW1lbnRlZCBleGFjdGx5IHRoaXMgYXQgSXNpbG9uIGFuZCBkbyB0YWtl IFNIQVJFRCBsb2NrcyBpbnN0ZWFkIG9mIEVYQ0xVU0lWRSBmb3IgbWFueSBvcGVyYXRpb25zLiBJ J20gZGVmaW5pdGVseSBpbiBzdXBwb3J0IG9mIHRoZSBpZGVhLg0KIA0KWmFjaw0KDQotLS0tLU9y aWdpbmFsIE1lc3NhZ2UtLS0tLQ0KRnJvbTogb3duZXItZnJlZWJzZC1mc0BmcmVlYnNkLm9yZyBb bWFpbHRvOm93bmVyLWZyZWVic2QtZnNAZnJlZWJzZC5vcmddIE9uIEJlaGFsZiBPZiBSaWNrIE1h Y2tsZW0NClNlbnQ6IE1vbmRheSwgTWF5IDE2LCAyMDExIDQ6NTkgUE0NClRvOiBGcmVlQlNEIEZT DQpTdWJqZWN0OiBSRkM6IGFkZGluZyBhIGxvY2sgZmxhZ3MgYXJndW1lbnQgdG8gVkZTX0ZIVE9W UCgpIGZvciBGcmVlQlNEOQ0KDQpIaSwNCg0KRG93biB0aGUgcm9hZCwgSSB3b3VsZCBsaWtlIHRo ZSBORlMgc2VydmVyIHRvIGJlIGFibGUgdG8gZG8gYQ0KICBWRlNfRkhUT1ZQKG1wLCAmZmhwLT5m aF9maWQsIExLX1NIQVJFRCwgdnBwKTsNCnNpbWlsYXIgdG8gd2hhdCBpcyBhbHJlYWR5IHN1cHBv cnRlZCBmb3IgVkZTX1ZHRVQoKS4gVGhlIHJlYXNvbg0KaXMgdGhhdCwgY3VycmVudGx5LCB3aGVu IGEgY2xpZW50IGRvZXMgcmVhZC1haGVhZHMsIHRoZXNlIHJlYWRzIGFyZQ0KYmFzaWNhbGx5IHNl cmlhbGl6ZWQgYmVjYXVzZSB0aGUgVkZTX0ZIVE9WUCgpIGdldHMgYW4gTEtfRVhDTFVTSVZFDQps b2NrZWQgdm5vZGUgZm9yIGVhY2ggUlBDIG9uIHRoZSBzZXJ2ZXIuDQoNCkxpa2UgVkZTX1ZHRVQo KSwgdGhlIHVuZGVybHlpbmcgZmlsZSBzeXN0ZW0gY2FuIHN0aWxsIGNob29zZSB0bw0KcmV0dXJu IGEgTEtfRVhDTFVTSVZFIGxvY2tlZCB2bm9kZSBldmVuIHdoZW4gTEtfU0hBUkVEIGlzIHNwZWNp ZmllZC4NCihTb21lIGZpbGUgc3lzdGVtcywgc3VjaCBhcyBGRlMsIGp1c3QgY2FsbCBWRlNfVkdF VCgpIGluIFZGU19GSFRPVlAoKSwNCiBzbyBhbGwgdGhhdCBoYXBwZW5zIGlzIHRoYXQgdGhlIGZs YWcgaXMgcGFzc2VkIHRocm91Z2ggdG8gVkZTX1ZHRVQoKQ0KIGZvciB0aG9zZSBvbmVzLikNCg0K VG8gbWluaW1pemUgdGhlIHJpc2sgb2YgdGhlIHBhdGNoIGJyZWFraW5nIHNvbWV0aGluZywgSSBo YXZlIGl0IHNldHRpbmcNCkxLX0VYQ0xVU0lWRSBmb3IgYWxsIFZGU19GSFRPVlAoKSBjYWxscyBz 
byB0aGF0IHRoZSBzZW1hbnRpY3MgZG9uJ3QNCmFjdHVhbGx5IGNoYW5nZS4gKENoYW5naW5nIHRo ZSBORlMgc2VydmVyIHRvIHVzZSBMS19TSEFSRUQgaXMgYSB0cml2aWFsDQpwYXRjaCwgYnV0IHdp bGwgbmVlZCBleHRlbnNpdmUgdGVzdGluZywgc28gSSdtIG5vdCBwbGFubmluZyBvbiB0aGF0DQpj aGFuZ2UgZm9yIDkuMC4pDQoNCklmIHlvdSBhcmUgaW50ZXJlc3RlZCwgbXkgY3VycmVudCBwYXRj aCBpcyBhdDoNCiAgaHR0cDovL3Blb3BsZS5mcmVlYnNkLm9yZy9+cm1hY2tsZW0vZmh0b3ZwLnBh dGNoDQoNClNvLCBkb2VzIHRoaXMgc291bmQgbGlrZSBhIHJlYXNvbmFibGUgdGhpbmcgdG8gY29t bWl0LCBvbmNlIHRoZSBwYXRjaA0KaXMgcmV2aWV3ZWQ/DQoNCnJpY2sNCl9fX19fX19fX19fX19f X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fDQpmcmVlYnNkLWZzQGZyZWVic2Qub3Jn IG1haWxpbmcgbGlzdA0KaHR0cDovL2xpc3RzLmZyZWVic2Qub3JnL21haWxtYW4vbGlzdGluZm8v ZnJlZWJzZC1mcw0KVG8gdW5zdWJzY3JpYmUsIHNlbmQgYW55IG1haWwgdG8gImZyZWVic2QtZnMt dW5zdWJzY3JpYmVAZnJlZWJzZC5vcmciDQo= From owner-freebsd-fs@FreeBSD.ORG Thu May 19 08:49:50 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C6559106566C for ; Thu, 19 May 2011 08:49:50 +0000 (UTC) (envelope-from grarpamp@gmail.com) Received: from mail-pw0-f54.google.com (mail-pw0-f54.google.com [209.85.160.54]) by mx1.freebsd.org (Postfix) with ESMTP id 999028FC14 for ; Thu, 19 May 2011 08:49:50 +0000 (UTC) Received: by pwj8 with SMTP id 8so1467229pwj.13 for ; Thu, 19 May 2011 01:49:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:date:message-id:subject:from:to:cc :content-type; bh=W90XvgAHYqw9Plm0gVBUGFKVuA5et0IbX7ACZcYRPbo=; b=QbH3uZsx5swpINGN4UCeslcNQ52gyogkIzbhedHSKmkY69wrYDUtZ7Q/eOAaJHiVCe C0zGp87VhdLXaR7qY1G4BQ+ou8q69wNtY+fNCyDjiw+RVhHR5fNgwtmLe8t2CDnMzyd9 Zb5v7fQ3qgJOEQNMYzdsBIJEodmysNbburxdI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:cc:content-type; b=HWQz9izMPdql1M+ptryHLsY7JjL81FQFFkjitAEuGdiadLYinoXvuH5TBATgq1Dvt9 
MHvSrxowvtGVq0cndWMtFs/cBrifc2uJyhrQNQy1fEdoG0qZ7yoh8qRjaXV9+aDLtN+d C9C1eBGK9GJETscnLXXJscToMIwMz3zRBY/wY= MIME-Version: 1.0 Received: by 10.142.121.41 with SMTP id t41mr1641681wfc.358.1305779762948; Wed, 18 May 2011 21:36:02 -0700 (PDT) Received: by 10.142.157.2 with HTTP; Wed, 18 May 2011 21:36:02 -0700 (PDT) Date: Thu, 19 May 2011 00:36:02 -0400 Message-ID: From: grarpamp To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=UTF-8 Cc: freebsd-questions@freebsd.org Subject: UDF and DVD's X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 08:49:50 -0000

Greetings... :)

The first filesystem DVD... other than a movie DVD (DVD-VIDEO?), and the FreeBSD make release DVD's (iso9660)... that I've ever tried to mount, well... don't. It is:
Windows 7 Ultimate with Service Pack 1 (x64) - DVD (English) 5/12/2011
You can find the SHA-1 hash here: http://msdn.microsoft.com/en-us/subscriptions/downloads/default.aspx and a sample image, if needed for reference purposes, via any search engine.

Anyways, after a little research, does FreeBSD not, in fact, support this UDF version? (I don't yet know how to supply the version of this image for you?)

Can the FreeBSD team implement it? Perhaps by porting from NetBSD 5.1's seemingly near-complete implementation?
http://en.wikipedia.org/wiki/Universal_Disk_Format
http://www.osta.org/specs/index.htm
As perhaps even a GSOC or Foundation project? Because reading retail optical filesystem formats would seem to be a rather expected capability?

I'm guessing the current state within FreeBSD means that I can neither read, create, nor write readable (compatible) images at this, or any given, UDF level?

As I've no other DVD's to test with... what UDF versions are most DVD data ROM's published in?

Is this a blocker for FreeBSD?
For me, at least, minimally, that seems to be the case... as I now have no way to rip, mount and add the files to this DVD that I would like to add. Except to use Windows, which I consider to be unreliable at best. Thoughts? Thanks :) From owner-freebsd-fs@FreeBSD.ORG Thu May 19 09:14:36 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 15D2E10656AC for ; Thu, 19 May 2011 09:14:36 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta10.emeryville.ca.mail.comcast.net (qmta10.emeryville.ca.mail.comcast.net [76.96.30.17]) by mx1.freebsd.org (Postfix) with ESMTP id F09508FC13 for ; Thu, 19 May 2011 09:14:35 +0000 (UTC) Received: from omta23.emeryville.ca.mail.comcast.net ([76.96.30.90]) by qmta10.emeryville.ca.mail.comcast.net with comcast id lMBN1g0011wfjNsAAMEaAB; Thu, 19 May 2011 09:14:34 +0000 Received: from koitsu.dyndns.org ([67.180.84.87]) by omta23.emeryville.ca.mail.comcast.net with comcast id lMEZ1g00V1t3BNj8jMEagm; Thu, 19 May 2011 09:14:34 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 6FCEA102C19; Thu, 19 May 2011 02:14:33 -0700 (PDT) Date: Thu, 19 May 2011 02:14:33 -0700 From: Jeremy Chadwick To: grarpamp Message-ID: <20110519091433.GA94053@icarus.home.lan> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org, freebsd-questions@freebsd.org Subject: Re: UDF and DVD's X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 09:14:36 -0000 On Thu, May 19, 2011 at 12:36:02AM -0400, grarpamp wrote: > Greetings... :) > > The first filesystem DVD... 
other than a movie DVD (DVD-VIDEO?), > and the FreeBSD make release DVD's (iso9660)... that I've ever tried > to mount, well... don't. It is: > Windows 7 Ultimate with Service Pack 1 (x64) - DVD (English) 5/12/2011 > You can find the SHA-1 hash here: > http://msdn.microsoft.com/en-us/subscriptions/downloads/default.aspx > and a sample image, if needed for reference purposes, via any search > engine. > > Anyways, after a little reasearch, does FreeBSD not, in fact, support > this UDF version? (I don't yet know how to supply the version of > this image for you?) > > Can the FreeBSD team implement it? Perhaps by porting from NetBSD > 5.1's seemingly near complete implementation? > http://en.wikipedia.org/wiki/Universal_Disk_Format > http://www.osta.org/specs/index.htm > As perhaps even a GSOC or Foundation project? Because reading retail > optical filesystem formats would seem to be a rather expected > capability? > > I'm guessing the current state within FreeBSD means that I can > neither read, nor create, or write, readable (compatible) images > at this, or any given, UDF level? > > As I've no other DVD's to test with... what UDF versions are most > DVD data ROM's published in? > > Is this a blocker for FreeBSD? > > For me, at least, minimally, that seems to be the case... as I now > have no way to rip, mount and add the files to this DVD that I would > like to add. Except to use Windows, which I consider to be unreliable > at best. > > Thoughts? Thanks :) Thoughts: please provide commands, full output, etc. that show how you're trying to mount the disc, as well as relevant /dev entries pertaining to your DVD drive. dmesg might also be helpful. And I assume you have looked at mount_udf(8)? -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. 
PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Thu May 19 09:53:33 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: by hub.freebsd.org (Postfix, from userid 1233) id 0895C106566B; Thu, 19 May 2011 09:53:33 +0000 (UTC) Date: Thu, 19 May 2011 09:53:33 +0000 From: Alexander Best To: Jeremy Chadwick Message-ID: <20110519095333.GA43066@freebsd.org> References: <20110519091433.GA94053@icarus.home.lan> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20110519091433.GA94053@icarus.home.lan> Cc: freebsd-fs@freebsd.org, grarpamp , freebsd-questions@freebsd.org Subject: Re: UDF and DVD's X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 09:53:33 -0000 On Thu May 19 11, Jeremy Chadwick wrote: > On Thu, May 19, 2011 at 12:36:02AM -0400, grarpamp wrote: > > Greetings... :) > > > > The first filesystem DVD... other than a movie DVD (DVD-VIDEO?), > > and the FreeBSD make release DVD's (iso9660)... that I've ever tried > > to mount, well... don't. It is: > > Windows 7 Ultimate with Service Pack 1 (x64) - DVD (English) 5/12/2011 > > You can find the SHA-1 hash here: > > http://msdn.microsoft.com/en-us/subscriptions/downloads/default.aspx > > and a sample image, if needed for reference purposes, via any search > > engine. > > > > Anyways, after a little reasearch, does FreeBSD not, in fact, support > > this UDF version? (I don't yet know how to supply the version of > > this image for you?) > > > > Can the FreeBSD team implement it? Perhaps by porting from NetBSD > > 5.1's seemingly near complete implementation? > > http://en.wikipedia.org/wiki/Universal_Disk_Format > > http://www.osta.org/specs/index.htm > > As perhaps even a GSOC or Foundation project? 
Because reading retail > > optical filesystem formats would seem to be a rather expected > > capability? > > > > I'm guessing the current state within FreeBSD means that I can > > neither read, nor create, or write, readable (compatible) images > > at this, or any given, UDF level? > > > > As I've no other DVD's to test with... what UDF versions are most > > DVD data ROM's published in? > > > > Is this a blocker for FreeBSD? > > > > For me, at least, minimally, that seems to be the case... as I now > > have no way to rip, mount and add the files to this DVD that I would > > like to add. Except to use Windows, which I consider to be unreliable > > at best. > > > > Thoughts? Thanks :) freebsd as of now has two problems: 1) it only supports UDF 1.x and *not* UDF 2.x. 2) it does not properly support iso9660 with files > 4gb via multiple extents. whenever you mount such a dvd, you see each 4gb file twice. cheers. alex ps: for 2) see http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/95222 > > Thoughts: please provide commands, full output, etc. that show how > you're trying to mount the disc, as well as relevant /dev entries > pertaining to your DVD drive. dmesg might also be helpful. And I > assume you have looked at mount_udf(8)? > > -- > | Jeremy Chadwick jdc@parodius.com | > | Parodius Networking http://www.parodius.com/ | > | UNIX Systems Administrator Mountain View, CA, USA | > | Making life hard for others since 1977. 
PGP 4BD6C0CB | > -- a13x From owner-freebsd-fs@FreeBSD.ORG Thu May 19 14:55:48 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 44E76106564A; Thu, 19 May 2011 14:55:48 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 1C2868FC0A; Thu, 19 May 2011 14:55:48 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p4JEtlVB074177; Thu, 19 May 2011 14:55:47 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p4JEtlvA074173; Thu, 19 May 2011 14:55:47 GMT (envelope-from linimon) Date: Thu, 19 May 2011 14:55:47 GMT Message-Id: <201105191455.p4JEtlvA074173@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/157179: [zfs] zfs/dbuf.c: panic: solaris assert: arc_buf_remove_ref(db->db_buf, db) == 0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 14:55:48 -0000 Old Synopsis: zfs/dbuf.c: panic: solaris assert: arc_buf_remove_ref(db->db_buf, db) == 0 New Synopsis: [zfs] zfs/dbuf.c: panic: solaris assert: arc_buf_remove_ref(db->db_buf, db) == 0 Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Thu May 19 14:55:35 UTC 2011 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=157179 From owner-freebsd-fs@FreeBSD.ORG Thu May 19 14:55:55 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4EE2C106564A; Thu, 19 May 2011 14:55:55 +0000 (UTC) (envelope-from jwd@SlowBlink.Com) Received: from nmail.slowblink.com (rrcs-24-199-145-34.midsouth.biz.rr.com [24.199.145.34]) by mx1.freebsd.org (Postfix) with ESMTP id 10FC08FC14; Thu, 19 May 2011 14:55:54 +0000 (UTC) Received: from nmail.slowblink.com (localhost [127.0.0.1]) by nmail.slowblink.com (8.14.3/8.14.3) with ESMTP id p4JEdZEQ083204; Thu, 19 May 2011 10:39:35 -0400 (EDT) (envelope-from jwd@nmail.slowblink.com) Received: (from jwd@localhost) by nmail.slowblink.com (8.14.3/8.14.3/Submit) id p4JEdZmd083203; Thu, 19 May 2011 10:39:35 -0400 (EDT) (envelope-from jwd) Date: Thu, 19 May 2011 10:39:35 -0400 From: John D To: freebsd-fs@freebsd.org, freebsd-geom@freebsd.org Message-ID: <20110519143935.GA83122@slowblink.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.3i Cc: Subject: LSI 9200-8e/gmultipath/ZFS cable pull kernel crash X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 14:55:55 -0000 Hi Folks, Looking for a bit of help to debug a sas interconnect, gmultipath, and zfs filesystem crash when the 2nd cable is pulled. Apologies for the cross post to geom & fs, hoped I would catch the right folks. In general, I have two systems each with an LSI 9200-8e sas hba installed. Each adapter has two cables going to shelf 1, then to shelf 2, then to the second system. 
System1 <---> Shelf1 <---> Shelf2 <---> System2
System1 <---> Shelf1 <---> Shelf2 <---> System2

Typical stuff: system1 & 2 are carp'd together; if one system goes down, the second zfs imports the pools and takes over. If I do a test pull of one of the cables, multipath removes the failed providers correctly with no interruption to the filesystem. Reinstalling the cable causes the providers to be re-integrated. This cable can be pulled/reinstalled multiple times with no problem. However, pulling the 2nd cable causes a kernel crash. I have the configuration/logs/screen shots here: http://people.freebsd.org/~jwd/lsi_gmultipath_zfs.html I can replicate the problem on demand. Any help debugging this problem is appreciated. Thanks! john From owner-freebsd-fs@FreeBSD.ORG Thu May 19 16:56:56 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 03139106564A for ; Thu, 19 May 2011 16:56:56 +0000 (UTC) (envelope-from piotr.kucharski@42.pl) Received: from mail-ew0-f54.google.com (mail-ew0-f54.google.com [209.85.215.54]) by mx1.freebsd.org (Postfix) with ESMTP id 8BBBB8FC14 for ; Thu, 19 May 2011 16:56:55 +0000 (UTC) Received: by ewy1 with SMTP id 1so1284501ewy.13 for ; Thu, 19 May 2011 09:56:54 -0700 (PDT) Received: by 10.204.73.206 with SMTP id r14mr1284954bkj.181.1305822609153; Thu, 19 May 2011 09:30:09 -0700 (PDT) MIME-Version: 1.0 Received: by 10.204.38.137 with HTTP; Thu, 19 May 2011 09:29:29 -0700 (PDT) X-Originating-IP: [224.9.88.219] In-Reply-To: References: From: Piotr Kucharski Date: Thu, 19 May 2011 18:29:29 +0200 Message-ID: To: Adam Vande More Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: very slow zfs scrub X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
X-List-Received-Date: Thu, 19 May 2011 16:56:56 -0000

On Thu, Feb 24, 2011 at 21:00, Adam Vande More wrote:
>> Wow! What does scrub do that it slows ggate drive almost to halt?
>>
>> What can I do to fix it?
>
> I think network latency is going to have huge impact on performance here.
> Have you tried any ggate or nic tuning? Would HAST be an option for you?
> I think it has more performance thought put into it.

Well, the network seems rather idle; host and client share the same 1Gb LAN (not sure if the same switch, though) with <0.2ms rtt for 1k packets in ping. When not scrubbing, sequential reads are satisfactory. I'm inclined to think it is some read or write pattern of the scrub that is causing ggate to suck immensely. :/

From owner-freebsd-fs@FreeBSD.ORG Thu May 19 17:08:08 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BFDC61065673 for ; Thu, 19 May 2011 17:08:08 +0000 (UTC) (envelope-from pawel@dawidek.net) Received: from mail.garage.freebsd.pl (60.wheelsystems.com [83.12.187.60]) by mx1.freebsd.org (Postfix) with ESMTP id 645158FC1B for ; Thu, 19 May 2011 17:08:07 +0000 (UTC) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 93FB145C9F; Thu, 19 May 2011 19:08:06 +0200 (CEST) Received: from localhost (89-73-195-149.dynamic.chello.pl [89.73.195.149]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 5363145683; Thu, 19 May 2011 19:08:01 +0200 (CEST) Date: Thu, 19 May 2011 19:07:40 +0200 From: Pawel Jakub Dawidek To: Per von Zweigbergk Message-ID: <20110519170740.GA2100@garage.freebsd.pl> References: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> <4DD37C69.5020005@digsys.bg> <4DD3855E.8020802@itassistans.se> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature";
boundary="yrj/dFKFPuw6o+aM" Content-Disposition: inline In-Reply-To: <4DD3855E.8020802@itassistans.se> X-OS: FreeBSD 9.0-CURRENT amd64 User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-0.6 required=4.5 tests=BAYES_00,RCVD_IN_SORBS_DUL autolearn=no version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: HAST + ZFS self healing? Hot spares? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 17:08:08 -0000 --yrj/dFKFPuw6o+aM Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, May 18, 2011 at 10:37:50AM +0200, Per von Zweigbergk wrote: [...] > This would mean that you'd be running a stack looking like: > - ZFS on top of: > - One HAST resource on top of: > - Two ZVOLs, each on top of: > - ZFS on top of: > - Local storage (mirrored by zfs) Having recursive ZFS pools is bad idea and most likely it was cause deadocks. You also pay all the costs with checksumming, ARC cache, etc. twice. Very bad idea. > >Some reported they used HAST for the SLOG as well. I do not know > >if using HAST for the L2ARC makes any sense. On failure you will > >import the pool on the slave node and this will wipe the L2ARC > >anyway. > Yes, running HAST on L2ARC doesn't make much sense, I'd have to run > HAST on the ZIL though if I opted for Variant 1 (which I don't think > I will). Using HAST for L2ARC devices might make no sense, but they are part of the pool. So if you import the pool on another machine L2ARC device will be failed. You may experiment with importing the pool, removing current L2ARC devices and attaching machine-local L2ARC devices. This way you avoid HAST for L2ARC, but not sure how reliable can that be. 
-- 
Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://yomoli.com --yrj/dFKFPuw6o+aM Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (FreeBSD) iEYEARECAAYFAk3VTlwACgkQForvXbEpPzTwtQCg83O//7AdOSAZDbscZT+WTliT YK0An0DKUe1/1hqtY2ZyjqqzJ5kO6ftD =bJn8 -----END PGP SIGNATURE----- --yrj/dFKFPuw6o+aM-- From owner-freebsd-fs@FreeBSD.ORG Thu May 19 17:16:05 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CA3AB1065673 for ; Thu, 19 May 2011 17:16:05 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id 5C98E8FC1A for ; Thu, 19 May 2011 17:16:05 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:c0e1:7989:b1b9:78c3]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id BC3214AC1C for ; Thu, 19 May 2011 21:16:03 +0400 (MSD) Date: Thu, 19 May 2011 21:15:59 +0400 From: Lev Serebryakov Organization: FreeBSD X-Priority: 3 (Normal) Message-ID: <1409064431.20110519211559@serebryakov.spb.ru> To: freebsd-fs@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=windows-1251 Content-Transfer-Encoding: quoted-printable Subject: Snapshots fail on large FFS2 volumes regularly -- how to backup /usr/home?! X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 17:16:05 -0000

Hello, Freebsd-fs.

I have a /usr/home partition on my new server which is 400GiB (only 17GiB is used). It is UFS2, SoftUpdates are enabled.
I want to back it up on the live system, but 4 times out of 5 I got (after 10-12 minutes of waiting! Oh my, 10 minutes to create a snapshot!):

mksnap_ffs: Cannot create snapshot /usr/home/.snap/dump_snapshot: Resource temporarily unavailable
dump: Cannot create /usr/home/.snap/dump_snapshot: No such file or directory

It is FreeBSD 8.2-STABLE/amd64, 8GiB of memory. I've never encountered such a problem on the previous server, which has about 80GiB (with 20GiB used).

-- 
// Black Lion AKA Lev Serebryakov

From owner-freebsd-fs@FreeBSD.ORG Thu May 19 17:18:02 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 44F5D106564A for ; Thu, 19 May 2011 17:18:02 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id 035728FC1B for ; Thu, 19 May 2011 17:18:02 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:c0e1:7989:b1b9:78c3]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id 4306F4AC1C for ; Thu, 19 May 2011 21:17:59 +0400 (MSD) Date: Thu, 19 May 2011 21:17:55 +0400 From: Lev Serebryakov Organization: FreeBSD X-Priority: 3 (Normal) Message-ID: <1606289061.20110519211755@serebryakov.spb.ru> To: freebsd-fs@freebsd.org In-Reply-To: <1409064431.20110519211559@serebryakov.spb.ru> References: <1409064431.20110519211559@serebryakov.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=windows-1251 Content-Transfer-Encoding: quoted-printable Subject: Re: Snapshots fail on large FFS2 volumes regularly -- how to backup /usr/home?!
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 17:18:02 -0000

Hello, Freebsd-fs.

You wrote on 19 May 2011, 21:15:59:
> I have /usr/home partition on my new server which is 400GiB (only
> 17GiB is used). It is UFS2, SoftUpdates are enabled.
> I want to backup it on live system, but 4 times out of 5 I got
> (after 10-12 minutes of wait! Oh my, 10 minutes to create snapshot!):
And the server is almost unusable for these 10 minutes.

> I've never encounter such problem on previous server, which has
> about 80GiB (with 20GiB used).
It takes about 30 seconds on that FS...

-- 
// Black Lion AKA Lev Serebryakov

From owner-freebsd-fs@FreeBSD.ORG Thu May 19 18:15:06 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 432BB106564A for ; Thu, 19 May 2011 18:15:06 +0000 (UTC) (envelope-from pawel@dawidek.net) Received: from mail.garage.freebsd.pl (60.wheelsystems.com [83.12.187.60]) by mx1.freebsd.org (Postfix) with ESMTP id 87FF08FC19 for ; Thu, 19 May 2011 18:15:04 +0000 (UTC) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 30DF845CAC; Thu, 19 May 2011 20:15:03 +0200 (CEST) Received: from localhost (89-73-195-149.dynamic.chello.pl [89.73.195.149]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 06C9545CDC; Thu, 19 May 2011 20:14:56 +0200 (CEST) Date: Thu, 19 May 2011 20:14:36 +0200 From: Pawel Jakub Dawidek To: Per von Zweigbergk Message-ID: <20110519181436.GB2100@garage.freebsd.pl> References: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature";
boundary="oLBj+sq0vYjzfsbl" Content-Disposition: inline In-Reply-To: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> X-OS: FreeBSD 9.0-CURRENT amd64 User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-0.6 required=4.5 tests=BAYES_00,RCVD_IN_SORBS_DUL autolearn=no version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: HAST + ZFS self healing? Hot spares? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 18:15:06 -0000 --oLBj+sq0vYjzfsbl Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, May 18, 2011 at 08:13:13AM +0200, Per von Zweigbergk wrote: > I've been investigating HAST as a possibility in adding synchronous repli= cation and failover to a set of two NFS servers backed by NFS. The servers = themselves contain quite a few disks. 20 of them (7200 RPM SAS disks), to b= e exact. (If I didn't lose count again...) Plus two quick but small SSD's f= or ZIL and two not-as-quick but larger SSD's for L2ARC. [...] The configuration you should try first is to connect each disks pair using HAST and create ZFS pool on top of those HAST devices. Let's assume you have 4 data disks (da0-da3), 2 SSD disks for ZIL (da4-da5) and 2 SSD disks for L2ARC (da6-da7). 
Then you create the following HAST devices: /dev/hast/data0 =3D MachineA(da0) + MachineB(da0) /dev/hast/data1 =3D MachineA(da1) + MachineB(da1) /dev/hast/data2 =3D MachineA(da2) + MachineB(da2) /dev/hast/data3 =3D MachineA(da3) + MachineB(da3) /dev/hast/slog0 =3D MachineA(da4) + MachineB(da4) /dev/hast/slog1 =3D MachineA(da5) + MachineB(da5) /dev/hast/cache0 =3D MachineA(da6) + MachineB(da6) /dev/hast/cache1 =3D MachineA(da7) + MachineB(da7) And then you create ZFS pool of your choice. Here you specify redundancy, so if there is any you will have ZFS self-healing: zpool create tank raidz1 hast/data{0,1,2,3} log mirror hast/slog{0,1} cache= hast/cache{0,1} > 1. Hardware failure management. In case of a hardware failure, I'm not ex= actly sure what will happen, but I suspect the single-disk RAID-0 array con= taining the failed disk will simply fail. I assume it will still exist, but= refuse to be read or written. In this situation I understand HAST will han= dle this by routing all I/O to the secondary server, in case the disk on th= e primary side dies, or simply by cutting off replication if the disk on th= e secondary server fails. HAST sends all write requests to both nodes (if secondary is present) and read requests only to primary node. In some cases reads can be send to secondary node, for example when synchronization is in progress and secondary has more recent data or reading from local disk failed (either because of single EIO or entire disk went bad). In other words HAST itself can handle one of the mirrored disk failure. If entire hast/ dies for some reason (eg. secondary is down and local disk dies) then ZFS redundancy kicks in. --=20 Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! 
http://yomoli.com --oLBj+sq0vYjzfsbl Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (FreeBSD) iEYEARECAAYFAk3VXgsACgkQForvXbEpPzQU1QCfbfpiBAKH71tOMJMKfUSIwp7Y WjMAn2R6hjssqi1y5oImzrgc0KrzAovY =lZEY -----END PGP SIGNATURE----- --oLBj+sq0vYjzfsbl-- From owner-freebsd-fs@FreeBSD.ORG Thu May 19 22:31:00 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 930E11065674; Thu, 19 May 2011 22:31:00 +0000 (UTC) (envelope-from pvz@itassistans.se) Received: from zcs1.itassistans.net (zcs1.itassistans.net [212.112.191.37]) by mx1.freebsd.org (Postfix) with ESMTP id 410668FC1E; Thu, 19 May 2011 22:30:58 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by zcs1.itassistans.net (Postfix) with ESMTP id 64F5BC01CE; Fri, 20 May 2011 00:30:57 +0200 (CEST) X-Virus-Scanned: amavisd-new at zcs1.itassistans.net Received: from zcs1.itassistans.net ([127.0.0.1]) by localhost (zcs1.itassistans.net [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id cvlyTlR+jlFf; Fri, 20 May 2011 00:30:56 +0200 (CEST) Received: from [10.0.10.11] (unknown [212.112.191.49]) by zcs1.itassistans.net (Postfix) with ESMTPSA id DADEBC0181; Fri, 20 May 2011 00:30:56 +0200 (CEST) Message-ID: <4DD59A1D.7010406@itassistans.se> Date: Fri, 20 May 2011 00:30:53 +0200 From: Per von Zweigbergk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.17) Gecko/20110414 Lightning/1.0b2 Thunderbird/3.1.10 MIME-Version: 1.0 To: Pawel Jakub Dawidek References: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> <4DD37C69.5020005@digsys.bg> <4DD3855E.8020802@itassistans.se> <20110519170740.GA2100@garage.freebsd.pl> In-Reply-To: <20110519170740.GA2100@garage.freebsd.pl> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: HAST + ZFS self healing? Hot spares? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 22:31:00 -0000

On 2011-05-19 19:07, Pawel Jakub Dawidek wrote:
> Having recursive ZFS pools is a bad idea and most likely it was the cause
> of the deadlocks. You also pay all the costs with checksumming, ARC cache,
> etc. twice. Very bad idea.

I've considered this. Checksumming can be disabled in ZFS at the filesystem level (so I guess you could easily disable it for an entire pool). The ARC cannot be disabled at the filesystem or pool level though, as far as I can tell, only for the entire machine, which seems like a bad idea. Just the fact that there would be ARC duplication would be enough to make me seriously reconsider this.

>>> Some reported they used HAST for the SLOG as well. I do not know
>>> if using HAST for the L2ARC makes any sense. On failure you will
>>> import the pool on the slave node and this will wipe the L2ARC
>>> anyway.
>> Yes, running HAST on L2ARC doesn't make much sense, I'd have to run
>> HAST on the ZIL though if I opted for Variant 1 (which I don't think
>> I will).
> Using HAST for L2ARC devices might make no sense, but they are part of
> the pool. So if you import the pool on another machine, the L2ARC device
> will be failed. You may experiment with importing the pool, removing the
> current L2ARC devices and attaching machine-local L2ARC devices. This way
> you avoid HAST for L2ARC, but I'm not sure how reliable that can be.

The KISS way to solve this would be to simply add both of the local L2ARC devices. So no matter on which node you import the pool, you're going to get one L2ARC imported, and the other in a failed status because the node can't find it.
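The swap that Pawel suggests above (import the pool, drop the old node's cache devices, attach machine-local ones) could be scripted roughly as follows. This is a sketch, not from the thread: the pool name `tank` and the cache device names `da6`/`da7` are assumptions from the earlier example, and `RUN` defaults to `echo` so the sequence only prints what it would do (the real commands need root and an actual pool).

```shell
#!/bin/sh
# Sketch: swap in machine-local L2ARC devices after a failover import.
# RUN defaults to echo (dry run); set RUN= to actually execute.
RUN=${RUN:-echo}
POOL=tank

swap_cache() {
    $RUN zpool import -f "$POOL"          # import the pool on the new primary
    $RUN zpool remove "$POOL" da6 da7     # drop the old node's cache devices
    $RUN zpool add "$POOL" cache da6 da7  # attach this node's local SSDs
}

swap_cache
```

Run from an rc or failover script, this avoids putting the L2ARC behind HAST at all, at the cost Per notes next: the pool reports a degraded/faulted cache vdev until the swap runs.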
You'd have to live with the pool status being reported as degraded even though there is no real problem, which would make me inclined to simply script adding the local L2ARC device when the pool is imported (and removing the other cache devices), if that were the avenue I was pursuing.

From owner-freebsd-fs@FreeBSD.ORG Thu May 19 23:03:52 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AB224106566C; Thu, 19 May 2011 23:03:52 +0000 (UTC) (envelope-from pvz@itassistans.se) Received: from zcs1.itassistans.net (zcs1.itassistans.net [212.112.191.37]) by mx1.freebsd.org (Postfix) with ESMTP id 3B1E78FC14; Thu, 19 May 2011 23:03:52 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by zcs1.itassistans.net (Postfix) with ESMTP id 11C19C01CE; Fri, 20 May 2011 01:03:51 +0200 (CEST) X-Virus-Scanned: amavisd-new at zcs1.itassistans.net Received: from zcs1.itassistans.net ([127.0.0.1]) by localhost (zcs1.itassistans.net [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 2QojWAKE3ypu; Fri, 20 May 2011 01:03:47 +0200 (CEST) Received: from [10.0.10.11] (unknown [212.112.191.49]) by zcs1.itassistans.net (Postfix) with ESMTPSA id 07252C0181; Fri, 20 May 2011 01:03:47 +0200 (CEST) Message-ID: <4DD5A1CF.70807@itassistans.se> Date: Fri, 20 May 2011 01:03:43 +0200 From: Per von Zweigbergk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.17) Gecko/20110414 Lightning/1.0b2 Thunderbird/3.1.10 MIME-Version: 1.0 To: Pawel Jakub Dawidek References: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> <20110519181436.GB2100@garage.freebsd.pl> In-Reply-To: <20110519181436.GB2100@garage.freebsd.pl> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: HAST + ZFS self healing? Hot spares?
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 23:03:52 -0000 On 2011-05-19 20:14, Pawel Jakub Dawidek wrote: > On Wed, May 18, 2011 at 08:13:13AM +0200, Per von Zweigbergk wrote: >> I've been investigating HAST as a possibility in adding synchronous replication and failover to a set of two NFS servers backed by NFS. The servers themselves contain quite a few disks. 20 of them (7200 RPM SAS disks), to be exact. (If I didn't lose count again...) Plus two quick but small SSD's for ZIL and two not-as-quick but larger SSD's for L2ARC. > [...] > > The configuration you should try first is to connect each disks pair > using HAST and create ZFS pool on top of those HAST devices. > > Let's assume you have 4 data disks (da0-da3), 2 SSD disks for ZIL > (da4-da5) and 2 SSD disks for L2ARC (da6-da7). > > Then you create the following HAST devices: > > /dev/hast/data0 = MachineA(da0) + MachineB(da0) > /dev/hast/data1 = MachineA(da1) + MachineB(da1) > /dev/hast/data2 = MachineA(da2) + MachineB(da2) > /dev/hast/data3 = MachineA(da3) + MachineB(da3) > > /dev/hast/slog0 = MachineA(da4) + MachineB(da4) > /dev/hast/slog1 = MachineA(da5) + MachineB(da5) > > /dev/hast/cache0 = MachineA(da6) + MachineB(da6) > /dev/hast/cache1 = MachineA(da7) + MachineB(da7) > > And then you create ZFS pool of your choice. Here you specify > redundancy, so if there is any you will have ZFS self-healing: > > zpool create tank raidz1 hast/data{0,1,2,3} log mirror hast/slog{0,1} cache hast/cache{0,1} Raidz on top of hast is one possibility, although raidz does add overhead to the equation. I'll have to find out how much. It's also possible to just mirror twice as well, although that would essentially mean that every write would go over the wire twice. 
Raidz might be the better bargain here; that would only increase the number of writes on the wire by a factor of 1/n, where n is the number of data drives, at the cost of CPU to calculate parity. Testing will tell.

>> 1. Hardware failure management. In case of a hardware failure, I'm not exactly sure what will happen, but I suspect the single-disk RAID-0 array containing the failed disk will simply fail. I assume it will still exist, but refuse to be read or written. In this situation I understand HAST will handle this by routing all I/O to the secondary server, in case the disk on the primary side dies, or simply by cutting off replication if the disk on the secondary server fails.
> HAST sends all write requests to both nodes (if the secondary is present)
> and read requests only to the primary node. In some cases reads can be sent
> to the secondary node, for example when synchronization is in progress and
> the secondary has more recent data, or reading from the local disk failed
> (either because of a single EIO or the entire disk went bad).
>
> In other words, HAST itself can handle the failure of one of the mirrored disks.
>
> If an entire hast/ device dies for some reason (e.g. the secondary is down
> and the local disk dies), then ZFS redundancy kicks in.

Very well, that is how failures are handled. But how do we *recover* from a disk failure? Without taking the entire server down, that is.

I already know how to deal with my HBA to hot-add and hot-remove devices. And dealing with hardware failures on the *secondary* node seems fairly straightforward; after all, it doesn't really matter if the mirroring becomes degraded for a few seconds while I futz around with restarting hastd and such. The primary sees the secondary disappear for a few seconds; when it comes back, it will just truck all of the dirty data over. Big deal.

But what if the drive fails on the primary side? On the primary server I can't just restart hastd at my leisure; the underlying filesystem relies on it not going away.
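Per's estimate above (double mirroring doubles the wire traffic; raidz1 only adds 1/n for n data drives) can be sanity-checked with quick arithmetic. A sketch; the disk counts are taken from Pawel's example of raidz1 over four HAST devices (3 data + 1 parity):

```shell
#!/bin/sh
# Rough wire-traffic check: with HAST under every vdev, replication
# traffic is proportional to the raw bytes ZFS writes to the vdevs.

# Mirror of two HAST devices: every block is written twice -> 200%.
mirror_pct=200

# raidz1 over d+1 HAST devices writes (d+1)/d times the data.
raidz_pct() {
    d=$1
    echo $(( (d + 1) * 100 / d ))
}

echo "double mirror:          ${mirror_pct}%"
echo "raidz1, 3 data disks:   $(raidz_pct 3)%"   # an increase of 1/n, n=3
```

So the 4-disk raidz1 ships roughly 133% of the logical write volume over the wire, versus 200% for mirrored pairs of HAST mirrors, before counting parity CPU cost.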
Ideally I'd want to just be able to tell hast that "hey, there's a new drive you can use, just suck over all the data from the secondary onto this drive, and route I/O from the secondary in the meantime" - without restarting hastd. Is this possible?

Of course I could just avoid the problem by failing over the entire server whenever I want to replace hardware on the primary, making it the secondary. But causing a 20 second (just guessing about the actual failover time here) I/O hiccup in my virtualization environment just because I want to swap a hard drive seems unreasonable.

These unresolved questions are why I would feel safer simply running ZFS on the metal and running HAST on Zvols. :-) If running ZFS on top of a Zvol is a bad idea, there is always the option of simply exporting the HAST resource backed by Zvols as an iSCSI target and running VMFS on the drives. But that does mean losing some of the cooler features of ZFS, which is a shame.

From owner-freebsd-fs@FreeBSD.ORG Thu May 19 23:09:51 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 28E2D1065673 for ; Thu, 19 May 2011 23:09:51 +0000 (UTC) (envelope-from pawel@dawidek.net) Received: from mail.garage.freebsd.pl (60.wheelsystems.com [83.12.187.60]) by mx1.freebsd.org (Postfix) with ESMTP id C610E8FC18 for ; Thu, 19 May 2011 23:09:50 +0000 (UTC) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 7F10D45E86; Fri, 20 May 2011 01:09:48 +0200 (CEST) Received: from localhost (89-73-195-149.dynamic.chello.pl [89.73.195.149]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 4B3D145CDC; Fri, 20 May 2011 01:09:43 +0200 (CEST) Date: Fri, 20 May 2011 01:09:21 +0200 From: Pawel Jakub Dawidek To: Per von Zweigbergk Message-ID: <20110519230921.GF2100@garage.freebsd.pl> References:
<85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> <20110519181436.GB2100@garage.freebsd.pl> <4DD5A1CF.70807@itassistans.se> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="OZkY3AIuv2LYvjdk" Content-Disposition: inline In-Reply-To: <4DD5A1CF.70807@itassistans.se> X-OS: FreeBSD 9.0-CURRENT amd64 User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-0.6 required=4.5 tests=BAYES_00,RCVD_IN_SORBS_DUL autolearn=no version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: HAST + ZFS self healing? Hot spares? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 23:09:51 -0000 --OZkY3AIuv2LYvjdk Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Fri, May 20, 2011 at 01:03:43AM +0200, Per von Zweigbergk wrote:
> Very well, that is how failures are handled. But how do we *recover*
> from a disk failure? Without taking the entire server down that is.

HAST opens the local disk only when changing role to primary, or when changing role to secondary and accepting a connection from the primary. If your disk fails, switch to init for that HAST device, replace your disk, call 'hastctl create ' and switch back to primary or secondary.

> I already know how to deal with my HBA to hot-add and hot-remove
> devices. And how to deal with hardware failures on the *secondary*
> node seems fairly straightforward, after all, it doesn't really
> matter if the mirroring becomes degraded for a few seconds while I
> futz around with restarting hastd and such. The primary sees the
> secondary disappear a few seconds, when it comes back, it will just
> truck all of the dirty data over. Big deal.
You don't need to restart hastd or stop the secondary. Just use hastctl to change the role to init for the failing resource.

> But what if the drive fails on the primary side? On the primary
> server I can't just restart hastd at my leisure, the underlying
> filesystem relies on it not going away. Ideally I'd want to just be
> able to tell hast that "hey, there's a new drive you can use, just
> suck over all the data from the secondary onto this drive, and route
> I/O from the secondary in the meantime" - without restarting hastd.
> Is this possible?

Yes.

> These unresolved questions is why I would feel safer in simply
> running ZFS on the metal and running HAST on Zvols. :-) If running
> ZFS on top of a Zvol is a bad idea, there is always the option of
> simply exporting the HAST resource backed by Zvols as an iSCSI
> target and run VMFS on the drives. But that does mean losing some of
> the cooler features of ZFS which is a shame.

I'd suggest testing the configuration that seems best in theory and then seeing whether it works for you or not. If not, then we can wonder what to do next.

--
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!
http://yomoli.com --OZkY3AIuv2LYvjdk Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (FreeBSD) iEYEARECAAYFAk3VoyEACgkQForvXbEpPzRFBQCeJxsKlLR3h7/8+X9nHfVmKpXO EzIAoJWya2Cp6o58JnOENxViv3QRFkPX =tjW/ -----END PGP SIGNATURE----- --OZkY3AIuv2LYvjdk-- From owner-freebsd-fs@FreeBSD.ORG Thu May 19 23:22:59 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DB83E1065670; Thu, 19 May 2011 23:22:58 +0000 (UTC) (envelope-from fjwcash@gmail.com) Received: from mail-yx0-f182.google.com (mail-yx0-f182.google.com [209.85.213.182]) by mx1.freebsd.org (Postfix) with ESMTP id 6C8CE8FC12; Thu, 19 May 2011 23:22:58 +0000 (UTC) Received: by yxl31 with SMTP id 31so1427409yxl.13 for ; Thu, 19 May 2011 16:22:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=WT7LfSUGZjGBQDHfJxmgmjseHDgVbSiZMZ4g+ma8V9A=; b=TtRMCQb8hEOhu3hSkGj+LoSJxSoKD2oUkZibBE8/AUIltro85SdoJ6g9Tv38mdF8vV Tu6hjLkiityPbfsSdJ5eo1nK/0/d1+aQOl1xH00yTxtevy9ycoNAkuFrU4tG6CjQ0kYf TGgWXotBWdS2oK4Mg6ffOM/59swPdiNFAiZIw= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=D04O2Ce5UiWVQXWXkz0mOk8VyO4uOu4VZfHNLuTdofMxxURYT9emGjUfYWCeVBT6Li I3T43uJzqDAPCXAqTnyYPMh0JJvnLPD5CPOXn6hYejD9UtpYcR9A9185hbuZlQQu7DiP MAzHHglLQqnGgXaAlgPN1p5el/iIqEar//GLM= MIME-Version: 1.0 Received: by 10.90.147.18 with SMTP id u18mr251911agd.95.1305847377816; Thu, 19 May 2011 16:22:57 -0700 (PDT) Received: by 10.90.138.17 with HTTP; Thu, 19 May 2011 16:22:57 -0700 (PDT) In-Reply-To: <20110519230921.GF2100@garage.freebsd.pl> References: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> <20110519181436.GB2100@garage.freebsd.pl> 
<4DD5A1CF.70807@itassistans.se> <20110519230921.GF2100@garage.freebsd.pl> Date: Thu, 19 May 2011 16:22:57 -0700 Message-ID: From: Freddie Cash To: Pawel Jakub Dawidek Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: HAST + ZFS self healing? Hot spares? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 23:22:59 -0000 On Thu, May 19, 2011 at 4:09 PM, Pawel Jakub Dawidek wrote: > On Fri, May 20, 2011 at 01:03:43AM +0200, Per von Zweigbergk wrote: >> Very well, that is how failures are handled. But how do we *recover* >> from a disk failure? Without taking the entire server down that is. > > HAST opens local disk only when changing role to primary or changing > role to secondary and accepting connection from primary. > If your disk fails, switch to init for that HAST device, replace you > disk, call 'hastctl create ' and switch back to primary or > secondary. > >> I already know how to deal with my HBA to hot-add and hot-remove >> devices. And how to deal with hardware failures on the *secondary* >> node seems fairly straightforward, after all, it doesn't really >> matter if the mirroring becomes degraded for a few seconds while I >> futz around with restarting hastd and such. The primary sees the >> secondary disappear a few seconds, when it comes back, it will just >> truck all of the dirty data over. Big deal. > > You don't need to restart hastd or stop secondary. Just use hastctl to > change role to init for the failing resource. This process works exceedingly well. Just went through it a week or so ago. 
You just need to think in layers, the way GEOM works, comparing the non-HAST setup with the HAST setup.

The non-HAST process for replacing a disk in a ZFS pool is:
- zpool offline poolname diskname
- remove dead disk
- insert new disk
- partition, label, etc as needed
- zpool replace poolname olddisk newdisk
- wait for resilver to complete

With HAST, there's only a couple of small changes needed:
- zpool offline poolname diskname        <-- removes the /dev/hast node from the pool
- hastctl role init diskname             <-- removes the /dev/hast node
- remove dead disk
- insert new disk
- partition, label, etc as needed
- hastctl create diskname                <-- creates the hast resource
- hastctl role primary diskname          <-- creates the new /dev/hast node
- zpool replace poolname olddisk newdisk <-- adds the /dev/hast node to pool
- wait for resilver to complete

The downside to this setup is that the data on the disk in the secondary node is lost, as the resilver of the disk on the primary node recreates all the data on the secondary node. But, at least then you know the data is good on both disks in the HAST resource.
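The HAST steps above can be collected into a single runbook script. A sketch only: the pool name `tank` and resource name `data0` are examples, `RUN` defaults to `echo` so it just prints the sequence (clear it to execute for real, as root), and it uses `hastctl create` for re-initializing the resource metadata, per Pawel's earlier note.

```shell
#!/bin/sh
# Dry-run sketch of the HAST disk-replacement sequence.
# RUN defaults to echo; set RUN= to actually run the commands.
RUN=${RUN:-echo}
POOL=tank
RES=data0                                  # HAST resource backing the failed disk

replace_disk() {
    $RUN zpool offline "$POOL" "hast/$RES" # drop the /dev/hast node from the pool
    $RUN hastctl role init "$RES"          # tears down /dev/hast/$RES
    # ...physically swap the disk, partition/label as needed, here...
    $RUN hastctl create "$RES"             # re-initialize HAST metadata on the new disk
    $RUN hastctl role primary "$RES"       # /dev/hast/$RES reappears
    $RUN zpool replace "$POOL" "hast/$RES" # resilver onto the new device
}

replace_disk
```

As Pawel points out in his reply, the final `zpool replace` may be avoidable: HAST can resynchronize the new primary disk from the secondary, after which `zpool online $POOL hast/$RES` should suffice.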
-- Freddie Cash fjwcash@gmail.com From owner-freebsd-fs@FreeBSD.ORG Thu May 19 23:26:19 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 966481065674 for ; Thu, 19 May 2011 23:26:19 +0000 (UTC) (envelope-from pawel@dawidek.net) Received: from mail.garage.freebsd.pl (60.wheelsystems.com [83.12.187.60]) by mx1.freebsd.org (Postfix) with ESMTP id 3F2478FC0C for ; Thu, 19 May 2011 23:26:19 +0000 (UTC) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 0B9C045685; Fri, 20 May 2011 01:26:18 +0200 (CEST) Received: from localhost (89-73-195-149.dynamic.chello.pl [89.73.195.149]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 0255A45EA4; Fri, 20 May 2011 01:26:12 +0200 (CEST) Date: Fri, 20 May 2011 01:25:51 +0200 From: Pawel Jakub Dawidek To: Freddie Cash Message-ID: <20110519232551.GG2100@garage.freebsd.pl> References: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> <20110519181436.GB2100@garage.freebsd.pl> <4DD5A1CF.70807@itassistans.se> <20110519230921.GF2100@garage.freebsd.pl> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="yRA+Bmk8aPhU85Qt" Content-Disposition: inline In-Reply-To: X-OS: FreeBSD 9.0-CURRENT amd64 User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-0.6 required=4.5 tests=BAYES_00,RCVD_IN_SORBS_DUL autolearn=no version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: HAST + ZFS self healing? Hot spares? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 23:26:19 -0000 --yRA+Bmk8aPhU85Qt Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Thu, May 19, 2011 at 04:22:57PM -0700, Freddie Cash wrote:
> With HAST, there's only a couple of small changes needed:
> - zpool offline poolname diskname        <-- removes the /dev/hast node from the pool
> - hastctl role init diskname             <-- removes the /dev/hast node
> - remove dead disk
> - insert new disk
> - partition, label, etc as needed
> - hastctl create diskname                <-- creates the hast resource
> - hastctl role primary diskname          <-- creates the new /dev/hast node
> - zpool replace poolname olddisk newdisk <-- adds the /dev/hast node to pool
> - wait for resilver to complete
>
> The downside to this setup is that the data on the disk in the secondary
> node is lost, as the resilver of the disk on the primary node recreates all
> the data on the secondary node. But, at least then you know the data is
> good on both disks in the HAST resource.

It shouldn't be the case. The primary HAST node should synchronize data from the secondary HAST node, as the primary has the new disk. This should allow you to simply 'zpool online poolname disk' instead of replacing it.

It doesn't work that way for you?

--
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!
http://yomoli.com --yRA+Bmk8aPhU85Qt Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (FreeBSD) iEYEARECAAYFAk3Vpv8ACgkQForvXbEpPzRzYgCg0c70YunwrcHfbE9BGx7QvDAz pl8AnRZlWsk6AINDg6wREmHSWwyd/jNm =XkmN -----END PGP SIGNATURE----- --yRA+Bmk8aPhU85Qt-- From owner-freebsd-fs@FreeBSD.ORG Thu May 19 23:27:36 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5290C106566B; Thu, 19 May 2011 23:27:36 +0000 (UTC) (envelope-from pvz@itassistans.se) Received: from zcs1.itassistans.net (zcs1.itassistans.net [212.112.191.37]) by mx1.freebsd.org (Postfix) with ESMTP id 0123C8FC19; Thu, 19 May 2011 23:27:35 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by zcs1.itassistans.net (Postfix) with ESMTP id C6B23C01CE; Fri, 20 May 2011 01:27:34 +0200 (CEST) X-Virus-Scanned: amavisd-new at zcs1.itassistans.net Received: from zcs1.itassistans.net ([127.0.0.1]) by localhost (zcs1.itassistans.net [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id D7ZCZRyCgKYj; Fri, 20 May 2011 01:27:34 +0200 (CEST) Received: from [192.168.1.239] (c213-89-160-61.bredband.comhem.se [213.89.160.61]) by zcs1.itassistans.net (Postfix) with ESMTPSA id 306CCC0181; Fri, 20 May 2011 01:27:34 +0200 (CEST) Mime-Version: 1.0 (Apple Message framework v1084) From: Per von Zweigbergk In-Reply-To: Date: Fri, 20 May 2011 01:27:32 +0200 Message-Id: <5B27EAAB-5D23-4844-B7C7-F83289BCABE7@itassistans.se> References: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> <20110519181436.GB2100@garage.freebsd.pl> <4DD5A1CF.70807@itassistans.se> <20110519230921.GF2100@garage.freebsd.pl> To: Freddie Cash X-Mailer: Apple Mail (2.1084) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org, Pawel Jakub Dawidek Subject: Re: HAST + ZFS self healing? 
Hot spares? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 23:27:36 -0000

On 20 May 2011, at 01:22, Freddie Cash wrote:
> With HAST, there's only a couple of small changes needed:
> - zpool offline poolname diskname        <-- removes the /dev/hast node from the pool

What you're describing here is not what I asked about, which was activating a hot spare drive without bringing down the HAST resource.

You're describing taking the entire array offline while you perform work on it. Which is fine in a lot of cases, but not exactly what I'd call HA. :-)

From owner-freebsd-fs@FreeBSD.ORG Thu May 19 23:28:07 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BF1E31065673; Thu, 19 May 2011 23:28:07 +0000 (UTC) (envelope-from fjwcash@gmail.com) Received: from mail-yi0-f54.google.com (mail-yi0-f54.google.com [209.85.218.54]) by mx1.freebsd.org (Postfix) with ESMTP id 1F6748FC24; Thu, 19 May 2011 23:28:07 +0000 (UTC) Received: by yie12 with SMTP id 12so1426295yie.13 for ; Thu, 19 May 2011 16:28:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=ym8u8IGtfSN/rRKs6yU+hYsBIJeYFC6VGW1jWFWxQu0=; b=uKWjgXfFLCWGiek7IZHTk9TYnrr0j5vbLWGJrn45yAzMqkuFkuA45q1euPaQzJKdwK CyuMHtX+pV8Ly7G6mDHR1a54sTH6HqoKOT01qwGSrWG7CbUObxJ7nnGQK6OLEzqia3IF wafelRF/sylFZtgTEvQKy3sUTynJmuCaDR1KQ= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=Ox9r4SnS2COjOVXzPfC8EyqmNFqYaE5qZlYCT+GxuuKe33U0dntH0GP0+0NhzkyoCw CMapFNcm708yq3aSBHN93rwthMMLQ1eBmd52NItcrYhhvUijY/dK4z4lNRDqkH9903t/
Yss1tOzoHgda7OFiq3vnG/AB8vaQMqzDJAZdI= MIME-Version: 1.0 Received: by 10.90.147.18 with SMTP id u18mr257479agd.95.1305847686607; Thu, 19 May 2011 16:28:06 -0700 (PDT) Received: by 10.90.138.17 with HTTP; Thu, 19 May 2011 16:28:06 -0700 (PDT) In-Reply-To: <20110519232551.GG2100@garage.freebsd.pl> References: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> <20110519181436.GB2100@garage.freebsd.pl> <4DD5A1CF.70807@itassistans.se> <20110519230921.GF2100@garage.freebsd.pl> <20110519232551.GG2100@garage.freebsd.pl> Date: Thu, 19 May 2011 16:28:06 -0700 Message-ID: From: Freddie Cash To: Pawel Jakub Dawidek Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: HAST + ZFS self healing? Hot spares? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 23:28:07 -0000 On Thu, May 19, 2011 at 4:25 PM, Pawel Jakub Dawidek wrote: > On Thu, May 19, 2011 at 04:22:57PM -0700, Freddie Cash wrote: > > With HAST, there's only a couple of small changes needed: > > - zpool offline poolname diskname <-- removes the /dev/hast node > > from the pool > > - hastctl role init diskname <-- removes the /dev/hast node > > - remove dead disk > > - insert new disk > > - partition, label, etc as needed > > - hastctl role create diskname <-- creates the hast resource > > - hastctl role primary diskname <-- creates the new /dev/hast > node > > - zpool replace poolname olddisk newdisk <-- adds the /dev/hast node to > > pool > > - wait for resilver to complete > > > > The downside to this setup is that the data on the disk in the secondary > > node is lost, as the resilver of the disk on the primary node recreates > all > > the data on the secondary node. But, at least then you know the data is > > good on both disks in the HAST resource. > > It shouldn't be the case. 
Primary HAST node should synchronize data from > secondary HAST node, as primary has new disk. This should allow you to > simply 'zpool online poolname disk' instead of replacing it. > It doesn't work that way for you? > Oh? Never thought to try that. But, I guess that does make sense ... and is the point of having the redundant data in the other server ... Also, in my tests, I was running a degraded HAST setup (only 1 server), so it wouldn't have been possible to do. Will have to remember that for the next time I'm playing with HAST (the box is currently a non-HAST setup). -- Freddie Cash fjwcash@gmail.com From owner-freebsd-fs@FreeBSD.ORG Thu May 19 23:31:49 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2AFC8106564A; Thu, 19 May 2011 23:31:49 +0000 (UTC) (envelope-from pvz@itassistans.se) Received: from zcs1.itassistans.net (zcs1.itassistans.net [212.112.191.37]) by mx1.freebsd.org (Postfix) with ESMTP id D20768FC1C; Thu, 19 May 2011 23:31:48 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by zcs1.itassistans.net (Postfix) with ESMTP id 82310C01CE; Fri, 20 May 2011 01:31:47 +0200 (CEST) X-Virus-Scanned: amavisd-new at zcs1.itassistans.net Received: from zcs1.itassistans.net ([127.0.0.1]) by localhost (zcs1.itassistans.net [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id upmAsX0955-T; Fri, 20 May 2011 01:31:47 +0200 (CEST) Received: from [192.168.1.239] (c213-89-160-61.bredband.comhem.se [213.89.160.61]) by zcs1.itassistans.net (Postfix) with ESMTPSA id 07771C0181; Fri, 20 May 2011 01:31:47 +0200 (CEST) Mime-Version: 1.0 (Apple Message framework v1084) Content-Type: text/plain; charset=us-ascii From: Per von Zweigbergk In-Reply-To: <20110519230921.GF2100@garage.freebsd.pl> Date: Fri, 20 May 2011 01:31:46 +0200 Content-Transfer-Encoding: 7bit Message-Id: References: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> 
<20110519181436.GB2100@garage.freebsd.pl> <4DD5A1CF.70807@itassistans.se> <20110519230921.GF2100@garage.freebsd.pl> To: Pawel Jakub Dawidek X-Mailer: Apple Mail (2.1084) Cc: freebsd-fs@freebsd.org Subject: Re: HAST + ZFS self healing? Hot spares? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 23:31:49 -0000

On 20 May 2011, at 01:09, Pawel Jakub Dawidek wrote:
> On Fri, May 20, 2011 at 01:03:43AM +0200, Per von Zweigbergk wrote:
>> Very well, that is how failures are handled. But how do we *recover*
>> from a disk failure? Without taking the entire server down that is.
>
> HAST opens local disk only when changing role to primary or changing
> role to secondary and accepting connection from primary.
> If your disk fails, switch to init for that HAST device, replace you
> disk, call 'hastctl create ' and switch back to primary or
> secondary.

If I were to do 'hastctl role init foo' to switch from primary->init, /dev/hast/foo would go away, and this would degrade whatever file system or volume manager you're running on top of HAST. (I just tried this in my HAST lab environment.) The scenario I was describing was a primary disk failure: I want to keep being able to access /dev/hast/foo while I replace the primary disk.

I still don't see how it's possible to hot-replace a failed drive in the server that's primary at the time; there just doesn't seem to be any way of bringing in a new disk on the primary side without bringing down the HAST resource.
From owner-freebsd-fs@FreeBSD.ORG Fri May 20 00:11:12 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A94D11065673; Fri, 20 May 2011 00:11:12 +0000 (UTC) (envelope-from pvz@itassistans.se) Received: from zcs1.itassistans.net (zcs1.itassistans.net [212.112.191.37]) by mx1.freebsd.org (Postfix) with ESMTP id 3535A8FC1A; Fri, 20 May 2011 00:11:12 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by zcs1.itassistans.net (Postfix) with ESMTP id 2E5A2C01CE; Fri, 20 May 2011 02:11:11 +0200 (CEST) X-Virus-Scanned: amavisd-new at zcs1.itassistans.net Received: from zcs1.itassistans.net ([127.0.0.1]) by localhost (zcs1.itassistans.net [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id j359LKl7CxRR; Fri, 20 May 2011 02:11:07 +0200 (CEST) Received: from [192.168.1.239] (c213-89-160-61.bredband.comhem.se [213.89.160.61]) by zcs1.itassistans.net (Postfix) with ESMTPSA id 445E6C0181; Fri, 20 May 2011 02:11:07 +0200 (CEST) Mime-Version: 1.0 (Apple Message framework v1084) From: Per von Zweigbergk In-Reply-To: <5B27EAAB-5D23-4844-B7C7-F83289BCABE7@itassistans.se> Date: Fri, 20 May 2011 02:11:06 +0200 Message-Id: <61D2B7A3-1778-4A42-8983-8C325D2F849E@itassistans.se> References: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> <20110519181436.GB2100@garage.freebsd.pl> <4DD5A1CF.70807@itassistans.se> <20110519230921.GF2100@garage.freebsd.pl> <5B27EAAB-5D23-4844-B7C7-F83289BCABE7@itassistans.se> To: Per von Zweigbergk X-Mailer: Apple Mail (2.1084) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org, Pawel Jakub Dawidek Subject: Re: HAST + ZFS self healing? Hot spares? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2011 00:11:12 -0000

On 20 May 2011, at 01.27, Per von Zweigbergk wrote:
> You're describing taking the entire array offline while you perform work on it.

My apologies, I was a bit too quick reading what you (Freddie Cash) wrote.

What you're describing is relying on ZFS's own redundancy while you replace the failed disk, bringing down the entire HAST resource just so you can replace one of the two failed disks. The only reason the ZFS array continues to function is because it's redundant in ZFS itself.

Ideally, the HAST resource could continue to remain operational while the failed disk was replaced. After all, it can remain operational while the primary disk has failed, and it can remain operational while the data is being resynchronized, so why would the resource need to be brought down just to transition between these two states? I guess it's because HAST isn't quite "finished" yet feature-wise, and that particular feature does not yet exist.

Still, I suppose this is good enough; it shows that raidz:ing together a bunch of HAST mirrors solves one and a half of my operational problems - replacing failed drives (by momentarily downing the whole HAST resource while work is being done) and providing checksumming capability (although not self-healing).

The setup described (a bunch of HAST mirrors in a raidz) will not self-heal entirely. Imagine if a bit error occurred while writing to one of the secondary disks. Since that data is never read by ZFS or HAST, the error would remain undetected.
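A bit error sitting unread on the secondary could, in principle, be caught by scrubbing from that side after a switchover. A rough, untested sketch, with "tank" and "foo" as placeholder pool and resource names:

```
# On the current primary: verify the locally readable copy first.
zpool scrub tank
# (wait for 'zpool status tank' to report the scrub as finished)
# Then fail over so the other node's disks can be read and checked:
zpool export tank
hastctl role secondary foo
# On the other node: hastctl role primary foo; zpool import tank;
# zpool scrub tank -- now the former secondary's blocks are verified too.
```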
To ensure data integrity on both the primary and secondary servers, you'd have to fail over the servers once every N days/weeks/months (depending on your operational requirements) and perform a zfs scrub on "both sides" of the HAST resource, as part of regular maintenance. It'd probably even be scriptable, assuming you can live with a few seconds of scheduled downtime during the switchover.

From owner-freebsd-fs@FreeBSD.ORG Fri May 20 03:17:23 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 41877106564A for ; Fri, 20 May 2011 03:17:23 +0000 (UTC) (envelope-from mckusick@mckusick.com) Received: from chez.mckusick.com (chez.mckusick.com [64.81.247.49]) by mx1.freebsd.org (Postfix) with ESMTP id 1C3778FC16 for ; Fri, 20 May 2011 03:17:23 +0000 (UTC) Received: from chez.mckusick.com (localhost [127.0.0.1]) by chez.mckusick.com (8.14.3/8.14.3) with ESMTP id p4K3G6EU039569; Thu, 19 May 2011 20:16:06 -0700 (PDT) (envelope-from mckusick@chez.mckusick.com) Message-Id: <201105200316.p4K3G6EU039569@chez.mckusick.com> To: lev@freebsd.org In-reply-to: <1606289061.20110519211755@serebryakov.spb.ru> Date: Thu, 19 May 2011 20:16:06 -0700 From: Kirk McKusick X-Spam-Status: No, score=1.3 required=5.0 tests=MISSING_MID,PLING_QUERY, UNPARSEABLE_RELAY autolearn=no version=3.2.5 X-Spam-Level: * X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on chez.mckusick.com Cc: freebsd-fs@freebsd.org Subject: Re: Snapshots fail on large FFS2 volumes regulary -- how to backup /usr/home?! X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2011 03:17:23 -0000

> Date: Thu, 19 May 2011 21:17:55 +0400
> From: Lev Serebryakov
> To: freebsd-fs@freebsd.org
>
> Hello, Freebsd-fs.
>
> I have a /usr/home partition on my new server which is 400GiB (only
> 17GiB is used). It is UFS2, with SoftUpdates enabled.
>
> I want to back it up on the live system, but 4 times out of 5 I get
> (after 10-12 minutes of waiting! Oh my, 10 minutes to create a snapshot!):
>
> mksnap_ffs: Cannot create snapshot /usr/home/.snap/dump_snapshot: Resource temporarily unavailable
> dump: Cannot create /usr/home/.snap/dump_snapshot: No such file or directory
>
> It is FreeBSD 8.2-STABLE/amd64, 8GiB of memory.
>
> I've never encountered such a problem on the previous server, which has
> about 80GiB (with 20GiB used).
>
> --
> // Black Lion AKA Lev Serebryakov
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

Given the size of your storage, you should consider using ZFS, which is better able to handle such large filesystems.

My second suggestion is that you try building UFS2 with 32K blocks and 4K fragments. That will reduce the number of resources needed to take the snapshot.
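Kirk's second suggestion corresponds to re-creating the filesystem with newfs along these lines. The device name is a placeholder, and newfs destroys the existing data, so this is only a sketch of the invocation; -U enables soft updates, which Lev already has on.

```
# 32K blocks and 4K fragments instead of the 16K/2K defaults
newfs -U -b 32768 -f 4096 /dev/ada0p7
```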
Kirk McKusick From owner-freebsd-fs@FreeBSD.ORG Fri May 20 05:54:38 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BEBD11065672 for ; Fri, 20 May 2011 05:54:38 +0000 (UTC) (envelope-from bf1783@googlemail.com) Received: from mail-vx0-f182.google.com (mail-vx0-f182.google.com [209.85.220.182]) by mx1.freebsd.org (Postfix) with ESMTP id 78CD08FC12 for ; Fri, 20 May 2011 05:54:38 +0000 (UTC) Received: by vxc34 with SMTP id 34so3378580vxc.13 for ; Thu, 19 May 2011 22:54:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=gamma; h=domainkey-signature:mime-version:reply-to:date:message-id:subject :from:to:cc:content-type; bh=IRvRePyx14hJhHOFdTqcHERjoeNNtdFkDt5yPxbErDk=; b=RAe6kGuDu4YvhlKIFOHb74blzXHvHoZEI3neQFHum3RnGMAjQog4Zkdc+ny+0Tm71x 1lMq9A6zHy6tXT2ED/dnNfnIFVHesyG+h07OL2UT6o6LEVcfoNA36SQlAaDZLLPIcswP 8+YLXjd0KnOENsqn+xqjvv+jUN0jLE1OdCMM4= DomainKey-Signature: a=rsa-sha1; c=nofws; d=googlemail.com; s=gamma; h=mime-version:reply-to:date:message-id:subject:from:to:cc :content-type; b=nN+XvlGtbw4Gfzs06Qo/j96zTL3DwB7h0n9h8yKMWPl2AIaYRbr6W7+Uwy4ZlwJYE0 M+QGFFKqxWm+9LnNbdN9xT8rqcto/bOruVcDZVcH+zoSXHw5tDl9G6I9XQg5cLOyMF7+ 8dMHRJh1Xqal9hQUgI1ygp5/4ML1rE0G5K5Lw= MIME-Version: 1.0 Received: by 10.52.97.7 with SMTP id dw7mr5762671vdb.109.1305869527287; Thu, 19 May 2011 22:32:07 -0700 (PDT) Received: by 10.52.110.231 with HTTP; Thu, 19 May 2011 22:32:07 -0700 (PDT) Date: Fri, 20 May 2011 01:32:07 -0400 Message-ID: From: "b. f." 
To: freebsd-questions@FreeBSD.org, freebsd-fs@FreeBSD.org Content-Type: text/plain; charset=ISO-8859-1 Cc: grarpamp Subject: Re: UDF and DVD's X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: bf1783@gmail.com List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2011 05:54:38 -0000 grarpamp wrote: ... > I'm guessing the current state within FreeBSD means that I can > neither read, nor create, or write, readable (compatible) images > at this, or any given, UDF level? ... > > Is this a blocker for FreeBSD? > > For me, at least, minimally, that seems to be the case... as I now > have no way to rip, mount and add the files to this DVD that I would > like to add. Except to use Windows, which I consider to be unreliable > at best. Obviously, the base system UDF support is minimal and needs some work. But you may find that ports like sysutils/cdrtools[-devel] or sysutils/udfclient will allow you to do much of what you want to do. b. 
From owner-freebsd-fs@FreeBSD.ORG Fri May 20 08:29:39 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 64C3C106566C for ; Fri, 20 May 2011 08:29:39 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id 020378FC1A for ; Fri, 20 May 2011 08:29:39 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:c0e1:7989:b1b9:78c3]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id 25E924AC1C; Fri, 20 May 2011 12:29:37 +0400 (MSD) Date: Fri, 20 May 2011 12:29:33 +0400 From: Lev Serebryakov Organization: FreeBSD Project X-Priority: 3 (Normal) Message-ID: <795474996.20110520122933@serebryakov.spb.ru> To: Kirk McKusick In-Reply-To: <201105200316.p4K3G6EU039569@chez.mckusick.com> References: <1606289061.20110519211755@serebryakov.spb.ru> <201105200316.p4K3G6EU039569@chez.mckusick.com> MIME-Version: 1.0 Content-Type: text/plain; charset=windows-1251 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: Snapshots fail on large FFS2 volumes regulary -- how to backup /usr/home?! X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2011 08:29:39 -0000

Hello, Kirk.
You wrote 20 May 2011, 7:16:06:
> Given the size of your storage, you should consider using ZFS,
> which is better able to handle such large filesystems.

Yes, I know that everybody loves ZFS now, but it doesn't have two characteristics which are important for my installation:

(1) The nodump flag, or any other way to mark directories and files as not important for backup. "zfs send" is an all-or-nothing solution, and right now my users use "nodump" to reduce backup sizes greatly.

(2) Incremental backups with little local state (zfs send can send the difference between snapshots, but the system needs to keep the old snapshot around for this).

The second one is not so important yet, because there is a lot of free space, but "zfs send" can not do anything about (1) :( All other backup solutions don't store full FS information, as they work on the file level, not the FS level :(

> My second suggestion is that you try building UFS2 with 32K
> blocks and 4K fragments. That will reduce the number of resources
> needed to take the snapshot.

I'll try this. But I remember that some time ago (around 7.1-STABLE) there was a deadlock in the kernel memory allocator when different UFS filesystems on the system used different block sizes...
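For reference, the incremental-send mechanics behind objection (2) look roughly like this; the dataset and host names are invented examples, not anything from this thread:

```
zfs snapshot tank/home@monday
zfs send tank/home@monday | ssh backuphost zfs receive backup/home
# ...later: the incremental send needs @monday to still exist locally,
# which is the "store the old snapshot" cost mentioned above.
zfs snapshot tank/home@tuesday
zfs send -i tank/home@monday tank/home@tuesday | ssh backuphost zfs receive backup/home
```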
--=20 // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Fri May 20 08:42:54 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3C84A106564A; Fri, 20 May 2011 08:42:54 +0000 (UTC) (envelope-from grarpamp@gmail.com) Received: from mail-pw0-f54.google.com (mail-pw0-f54.google.com [209.85.160.54]) by mx1.freebsd.org (Postfix) with ESMTP id 0CD598FC16; Fri, 20 May 2011 08:42:53 +0000 (UTC) Received: by pwj8 with SMTP id 8so2076366pwj.13 for ; Fri, 20 May 2011 01:42:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=3COreA6r/gy3USqKbuADG1H/LNZh2ZVZkMbijgM34xs=; b=TofCb5dlSkRqQMxsHuYbSiesC60WzYyPrV5XvCRCY8gfCCp1hLu4XS/WcKnJA1tM4X IBgLepZdp8/dwNjIOcChs4N4anCU3015t99cWmsd2C0OHeWq22usXVu9W0yTT6S97teG +BV11DfuPezATJqfY2ERhlXJuz1i01blme6UU= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=IhAVzTnh13UQmoDotpBw05RrYazhN3iB94n4MN9aU7Y0uZSytdZD21L0pcqxGuv/wD kjbGR6Hepu6gNW+P6K5dRu6eC21ZQMUT0O+kEYzZgu3HzyGSVLCRZSdQNzR79zw48jyh y5GPLdZrJS4rUcpHobrEj5cDLIjRKA7g2q02Q= MIME-Version: 1.0 Received: by 10.142.230.6 with SMTP id c6mr2585560wfh.415.1305880973697; Fri, 20 May 2011 01:42:53 -0700 (PDT) Received: by 10.142.157.2 with HTTP; Fri, 20 May 2011 01:42:53 -0700 (PDT) In-Reply-To: <20110519091433.GA94053@icarus.home.lan> References: <20110519091433.GA94053@icarus.home.lan> Date: Fri, 20 May 2011 04:42:53 -0400 Message-ID: From: grarpamp To: Jeremy Chadwick Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs@freebsd.org, freebsd-questions@freebsd.org Subject: Re: UDF and DVD's X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , 
List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2011 08:42:54 -0000

> Thoughts: please provide commands, full output, etc. that show how
> you're trying to mount the disc, as well as relevant /dev entries
> pertaining to your DVD drive. dmesg might also be helpful. And I
> assume you have looked at mount_udf(8)?

Apologies, it is late. However I used only the obvious. Hopefully obviously, my DVD drive is irrelevant in this case...

mdconfig -f -o readonly
mount_cd9660 -v -o ro /mnt
ls -alR /mnt
[*not* 2.5GiB of files, but...]
cat /mnt/readme.txt
This disc contains a "UDF" file system and requires an operating system that supports the ISO-13346 "UDF" file system specification.
umount -v /mnt
mount_udf -v -o ro /mnt
mount_udf: /dev/md[n]: Invalid argument
[md dev is not mounted on /mnt]

I think it's related to the UDF version of the image, as anyone can verify using the images I mentioned, which can be found on the internet. Perhaps it begs for a NetBSD port? I tested with: RELENG_8 i386.

BTW, mdconfig is also broken in that it should take arguments regardless of position, but it does not. IE: try transposing -d and -u, or -o = failure to execute.
From owner-freebsd-fs@FreeBSD.ORG Fri May 20 08:47:18 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C3B081065670; Fri, 20 May 2011 08:47:18 +0000 (UTC) (envelope-from grarpamp@gmail.com) Received: from mail-px0-f176.google.com (mail-px0-f176.google.com [209.85.212.176]) by mx1.freebsd.org (Postfix) with ESMTP id 94B118FC0A; Fri, 20 May 2011 08:47:18 +0000 (UTC) Received: by pxi11 with SMTP id 11so2855633pxi.7 for ; Fri, 20 May 2011 01:47:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=blirRTQCZDD6c1UGFgvOxQW4gvPoZVsdDCV4Gsm7AgU=; b=tWUFD3NBDGpWpf0hCcX75uN2DFFBOP5kv1rc1xIPhDX3sLcUM4JNQq8Ssghkm+zYcj 5oR8qH9GVrjTlBu3k5YaCmOPdBa3o1fHxIxJosEyql3gIiKXLa9KVrkEntum8y3AHSEe LvAkHP1hsRzv7W3pNAvzxJcJYy7W69Xo+7j1E= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=oXER/+4mOQC/cqQe85TJ7XrgW2hH9k44xDAi9elKhF8EXRKqZRyO/b+l+GSR1MIKxz gmB5kLQtdO1bQyjjjwqlTpWm4lbefEE0hc9prT+cDVLNiE3rJCaHzBAw6hqqBU8zn2Fm 1f0yxWrG/0z31OL9fxc0xKYw4Ub1ND649qxM4= MIME-Version: 1.0 Received: by 10.142.249.34 with SMTP id w34mr2435874wfh.301.1305881237993; Fri, 20 May 2011 01:47:17 -0700 (PDT) Received: by 10.142.157.2 with HTTP; Fri, 20 May 2011 01:47:17 -0700 (PDT) In-Reply-To: References: Date: Fri, 20 May 2011 04:47:17 -0400 Message-ID: From: grarpamp To: bf1783@gmail.com Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs@freebsd.org, freebsd-questions@freebsd.org Subject: Re: UDF and DVD's X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2011 08:47:18 -0000 > 
Obviously, the base system UDF support is minimal and needs some work. > But you may find that ports like sysutils/cdrtools[-devel] or > sysutils/udfclient will allow you to do much of what you want to do. Hmm. perhaps I may be able to create and burn [both modes occurring in userland] with cdrtools. But certainly not to read or write in kernel mode yet AFAICT. I'll investigate udfclient, that is new to me as a userland tool. I was hoping for kernel level compatibility. As are, I suspect, we all :) From owner-freebsd-fs@FreeBSD.ORG Fri May 20 13:09:38 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 26C1C106566C; Fri, 20 May 2011 13:09:38 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 3AE9A8FC20; Fri, 20 May 2011 13:09:36 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id QAA00712; Fri, 20 May 2011 16:09:34 +0300 (EEST) (envelope-from avg@FreeBSD.org) Message-ID: <4DD6680E.9040006@FreeBSD.org> Date: Fri, 20 May 2011 16:09:34 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.17) Gecko/20110504 Lightning/1.0b2 Thunderbird/3.1.10 MIME-Version: 1.0 To: lev@FreeBSD.org References: <1606289061.20110519211755@serebryakov.spb.ru> <201105200316.p4K3G6EU039569@chez.mckusick.com> <795474996.20110520122933@serebryakov.spb.ru> In-Reply-To: <795474996.20110520122933@serebryakov.spb.ru> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=windows-1251 Content-Transfer-Encoding: 8bit Cc: Kirk McKusick , freebsd-fs@FreeBSD.org Subject: Re: Snapshots fail on large FFS2 volumes regulary -- how to backup /usr/home?! 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2011 13:09:38 -0000

on 20/05/2011 11:29 Lev Serebryakov said the following:
> Hello, Kirk.
> You wrote 20 May 2011, 7:16:06:
>
>> Given the size of your storage, you should consider using ZFS,
>> which is better able to handle such large filesystems.
> Yes, I know that everybody loves ZFS now, but it doesn't have two
> characteristics which are important for my installation:
>
> (1) The nodump flag, or any other way to mark directories and files as
> not important for backup. "zfs send" is an all-or-nothing solution, and
> right now my users use "nodump" to reduce backup sizes greatly.

Two options:
a) you don't have to zfs send all filesystems, just the ones that you really need; and you can easily create many filesystems with ZFS; you can tag filesystems that you do not want to back up with user properties.
b) you can use something else for backups

Besides, zfs send / receive works best for replicating data. Storing the results of zfs send for later restoration is not a good idea, IMO.

> (2) Incremental backups with little local state (zfs send
> can send the difference between snapshots, but the system needs to keep
> the old snapshot around for this).
> The second one is not so important yet, because there is a lot of free space,
> but "zfs send" can not do anything about (1) :(
>
> All other backup solutions don't store full FS information, as
> they work on the file level, not the FS level :(

This sounds more like a theoretical than a practical objection. If you don't lose any information that you actually need, then a solution works. Take a look at e.g. archivers/star.

>> My second suggestion is that you try building UFS2 with 32K
>> blocks and 4K fragments. That will reduce the number of resources
>> needed to take the snapshot.
> I'll try this.
But I remember, that some time ago (about 7.1-STABLE) > there was deadlock in kernel memory allocator when different UFSes > on system uses different block sizes... > -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Fri May 20 16:45:55 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F27A61065670; Fri, 20 May 2011 16:45:55 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id 903A88FC1B; Fri, 20 May 2011 16:45:55 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:c0e1:7989:b1b9:78c3]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id 8A0264AC1C; Fri, 20 May 2011 20:45:53 +0400 (MSD) Date: Fri, 20 May 2011 20:45:49 +0400 From: Lev Serebryakov Organization: FreeBSD Project X-Priority: 3 (Normal) Message-ID: <1408884696.20110520204549@serebryakov.spb.ru> To: Andriy Gapon In-Reply-To: <4DD6680E.9040006@FreeBSD.org> References: <1606289061.20110519211755@serebryakov.spb.ru> <201105200316.p4K3G6EU039569@chez.mckusick.com> <795474996.20110520122933@serebryakov.spb.ru> <4DD6680E.9040006@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=windows-1251 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@FreeBSD.org Subject: Re: Snapshots fail on large FFS2 volumes regulary -- how to backup /usr/home?! X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2011 16:45:56 -0000 Hello, Andriy. 
You wrote 20 May 2011, 17:09:34:

>>> Given the size of your storage, you should consider using ZFS,
>>> which is better able to handle such large filesystems.
>> Yes, I know that everybody loves ZFS now, but it doesn't have two
>> characteristics which are important for my installation:
>>
>> (1) The nodump flag, or any other way to mark directories and files as
>> not important for backup. "zfs send" is an all-or-nothing solution, and
>> right now my users use "nodump" to reduce backup sizes greatly.
> Two options:
> a) you don't have to zfs send all filesystems, just the ones that you really need;
> and you can easily create many filesystems with ZFS; you can tag filesystems that
> you do not want to back up with user properties.

Yes, _I_ can create many FSes. My users cannot. If a user wants to mark some part of his site as non-important (for example, because it is the cache of an image gallery which stores thumbnails, which can take a lot of space but is re-creatable on demand), I would need to create yet another FS at his request. That is not an option.

> Take a look at e.g. archivers/star.

I'll take a look. If it can skip some directories marked with a special file (like gtar can), it could be a solution.
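For what it's worth, the marker-file scheme Lev is after exists in GNU tar as the exclude-tag options (FreeBSD's GNU tar port installs the binary as gtar). A small sketch; the marker name .nobackup is an invented example:

```shell
# --exclude-tag-all=FILE makes GNU tar skip, wholesale, any directory
# that contains FILE -- so users can opt directories out themselves.
work=$(mktemp -d) && cd "$work"
mkdir -p home/site/cache home/site/data
touch home/site/cache/.nobackup       # user-created "don't back me up" marker
echo important > home/site/data/file
tar -cf backup.tar --exclude-tag-all=.nobackup home
tar -tf backup.tar                    # the cache directory is absent
```

With --exclude-tag (instead of --exclude-tag-all), the directory and the marker file itself are kept while the rest of the contents are skipped.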
--
// Black Lion AKA Lev Serebryakov

From owner-freebsd-fs@FreeBSD.ORG Fri May 20 16:54:15 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3DE74106566B; Fri, 20 May 2011 16:54:15 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 3B34D8FC15; Fri, 20 May 2011 16:54:13 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id TAA03196; Fri, 20 May 2011 19:54:11 +0300 (EEST) (envelope-from avg@FreeBSD.org) Message-ID: <4DD69CB3.2050601@FreeBSD.org> Date: Fri, 20 May 2011 19:54:11 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.17) Gecko/20110504 Lightning/1.0b2 Thunderbird/3.1.10 MIME-Version: 1.0 To: lev@FreeBSD.org References: <1606289061.20110519211755@serebryakov.spb.ru> <201105200316.p4K3G6EU039569@chez.mckusick.com> <795474996.20110520122933@serebryakov.spb.ru> <4DD6680E.9040006@FreeBSD.org> <1408884696.20110520204549@serebryakov.spb.ru> In-Reply-To: <1408884696.20110520204549@serebryakov.spb.ru> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=windows-1251 Content-Transfer-Encoding: 8bit Cc: freebsd-fs@FreeBSD.org Subject: Re: Snapshots fail on large FFS2 volumes regulary -- how to backup /usr/home?! X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2011 16:54:15 -0000

on 20/05/2011 19:45 Lev Serebryakov said the following:
> Hello, Andriy.
> You wrote 20 May 2011, 17:09:34:
>> Take a look at e.g. archivers/star.
> I'll take a look.
If it could skip some directories, marked with > special file (like gtar could), it could be a solution. I think that it understands FreeBSD flags and supports nodump flag. -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Fri May 20 18:19:18 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2B4FA106566C; Fri, 20 May 2011 18:19:18 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id BBC2B8FC15; Fri, 20 May 2011 18:19:17 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:c0e1:7989:b1b9:78c3]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id E31474AC1C; Fri, 20 May 2011 22:19:15 +0400 (MSD) Date: Fri, 20 May 2011 22:19:11 +0400 From: Lev Serebryakov Organization: FreeBSD Project X-Priority: 3 (Normal) Message-ID: <1491112642.20110520221911@serebryakov.spb.ru> To: Andriy Gapon In-Reply-To: <4DD69CB3.2050601@FreeBSD.org> References: <1606289061.20110519211755@serebryakov.spb.ru> <201105200316.p4K3G6EU039569@chez.mckusick.com> <795474996.20110520122933@serebryakov.spb.ru> <4DD6680E.9040006@FreeBSD.org> <1408884696.20110520204549@serebryakov.spb.ru> <4DD69CB3.2050601@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=windows-1251 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@FreeBSD.org Subject: Re: Snapshots fail on large FFS2 volumes regulary -- how to backup /usr/home?! X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2011 18:19:18 -0000 Hello, Andriy. 
You wrote 20 May 2011, 20:54:11:

>> You wrote 20 May 2011, 17:09:34:
>>> Take a look at e.g. archivers/star.
>> I'll take a look. If it can skip some directories marked with a
>> special file (like gtar can), it could be a solution.
> I think that it understands FreeBSD flags and supports the nodump flag.

I don't need star for FSes with a "nodump" flag. Besides, running star on FFS2 without a snapshot is not a very good solution in any case, IMHO.

So, I need some tar variant or other solution with a non-FS-specific "nodump" indication for ZFS, OR working snapshots on FFS.

--
// Black Lion AKA Lev Serebryakov

From owner-freebsd-fs@FreeBSD.ORG Fri May 20 18:52:03 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C1974106566C; Fri, 20 May 2011 18:52:03 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id D26378FC1C; Fri, 20 May 2011 18:52:02 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id VAA04050; Fri, 20 May 2011 21:52:00 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1QNUno-000HkV-HN; Fri, 20 May 2011 21:52:00 +0300 Message-ID: <4DD6B84F.20706@FreeBSD.org> Date: Fri, 20 May 2011 21:51:59 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.17) Gecko/20110503 Lightning/1.0b2 Thunderbird/3.1.10 MIME-Version: 1.0 To: lev@FreeBSD.org References: <1606289061.20110519211755@serebryakov.spb.ru> <201105200316.p4K3G6EU039569@chez.mckusick.com> <795474996.20110520122933@serebryakov.spb.ru> <4DD6680E.9040006@FreeBSD.org> <1408884696.20110520204549@serebryakov.spb.ru> <4DD69CB3.2050601@FreeBSD.org>
<1491112642.20110520221911@serebryakov.spb.ru> In-Reply-To: <1491112642.20110520221911@serebryakov.spb.ru> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=windows-1251 Content-Transfer-Encoding: 8bit Cc: freebsd-fs@FreeBSD.org Subject: Re: Snapshots fail on large FFS2 volumes regulary -- how to backup /usr/home?! X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2011 18:52:03 -0000

on 20/05/2011 21:19 Lev Serebryakov said the following:
> Hello, Andriy.
> You wrote 20 May 2011, 20:54:11:
>
>>> You wrote 20 May 2011, 17:09:34:
>>>> Take a look at e.g. archivers/star.
>>> I'll take a look. If it can skip some directories marked with a
>>> special file (like gtar can), it could be a solution.
>> I think that it understands FreeBSD flags and supports the nodump flag.
> I don't need star for FSes with a "nodump" flag. Running star on FFS2
> without a snapshot is not a very good solution in any case, IMHO.
>
> So, I need some tar variant or other solution with a non-FS-specific "nodump"
> indication for ZFS, OR working snapshots on FFS.

$ chflags nodump work
$ ls -ldo work
drwxr-xr-x  2 avg  staff  nodump 4 18 May  2008 work
$ df -T .
Filesystem            Type   1K-blocks    Used     Avail Capacity  Mounted on
pond/usr/home/avg/tmp zfs    114487866 7629783 106858083     7%    /usr/home/avg/tmp

Does this look good enough for you?
-- Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG Fri May 20 18:56:01 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D15AB1065673; Fri, 20 May 2011 18:56:01 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id 93D9A8FC13; Fri, 20 May 2011 18:56:01 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:c0e1:7989:b1b9:78c3]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id D6A064AC1C; Fri, 20 May 2011 22:55:59 +0400 (MSD) Date: Fri, 20 May 2011 22:55:55 +0400 From: Lev Serebryakov Organization: FreeBSD Project X-Priority: 3 (Normal) Message-ID: <1248410630.20110520225555@serebryakov.spb.ru> To: Andriy Gapon In-Reply-To: <4DD6B84F.20706@FreeBSD.org> References: <1606289061.20110519211755@serebryakov.spb.ru> <201105200316.p4K3G6EU039569@chez.mckusick.com> <795474996.20110520122933@serebryakov.spb.ru> <4DD6680E.9040006@FreeBSD.org> <1408884696.20110520204549@serebryakov.spb.ru> <4DD69CB3.2050601@FreeBSD.org> <1491112642.20110520221911@serebryakov.spb.ru> <4DD6B84F.20706@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=windows-1251 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@FreeBSD.org Subject: Re: Snapshots fail on large FFS2 volumes regulary -- how to backup /usr/home?! X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2011 18:56:01 -0000

Hello, Andriy.
You wrote 20 May 2011, 22:51:59:
> $ chflags nodump work
> $ ls -ldo work
> drwxr-xr-x 2 avg staff nodump 4 18 May 2008 work
> $ df -T .
> Filesystem            Type 1K-blocks    Used     Avail Capacity  Mounted on
> pond/usr/home/avg/tmp zfs  114487866 7629783 106858083     7%   /usr/home/avg/tmp
> Does this look good enough for you?
Oops, I missed when flags were added to ZFS. Sorry :)

-- 
// Black Lion AKA Lev Serebryakov

From owner-freebsd-fs@FreeBSD.ORG Sat May 21 06:39:59 2011
Date: Sat, 21 May 2011 02:39:59 -0400
From: grarpamp <grarpamp@gmail.com>
To: freebsd-fs@freebsd.org
Subject: Write reallocator
I've got a disk that I'd like to exercise in order to see whether it
will reallocate marginal sectors when they are written to. Normally I'd
just zero the thing, destroy it, and toss it, but I feel like playing
more. Because the data is still semi-valuable, I want to read and write
back every block of the disk. Are there any tools that will do this,
besides dd and shell math?

Also, as with SCSI drives and camcontrol, is there a decent ATA mode
page editor out there? Even one for Windows. Maybe this is more of a
question for the hardware list?

From owner-freebsd-fs@FreeBSD.ORG Sat May 21 07:32:10 2011
From: "Poul-Henning Kamp" <phk@critter.freebsd.dk>
To: grarpamp
In-Reply-To: Your message of "Sat, 21 May 2011 02:39:59 -0400."
Date: Sat, 21 May 2011 07:13:35 +0000
Message-ID: <6806.1305962015@critter.freebsd.dk>
Subject: Re: Write reallocator

In message , grarpamp writes:

>I've got a disk that I'd like to exercise in order to
>see whether it will reallocate marginal sectors when
>they are written to. Normally I'd just zero the thing,
>destroy it, and toss it, but I feel like playing more.
>Because the data is still semi-valuable, I want to
>read and write back every block of the disk.
>Are there any tools that will do this?

recoverdisk(1)?

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
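The "dd and shell math" approach grarpamp mentions can be sketched as a plain sh loop: read each block into a temp file, then write the same bytes back to the same offset, giving the drive a chance to reallocate weak sectors on write. The scratch-file name and block size below are placeholders, not anything from the thread; on a real disk you would point DISK at the device node instead. Note that recoverdisk(1) does the same job with retry logic for unreadable blocks, which this naive loop lacks.

```shell
#!/bin/sh
# Read every block of DISK and write it back in place.
# DISK and BS are assumptions: here DISK is a scratch file standing in
# for the real device so the sketch can run anywhere.
DISK=./scratch.img
BS=65536

# Make a small scratch image to run against (stands in for the disk).
dd if=/dev/urandom of="$DISK" bs="$BS" count=4 2>/dev/null

SIZE=$(wc -c < "$DISK")
BLOCKS=$(( (SIZE + BS - 1) / BS ))    # round up to cover a short tail
BEFORE=$(cksum < "$DISK")

i=0
while [ "$i" -lt "$BLOCKS" ]; do
    # Read block i into a temp file, then write those bytes back to the
    # same offset.  conv=notrunc keeps dd from truncating the target.
    dd if="$DISK" of=block.tmp bs="$BS" skip="$i" count=1 2>/dev/null
    dd if=block.tmp of="$DISK" bs="$BS" seek="$i" conv=notrunc 2>/dev/null
    i=$((i + 1))
done
rm -f block.tmp

AFTER=$(cksum < "$DISK")
if [ "$BEFORE" = "$AFTER" ]; then
    echo "rewrote $BLOCKS blocks; data intact"
else
    echo "rewrote $BLOCKS blocks; DATA CHANGED"
fi
```

Against a real device the loop would need SIZE taken from diskinfo or gpart rather than wc, and any block that fails to read would simply be skipped rather than retried, which is exactly the gap recoverdisk fills.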