From owner-freebsd-fs@FreeBSD.ORG Mon May 16 11:07:03 2011
Date: Mon, 16 May 2011 11:07:02 GMT
Message-Id: <201105161107.p4GB7253071197@freefall.freebsd.org>
From: FreeBSD bugmaster
To: freebsd-fs@FreeBSD.org
Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker     Resp. Description
--------------------------------------------------------------------------------
o kern/156933 fs [zfs] ZFS receive after read on readonly=on filesystem
o kern/156797 fs [zfs] [panic] Double panic with FreeBSD 9-CURRENT and
o kern/156781 fs [zfs] zfs is losing the snapshot directory,
p kern/156545 fs [ufs] mv could break UFS on SMP systems
o kern/156193 fs [ufs] [hang] UFS snapshot hangs && deadlocks processes
o kern/156168 fs [nfs] [panic] Kernel panic under concurrent access ove
o kern/156039 fs [nullfs] [unionfs] nullfs + unionfs do not compose, re
o kern/155615 fs [zfs] zfs v28 broken on sparc64 -current
o kern/155587 fs [zfs] [panic] kernel panic with zfs
o kern/155484 fs [ufs] GPT + UFS boot don't work well together
o kern/155411 fs [regression] [8.2-release] [tmpfs]: mount: tmpfs : No
o kern/155199 fs [ext2fs] ext3fs mounted as ext2fs gives I/O errors
o bin/155104 fs [zfs][patch] use /dev prefix by default when importing
o kern/154930 fs [zfs] cannot delete/unlink file from full volume -> EN
o kern/154828 fs [msdosfs] Unable to create directories on external USB
o kern/154491 fs [smbfs] smb_co_lock: recursive lock for object 1
o kern/154447 fs [zfs] [panic] Occasional panics - solaris assert somew
p kern/154228 fs [md] md getting stuck in wdrain state
o kern/153996 fs [zfs] zfs root mount error while kernel is not located
o kern/153847 fs [nfs] [panic] Kernel panic from incorrect m_free in nf
o kern/153753 fs [zfs] ZFS v15 - grammatical error when attempting to u
o kern/153716 fs [zfs] zpool scrub time remaining is incorrect
o kern/153695 fs [patch] [zfs] Booting from zpool created on 4k-sector
o kern/153680 fs [xfs] 8.1 failing to mount XFS partitions
o kern/153520 fs [zfs] Boot from GPT ZFS root on HP BL460c G1 unstable
o kern/153418 fs [zfs] [panic] Kernel Panic occurred writing to zfs vol
o kern/153351 fs [zfs] locking directories/files in ZFS
o bin/153258 fs [patch][zfs] creating ZVOLs requires `refreservation'
s kern/153173 fs [zfs] booting from a gzip-compressed dataset doesn't w
o kern/153126 fs [zfs] vdev failure, zpool=peegel type=vdev.too_small
p kern/152488 fs [tmpfs] [patch] mtime of file updated when only inode
o kern/152022 fs [nfs] nfs service hangs with linux client [regression]
o kern/151942 fs [zfs] panic during ls(1) zfs snapshot directory
o kern/151905 fs [zfs] page fault under load in /sbin/zfs
o kern/151845 fs [smbfs] [patch] smbfs should be upgraded to support Un
o bin/151713 fs [patch] Bug in growfs(8) with respect to 32-bit overfl
o kern/151648 fs [zfs] disk wait bug
o kern/151629 fs [fs] [patch] Skip empty directory entries during name
o kern/151330 fs [zfs] will unshare all zfs filesystem after execute a
o kern/151326 fs [nfs] nfs exports fail if netgroups contain duplicate
o kern/151251 fs [ufs] Can not create files on filesystem with heavy us
o kern/151226 fs [zfs] can't delete zfs snapshot
o kern/151111 fs [zfs] vnodes leakage during zfs unmount
o kern/150503 fs [zfs] ZFS disks are UNAVAIL and corrupted after reboot
o kern/150501 fs [zfs] ZFS vdev failure vdev.bad_label on amd64
o kern/150390 fs [zfs] zfs deadlock when arcmsr reports drive faulted
o kern/150336 fs [nfs] mountd/nfsd became confused; refused to reload n
o kern/150207 fs zpool(1): zpool import -d /dev tries to open weird dev
o kern/149208 fs mksnap_ffs(8) hang/deadlock
o kern/149173 fs [patch] [zfs] make OpenSolaris installa
o kern/149015 fs [zfs] [patch] misc fixes for ZFS code to build on Glib
o kern/149014 fs [zfs] [patch] declarations in ZFS libraries/utilities
o kern/149013 fs [zfs] [patch] make ZFS makefiles use the libraries fro
o kern/148504 fs [zfs] ZFS' zpool does not allow replacing drives to be
o kern/148490 fs [zfs]: zpool attach - resilver bidirectionally, and re
o kern/148368 fs [zfs] ZFS hanging forever on 8.1-PRERELEASE
o bin/148296 fs [zfs] [loader] [patch] Very slow probe in /usr/src/sys
o kern/148204 fs [nfs] UDP NFS causes overload
o kern/148138 fs [zfs] zfs raidz pool commands freeze
o kern/147903 fs [zfs] [panic] Kernel panics on faulty zfs device
o kern/147881 fs [zfs] [patch] ZFS "sharenfs" doesn't allow different "
o kern/147790 fs [zfs] zfs set acl(mode|inherit) fails on existing zfs
o kern/147560 fs [zfs] [boot] Booting 8.1-PRERELEASE raidz system take
o kern/147420 fs [ufs] [panic] ufs_dirbad, nullfs, jail panic (corrupt
o kern/146941 fs [zfs] [panic] Kernel Double Fault - Happens constantly
o kern/146786 fs [zfs] zpool import hangs with checksum errors
o kern/146708 fs [ufs] [panic] Kernel panic in softdep_disk_write_compl
o kern/146528 fs [zfs] Severe memory leak in ZFS on i386
o kern/146502 fs [nfs] FreeBSD 8 NFS Client Connection to Server
s kern/145712 fs [zfs] cannot offline two drives in a raidz2 configurat
o kern/145411 fs [xfs] [panic] Kernel panics shortly after mounting an
o bin/145309 fs bsdlabel: Editing disk label invalidates the whole dev
o kern/145272 fs [zfs] [panic] Panic during boot when accessing zfs on
o kern/145246 fs [ufs] dirhash in 7.3 gratuitously frees hashes when it
o kern/145238 fs [zfs] [panic] kernel panic on zpool clear tank
o kern/145229 fs [zfs] Vast differences in ZFS ARC behavior between 8.0
o kern/145189 fs [nfs] nfsd performs abysmally under load
o kern/144929 fs [ufs] [lor] vfs_bio.c + ufs_dirhash.c
p kern/144447 fs [zfs] sharenfs fsunshare() & fsshare_main() non functi
o kern/144416 fs [panic] Kernel panic on online filesystem optimization
s kern/144415 fs [zfs] [panic] kernel panics on boot after zfs crash
o kern/144234 fs [zfs] Cannot boot machine with recent gptzfsboot code
o kern/143825 fs [nfs] [panic] Kernel panic on NFS client
o bin/143572 fs [zfs] zpool(1): [patch] The verbose output from iostat
o kern/143212 fs [nfs] NFSv4 client strange work ...
o kern/143184 fs [zfs] [lor] zfs/bufwait LOR
o kern/142914 fs [zfs] ZFS performance degradation over time
o kern/142878 fs [zfs] [vfs] lock order reversal
o kern/142597 fs [ext2fs] ext2fs does not work on filesystems with real
o kern/142489 fs [zfs] [lor] allproc/zfs LOR
o kern/142466 fs Update 7.2 -> 8.0 on Raid 1 ends with screwed raid [re
o kern/142306 fs [zfs] [panic] ZFS drive (from OSX Leopard) causes two
o kern/142068 fs [ufs] BSD labels are got deleted spontaneously
o kern/141897 fs [msdosfs] [panic] Kernel panic. msdofs: file name leng
o kern/141463 fs [nfs] [panic] Frequent kernel panics after upgrade fro
o kern/141305 fs [zfs] FreeBSD ZFS+sendfile severe performance issues (
o kern/141091 fs [patch] [nullfs] fix panics with DIAGNOSTIC enabled
o kern/141086 fs [nfs] [panic] panic("nfs: bioread, not dir") on FreeBS
o kern/141010 fs [zfs] "zfs scrub" fails when backed by files in UFS2
o kern/140888 fs [zfs] boot fail from zfs root while the pool resilveri
o kern/140661 fs [zfs] [patch] /boot/loader fails to work on a GPT/ZFS-
o kern/140640 fs [zfs] snapshot crash
o kern/140134 fs [msdosfs] write and fsck destroy filesystem integrity
o kern/140068 fs [smbfs] [patch] smbfs does not allow semicolon in file
o kern/139725 fs [zfs] zdb(1) dumps core on i386 when examining zpool c
o kern/139715 fs [zfs] vfs.numvnodes leak on busy zfs
p bin/139651 fs [nfs] mount(8): read-only remount of NFS volume does n
o kern/139597 fs [patch] [tmpfs] tmpfs initializes va_gen but doesn't u
o kern/139564 fs [zfs] [panic] 8.0-RC1 - Fatal trap 12 at end of shutdo
o kern/139407 fs [smbfs] [panic] smb mount causes system crash if remot
o kern/138662 fs [panic] ffs_blkfree: freeing free block
o kern/138421 fs [ufs] [patch] remove UFS label limitations
o kern/138202 fs mount_msdosfs(1) see only 2Gb
o kern/136968 fs [ufs] [lor] ufs/bufwait/ufs (open)
o kern/136945 fs [ufs] [lor] filedesc structure/ufs (poll)
o kern/136944 fs [ffs] [lor] bufwait/snaplk (fsync)
o kern/136873 fs [ntfs] Missing directories/files on NTFS volume
o kern/136865 fs [nfs] [patch] NFS exports atomic and on-the-fly atomic
p kern/136470 fs [nfs] Cannot mount / in read-only, over NFS
o kern/135546 fs [zfs] zfs.ko module doesn't ignore zpool.cache filenam
o kern/135469 fs [ufs] [panic] kernel crash on md operation in ufs_dirb
o kern/135050 fs [zfs] ZFS clears/hides disk errors on reboot
o kern/134491 fs [zfs] Hot spares are rather cold...
o kern/133676 fs [smbfs] [panic] umount -f'ing a vnode-based memory dis
o kern/133174 fs [msdosfs] [patch] msdosfs must support utf-encoded int
o kern/132960 fs [ufs] [panic] panic:ffs_blkfree: freeing free frag
o kern/132397 fs reboot causes filesystem corruption (failure to sync b
o kern/132331 fs [ufs] [lor] LOR ufs and syncer
o kern/132237 fs [msdosfs] msdosfs has problems to read MSDOS Floppy
o kern/132145 fs [panic] File System Hard Crashes
o kern/131441 fs [unionfs] [nullfs] unionfs and/or nullfs not combineab
o kern/131360 fs [nfs] poor scaling behavior of the NFS server under lo
o kern/131342 fs [nfs] mounting/unmounting of disks causes NFS to fail
o bin/131341 fs makefs: error "Bad file descriptor" on the mount poin
o kern/130920 fs [msdosfs] cp(1) takes 100% CPU time while copying file
o kern/130210 fs [nullfs] Error by check nullfs
o kern/129760 fs [nfs] after 'umount -f' of a stale NFS share FreeBSD l
o kern/129488 fs [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c:
o kern/129231 fs [ufs] [patch] New UFS mount (norandom) option - mostly
o kern/129152 fs [panic] non-userfriendly panic when trying to mount(8)
o kern/127787 fs [lor] [ufs] Three LORs: vfslock/devfs/vfslock, ufs/vfs
o bin/127270 fs fsck_msdosfs(8) may crash if BytesPerSec is zero
o kern/127029 fs [panic] mount(8): trying to mount a write protected zi
o kern/126287 fs [ufs] [panic] Kernel panics while mounting an UFS file
o kern/125895 fs [ffs] [panic] kernel: panic: ffs_blkfree: freeing free
s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS
o kern/123939 fs [msdosfs] corrupts new files
o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash
o bin/122172 fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386,
o bin/121898 fs [nullfs] pwd(1)/getcwd(2) fails with Permission denied
o bin/121366 fs [zfs] [patch] Automatic disk scrubbing from periodic(8
o bin/121072 fs [smbfs] mount_smbfs(8) cannot normally convert the cha
o kern/120483 fs [ntfs] [patch] NTFS filesystem locking changes
o kern/120482 fs [ntfs] [patch] Sync style changes between NetBSD and F
o kern/118912 fs [2tb] disk sizing/geometry problem with large array
o kern/118713 fs [minidump] [patch] Display media size required for a k
o bin/118249 fs [ufs] mv(1): moving a directory changes its mtime
o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N
o kern/117954 fs [ufs] dirhash on very large directories blocks the mac
o bin/117315 fs [smbfs] mount_smbfs(8) and related options can't mount
o kern/117314 fs [ntfs] Long-filename only NTFS fs'es cause kernel pani
o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on
o bin/116980 fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f
o conf/116931 fs lack of fsck_cd9660 prevents mounting iso images with
o kern/116583 fs [ffs] [hang] System freezes for short time when using
f kern/116170 fs [panic] Kernel panic when mounting /tmp
o bin/115361 fs [zfs] mount(8) gets into a state where it won't set/un
o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui
o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala
o kern/114676 fs [ufs] snapshot creation panics: snapacct_ufs2: bad blo
o bin/114468 fs [patch] [request] add -d option to umount(8) to detach
o kern/113852 fs [smbfs] smbfs does not properly implement DFS referral
o bin/113838 fs [patch] [request] mount(8): add support for relative p
o bin/113049 fs [patch] [request] make quot(8) use getopt(3) and show
o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b
o kern/111843 fs [msdosfs] Long Names of files are incorrectly created
o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems
s bin/111146 fs [2tb] fsck(8) fails on 6T filesystem
o kern/109024 fs [msdosfs] [iconv] mount_msdosfs: msdosfs_iconv: Operat
o kern/109010 fs [msdosfs] can't mv directory within fat32 file system
o bin/107829 fs [2TB] fdisk(8): invalid boundary checking in fdisk / w
o kern/106107 fs [ufs] left-over fsck_snapshot after unfinished backgro
f kern/106030 fs [ufs] [panic] panic in ufs from geom when a dead disk
o kern/104406 fs [ufs] Processes get stuck in "ufs" state under persist
o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems
o kern/103035 fs [ntfs] Directories in NTFS mounted disc images appear
o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s
o kern/99290 fs [ntfs] mount_ntfs ignorant of cluster sizes
s bin/97498 fs [request] newfs(8) has no option to clear the first 12
o kern/97377 fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c
o kern/95222 fs [cd9660] File sections on ISO9660 level 3 CDs ignored
o kern/94849 fs [ufs] rename on UFS filesystem is not atomic
o bin/94810 fs fsck(8) incorrectly reports 'file system marked clean'
o kern/94769 fs [ufs] Multiple file deletions on multi-snapshotted fil
o kern/94733 fs [smbfs] smbfs may cause double unlock
o kern/93942 fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D
o kern/92272 fs [ffs] [hang] Filling a filesystem while creating a sna
o kern/91134 fs [smbfs] [patch] Preserve access and modification time
a kern/90815 fs [smbfs] [patch] SMBFS with character conversions somet
o kern/88657 fs [smbfs] windows client hang when browsing a samba shar
o kern/88555 fs [panic] ffs_blkfree: freeing free frag on AMD 64
o kern/88266 fs [smbfs] smbfs does not implement UIO_NOCOPY and sendfi
o bin/87966 fs [patch] newfs(8): introduce -A flag for newfs to enabl
o kern/87859 fs [smbfs] System reboot while umount smbfs.
o kern/86587 fs [msdosfs] rm -r /PATH fails with lots of small files
o bin/85494 fs fsck_ffs: unchecked use of cg_inosused macro etc.
o kern/80088 fs [smbfs] Incorrect file time setting on NTFS mounted vi
o bin/74779 fs Background-fsck checks one filesystem twice and omits
o kern/73484 fs [ntfs] Kernel panic when doing `ls` from the client si
o bin/73019 fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino
o kern/71774 fs [ntfs] NTFS cannot "see" files on a WinXP filesystem
o bin/70600 fs fsck(8) throws files away when it can't grow lost+foun
o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po
o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange
o kern/65901 fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr
o kern/61503 fs [smbfs] mount_smbfs does not work as non-root
o kern/55617 fs [smbfs] Accessing an nsmb-mounted drive via a smb expo
o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc
o kern/51583 fs [nullfs] [patch] allow to work with devices and socket
o kern/36566 fs [smbfs] System reboot with dead smb mount and umount
o kern/33464 fs [ufs] soft update inconsistencies after system crash
o bin/27687 fs fsck(8) wrapper is not properly passing options to fsc
o kern/18874 fs [2TB] 32bit NFS servers export wrong negative values t

223 problems total.
From owner-freebsd-fs@FreeBSD.ORG Mon May 16 14:18:30 2011
Date: Mon, 16 May 2011 17:18:13 +0300
From: "Vladislav V. Prodan" <universite@ukr.net>
To: freebsd-fs@freebsd.org
Message-ID: <4DD13225.6090802@ukr.net>
Subject: The problem with backing up ZFS snapshots

I use a script that backs up snapshots from the working ZFS pool to a
reserve pool.
https://gist.github.com/971271

zroot/$fs -->> tank/backup/zroot/$fs

# zfs list | grep mysql
tank/backup/zroot/mysql 2,21G  843G  612M  /backup/zroot/mysql
zroot/mysql             2,12G  438G 2,07G  /var/db/mysql
zroot/mysql/ibdata      10,3M  438G 10,0M  /var/db/mysql/ibdata
zroot/mysql/iblogs      11,2M  438G 10,0M  /var/db/mysql/iblogs

When I copy the /mysql dataset without the nested zroot/mysql/ibdata and
zroot/mysql/iblogs, those child filesystems fall off.

[23:09]mary-teresa:root->db/mysql# ll | more
total 2134129
drwx------ 2 mysql mysql  12 May  3 00:12 auth
drwx------ 2 mysql mysql 147 May  3 00:12 cacti
drwxr-xr-x 2 root  wheel   2 Apr 20 00:19 ibdata
drwxr-xr-x 2 root  wheel   2 Apr 20 00:19 iblogs

The only thing that helps is manually removing the empty ibdata and iblogs
directories, then unmounting and remounting these filesystems:

zfs umount zroot/mysql/ibdata
zfs umount zroot/mysql/iblogs
zfs mount -a

# FreeBSD 8.2-STABLE #0: Wed Apr 20 03:20:47 EEST 2011 amd64

--
Vladislav V. Prodan
VVP24-UANIC
+380[67]4584408
+380[99]4060508
vlad11@jabber.ru

From owner-freebsd-fs@FreeBSD.ORG Mon May 16 23:58:36 2011
Date: Mon, 16 May 2011 19:58:35 -0400 (EDT)
From: Rick Macklem <rmacklem@uoguelph.ca>
To: FreeBSD FS
Message-ID: <256284561.428250.1305590315172.JavaMail.root@erie.cs.uoguelph.ca>
Subject: RFC: adding a lock flags argument to VFS_FHTOVP() for FreeBSD9

Hi,

Down the road, I would like the NFS server to be able to do a
    VFS_FHTOVP(mp, &fhp->fh_fid, LK_SHARED, vpp);
similar to what is already supported for VFS_VGET(). The reason is that,
currently, when a client does read-aheads, these reads are basically
serialized because VFS_FHTOVP() gets an LK_EXCLUSIVE locked vnode for
each RPC on the server.

Like VFS_VGET(), the underlying file system can still choose to return an
LK_EXCLUSIVE locked vnode even when LK_SHARED is specified. (Some file
systems, such as FFS, just call VFS_VGET() in VFS_FHTOVP(), so for those
the flag is simply passed through to VFS_VGET().)

To minimize the risk of the patch breaking something, I have it setting
LK_EXCLUSIVE for all VFS_FHTOVP() calls so that the semantics don't
actually change. (Changing the NFS server to use LK_SHARED is a trivial
patch, but will need extensive testing, so I'm not planning on that
change for 9.0.)

If you are interested, my current patch is at:
    http://people.freebsd.org/~rmacklem/fhtovp.patch

So, does this sound like a reasonable thing to commit, once the patch is
reviewed?
rick

From owner-freebsd-fs@FreeBSD.ORG Tue May 17 09:20:20 2011
Date: Tue, 17 May 2011 12:20:11 +0300
From: Kostik Belousov <kostikbel@gmail.com>
To: Rick Macklem
Cc: FreeBSD FS
Message-ID: <20110517092011.GK48734@deviant.kiev.zoral.com.ua>
In-Reply-To: <256284561.428250.1305590315172.JavaMail.root@erie.cs.uoguelph.ca>
Subject: Re: RFC: adding a lock flags argument to VFS_FHTOVP() for FreeBSD9

On Mon, May 16, 2011 at 07:58:35PM -0400, Rick Macklem wrote:
> Hi,
>
> Down the road, I would like the NFS server to be able to do a
>     VFS_FHTOVP(mp, &fhp->fh_fid, LK_SHARED, vpp);
> similar to what is already supported for VFS_VGET(). The reason
> is that, currently, when a client does read-aheads, these reads are
> basically serialized because the VFS_FHTOVP() gets an LK_EXCLUSIVE
> locked vnode for each RPC on the server.
>
> Like VFS_VGET(), the underlying file system can still choose to
> return a LK_EXCLUSIVE locked vnode even when LK_SHARED is specified.
> (Some file systems, such as FFS, just call VFS_VGET() in VFS_FHTOVP(),
> so all that happens is that the flag is passed through to VFS_VGET()
> for those ones.)

Yes, the flag specifying the locking mode only states the minimal locking
requirement, and the filesystem is allowed to upgrade it to a stricter
lock type. E.g. UFS would only return a shared lock if the vnode was found
in the hash, AFAIR. If not told otherwise, getnewvnode(9) forces lockmgr
to convert all lock requests into exclusive ones.

> To minimize the risk of the patch breaking something, I have it setting
> LK_EXCLUSIVE for all VFS_FHTOVP() calls so that the semantics don't
> actually change. (Changing the NFS server to use LK_SHARED is a trivial
> patch, but will need extensive testing, so I'm not planning on that
> change for 9.0.)
>
> If you are interested, my current patch is at:
>     http://people.freebsd.org/~rmacklem/fhtovp.patch
>
> So, does this sound like a reasonable thing to commit, once the patch
> is reviewed?

Sure, please do it before the code slush.

From owner-freebsd-fs@FreeBSD.ORG Tue May 17 09:36:45 2011
Date: Tue, 17 May 2011 13:36:43 +0400
From: Sergey Kandaurov <pluknet@gmail.com>
To: Rick Macklem
Cc: freebsd-fs@freebsd.org
Subject: [old nfsclient] different nmount() args passed from mount vs. mount_nfs

Hi. First, sorry for the long mail; I have tried to describe the problem
in full detail.

When mounting NFS with some options, I found that /sbin/mount and
/sbin/mount_nfs pass options to nmount(2) differently, which results in
bad things (TM). I traced the options and here they are:

From mount(8) -> mount_nfs(8):
  "rw" -> ""
  "addr" -> { something valid }
  "fh" -> 5
  "sec" -> "sys"
  "nfsv3" -> 0x0 => NFSMNT_NFSV3
  "hostname" -> "dev2.mail:/home/svn/freebsd/head"
  "fstype" -> "oldnfs"
  "fspath" -> "/usr/src"
  "errmsg" -> "" (nil)

From pre-r221124 mount(8):
= "fstype" -> "oldnfs"
  "hostname" -> "dev2.mail"
= "fspath" -> "/usr/src"
  "from" -> "dev2.mail:/home/svn/freebsd/head"
= "errmsg" -> "" (nil)

Note that pre-r221124 mount(8) knows nothing about oldnfs.

1. The "hostname" option is passed differently by mount(8) and
mount_nfs(8). When I force mount(8) to mount an oldnfs file system
directly (so that the nmount(2) call is not handed off to mount_nfs(8)),
I get this error:

  ./mount -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src
  mount: dev2.mail:/home/svn/freebsd/head Invalid hostname: Invalid argument

Hmm.. this may be because mount(8) passes the value in $hostname:$path
format (see the traces above). It might be due to the different way the
old nfsclient parses args, but I am not sure; I can be wrong. Anyway, it
does not matter now. The actual problem manifests when running the command
with a pre-r221124 mount(8) binary. It knows nothing about "oldnfs" and (attention!)
calls nmount(2) directly instead of bypassing the call to the mount_nfs(8) binary as usually done, and this is the place where the "unsanitized nmount(2) args" problem is hidden. [New mount knows about "oldnfs" and passes the call to mount_oldnfs(8) that prepares all the nmount(2) args to correctly hide the problem.] To prove it, that is how old and new mount(8) work differently: 1) new mount(8) as of current mount -d -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src exec: mount_oldnfs dev2.mail:/home/svn/freebsd/head /usr/src 2) old mount(8) as of pre-r221124 ./mount -d -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src mount -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src Ok, back to the first paragraph: a different "hostname" mount option. When I first faced with this, I tried to specify value for "hostname" explicitly. Here it comes: ./mount -t oldnfs -o hostname=dev2.mail dev2.mail:/home/svn/freebsd/head /usr/src [CABOOM!] It just crashed. Do not do this :) Fatal trap 12: page fault while in kernel mode cpuid = 0; apic id = 00 fault virtual address = 0x1 fault code = supervisor read data, page not present instruction pointer = 0x20:0xffffffff805da299 stack pointer = 0x28:0xffffff807bef6240 frame pointer = 0x28:0xffffff807bef62a0 code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 2541 (mount) db> bt Tracing pid 2541 tid 100076 td 0xfffffe0001ace460 nfs_connect() at 0xffffffff805da299 = nfs_connect+0x79 nfs_request() at 0xffffffff805da978 = nfs_request+0x398 nfs_getattr() at 0xffffffff805e2a6c = nfs_getattr+0x2bc VOP_GETATTR_APV() at 0xffffffff806f4283 = VOP_GETATTR_APV+0xd3 mountnfs() at 0xffffffff805de739 = mountnfs+0x329 nfs_mount() at 0xffffffff805dffc7 = nfs_mount+0xcf7 vfs_donmount() at 0xffffffff804d46ff = vfs_donmount+0x82f nmount() at 0xffffffff804d54f3 = nmount+0x63 syscallenter() at 0xffffffff804861cb = syscallenter+0x1cb syscall() 
at 0xffffffff806ae710 = syscall+0x60 Xfast_syscall() at 0xffffffff8069922d = Xfast_syscall+0xdd --- syscall (378, FreeBSD ELF64, nmount), rip = 0x800ab444c, rsp = 0x7fffffffca48, rbp = 0x801009058 --- As you can see from the nmount(2) args traces above, mount(8) itself doesn't pass the "addr" option to the nmount(2) syscall while nfs_mount() expects to receive it, which is the problem. Later, deep in nmount(2), /sys/nfsclient/nfs_krpc.c tries to dereference the addr value and page faults here in nfs_connect(): vers = NFS_VER3; else if (nmp->nm_flag & NFSMNT_NFSV4) vers = NFS_VER4; XXX saddr is NULL, the next line will crash if (saddr->sa_family == AF_INET) if (nmp->nm_sotype == SOCK_DGRAM) nconf = getnetconfigent("udp"); I think that nfsclient, probably in sys/nfsclient/nfs_vfsops.c:mount_nfs(), should handle a missing value for the "addr" and/or "fh" mount options. It doesn't check this currently: % static int % nfs_mount(struct mount *mp) % { % struct nfs_args args = { % [...] % .addr = NULL, % }; % int error, ret, has_nfs_args_opt; % int has_addr_opt, has_fh_opt, has_hostname_opt; % struct sockaddr *nam; addr is initialized to NULL. nam is used later as a pointer to the args.addr value. % if ((mp->mnt_flag & (MNT_ROOTFS | MNT_UPDATE)) == MNT_ROOTFS) { % error = nfs_mountroot(mp); % goto out; % } We do not try to mount root; this is not ours. % if (vfs_getopt(mp->mnt_optnew, "nfs_args", NULL, NULL) == 0) { [...] % has_nfs_args_opt = 1; % } We do not use the old mount(2) interface; not ours. % if (vfs_getopt(mp->mnt_optnew, "nfsv3", NULL, NULL) == 0) % args.flags |= NFSMNT_NFSV3; mount(8) doesn't pass the nfsv3 option, so NFSMNT_NFSV3 isn't set. 
% if (vfs_getopt(mp->mnt_optnew, "addr", (void **)&args.addr, % &args.addrlen) == 0) { % has_addr_opt = 1; % if (args.addrlen > SOCK_MAXADDRLEN) { % error = ENAMETOOLONG; % goto out; % } % nam = malloc(args.addrlen, M_SONAME, % M_WAITOK); % bcopy(args.addr, nam, args.addrlen); % nam->sa_len = args.addrlen; % } mount(8) doesn't pass the addr option, so args.addr isn't set, hence struct sockaddr *nam is also NULL, and has_addr_opt is 0. % if (vfs_getopt(mp->mnt_optnew, "hostname", (void **)&args.hostname, % NULL) == 0) { % has_hostname_opt = 1; % } % if (args.hostname == NULL) { % vfs_mount_error(mp, "Invalid hostname"); % error = EINVAL; % goto out; % } I don't know why I got the error here. I didn't analyze it deeply, though. "mount: dev2.mail:/home/svn/freebsd/head Invalid hostname: Invalid argument" % if (mp->mnt_flag & MNT_UPDATE) { [...] That's not the update case; it's not ours. % if (has_nfs_args_opt) { has_nfs_args_opt is 0, as we don't use the legacy mount(2) interface, see above. So, the whole block is ignored. Though, see below. % /* % * In the 'nfs_args' case, the pointers in the args % * structure are in userland - we copy them in here. % */ % if (!has_fh_opt) { % error = copyin((caddr_t)args.fh, (caddr_t)nfh, % args.fhsize); % if (error) { % goto out; % } % args.fh = nfh; % } has_fh_opt is 0, as mount(8) didn't pass "fh" to nmount(2), though this part is not executed anyway. % if (!has_hostname_opt) { % error = copyinstr(args.hostname, hst, MNAMELEN-1, &len); % if (error) { % goto out; % } % bzero(&hst[len], MNAMELEN - len); % args.hostname = hst; has_hostname_opt is 1, as mount(8) passes "hostname" to nmount(2), though this part is not executed anyway. 
% } % if (!has_addr_opt) { % /* sockargs() call must be after above copyin() calls */ % printf("args.addr: %p\n", args.addr); % error = getsockaddr(&nam, (caddr_t)args.addr, % args.addrlen); % printf("error: %d\n", error); % if (error) { % goto out; % } % } has_addr_opt is 0, as mount(8) didn't pass "addr" to nmount(2), though this part is not executed anyway. % } % error = mountnfs(&args, mp, nam, args.hostname, &vp, % curthread->td_ucred, negnametimeo); mountnfs() is called with nam == NULL, and then it crashes deep in /sys/nfsclient/nfs_krpc.c:nfs_connect(). Also compare the ddb backtrace with one from the new mount(8), which hands the call off to mount_nfs(8). I got it by adding kdb_enter() just before the NULL pointer dereference. db> bt Tracing pid 2143 tid 100117 td 0xfffffe0001c58000 kdb_enter() at 0xffffffff80477d1b = kdb_enter+0x3b nfs_connect() at 0xffffffff805da7e8 = nfs_connect+0x88 nfs_request() at 0xffffffff805daec8 = nfs_request+0x398 nfs_fsinfo() at 0xffffffff805ddec0 = nfs_fsinfo+0xd0 mountnfs() at 0xffffffff805ded44 = mountnfs+0x3e4 nfs_mount() at 0xffffffff805e051f = nfs_mount+0xcff vfs_donmount() at 0xffffffff804d5092 = vfs_donmount+0xc92 nmount() at 0xffffffff804d5a33 = nmount+0x63 syscallenter() at 0xffffffff804866eb = syscallenter+0x1cb syscall() at 0xffffffff806aec90 = syscall+0x60 Xfast_syscall() at 0xffffffff806997ad = Xfast_syscall+0xdd --- syscall (378, FreeBSD ELF64, nmount), rip = 0x8008a544c, rsp = 0x7fffffffd258, rbp = 0x7fffffffd30c --- The two backtraces differ slightly because NFSMNT_NFSV3 is not set in the old mount(8) case. 
From sys/nfsclient/nfs_vfsops.c:mountnfs() if (argp->flags & NFSMNT_NFSV3) nfs_fsinfo(nmp, *vpp, curthread->td_ucred, curthread); else VOP_GETATTR(*vpp, &attrs, curthread->td_ucred); -- wbr, pluknet From owner-freebsd-fs@FreeBSD.ORG Tue May 17 19:33:54 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 328A11065674 for ; Tue, 17 May 2011 19:33:54 +0000 (UTC) (envelope-from a.smith@ukgrid.net) Received: from mx1.ukgrid.net (mx1.ukgrid.net [89.107.22.36]) by mx1.freebsd.org (Postfix) with ESMTP id F1D628FC21 for ; Tue, 17 May 2011 19:33:53 +0000 (UTC) Received: from [89.21.28.38] (port=39435 helo=omicron.ukgrid.net) by mx1.ukgrid.net with esmtp (Exim 4.74; FreeBSD) envelope-from a.smith@ukgrid.net envelope-to freebsd-fs@freebsd.org id 1QMPe8-000Kbw-5p; Tue, 17 May 2011 20:09:32 +0100 Received: from 81.60.137.91.dyn.user.ono.com (81.60.137.91.dyn.user.ono.com [81.60.137.91]) by webmail2.ukgrid.net (Horde Framework) with HTTP; Tue, 17 May 2011 20:09:32 +0100 Message-ID: <20110517200932.33075laonl99lx4w@webmail2.ukgrid.net> Date: Tue, 17 May 2011 20:09:32 +0100 From: a.smith@ukgrid.net To: freebsd-fs@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: 7bit User-Agent: Internet Messaging Program (IMP) H3 (4.3.9) / FreeBSD-8.1 Subject: zfs get all command hung X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 17 May 2011 19:33:54 -0000 Hi, I have a script that runs every hour, one of the commands it runs is "zfs get all mypool". The process has hung and cannot be killed. Is there anything I can do to work out what happened? This has happened before, but on older OS releases. 
The system is FreeBSD 8.2-RELEASE amd64. A truss of the process just shows nothing, thanks Andy. From owner-freebsd-fs@FreeBSD.ORG Tue May 17 19:35:27 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4F011106566B for ; Tue, 17 May 2011 19:35:27 +0000 (UTC) (envelope-from a.smith@ukgrid.net) Received: from mx0.ukgrid.net (mx0.ukgrid.net [89.21.28.41]) by mx1.freebsd.org (Postfix) with ESMTP id 13AC28FC18 for ; Tue, 17 May 2011 19:35:26 +0000 (UTC) Received: from [89.21.28.38] (port=11959 helo=omicron.ukgrid.net) by mx0.ukgrid.net with esmtp (Exim 4.74; FreeBSD) envelope-from a.smith@ukgrid.net envelope-to freebsd-fs@freebsd.org id 1QMPga-000C4y-Cz; Tue, 17 May 2011 20:12:04 +0100 Received: from 81.60.137.91.dyn.user.ono.com (81.60.137.91.dyn.user.ono.com [81.60.137.91]) by webmail2.ukgrid.net (Horde Framework) with HTTP; Tue, 17 May 2011 20:12:03 +0100 Message-ID: <20110517201203.1813683kuqivzwws@webmail2.ukgrid.net> Date: Tue, 17 May 2011 20:12:03 +0100 From: a.smith@ukgrid.net To: freebsd-fs@freebsd.org References: <20110517200932.33075laonl99lx4w@webmail2.ukgrid.net> In-Reply-To: <20110517200932.33075laonl99lx4w@webmail2.ukgrid.net> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: 7bit User-Agent: Internet Messaging Program (IMP) H3 (4.3.9) / FreeBSD-8.1 Subject: Re: zfs get all command hung X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 17 May 2011 19:35:27 -0000 PS the pool is live and up and running read/write, just any zfs get command is hanging... Quoting a.smith@ukgrid.net: > Hi, > > I have a script that runs every hour, one of the commands it runs > is "zfs get all mypool". 
The process has hung and cannot be killed. > Is there anything I can do to work out what happened? This has > happened before, but on older OS releases. The system is FreeBSD > 8.2-RELEASE amd64. A truss of the process just shows nothing, > > thanks Andy. > > > > > From owner-freebsd-fs@FreeBSD.ORG Tue May 17 20:41:54 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2E03F106564A for ; Tue, 17 May 2011 20:41:54 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 5A4A58FC20 for ; Tue, 17 May 2011 20:41:53 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id XAA03690; Tue, 17 May 2011 23:23:31 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1QMQnj-0007eK-CS; Tue, 17 May 2011 23:23:31 +0300 Message-ID: <4DD2D942.9030600@FreeBSD.org> Date: Tue, 17 May 2011 23:23:30 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.17) Gecko/20110503 Lightning/1.0b2 Thunderbird/3.1.10 MIME-Version: 1.0 To: a.smith@ukgrid.net References: <20110517200932.33075laonl99lx4w@webmail2.ukgrid.net> In-Reply-To: <20110517200932.33075laonl99lx4w@webmail2.ukgrid.net> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@FreeBSD.org Subject: Re: zfs get all command hung X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 17 May 2011 20:41:54 -0000 on 17/05/2011 22:09 a.smith@ukgrid.net said the following: > Hi, > > I 
have a script that runs every hour, one of the commands it runs is "zfs get > all mypool". The process has hung and cannot be killed. Is there anything I can > do to work out what happened? This has happened before, but on older OS > releases. The system is FreeBSD 8.2-RELEASE amd64. A truss of the process just > shows nothing, procstat -kk -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Tue May 17 20:54:13 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D0E731065674 for ; Tue, 17 May 2011 20:54:13 +0000 (UTC) (envelope-from a.smith@ukgrid.net) Received: from mx0.ukgrid.net (mx0.ukgrid.net [89.21.28.41]) by mx1.freebsd.org (Postfix) with ESMTP id 8F2618FC0C for ; Tue, 17 May 2011 20:54:13 +0000 (UTC) Received: from [89.21.28.38] (port=46293 helo=omicron.ukgrid.net) by mx0.ukgrid.net with esmtp (Exim 4.74; FreeBSD) envelope-from a.smith@ukgrid.net id 1QMRHQ-00030o-EL; Tue, 17 May 2011 21:54:12 +0100 Received: from 81.60.137.91.dyn.user.ono.com (81.60.137.91.dyn.user.ono.com [81.60.137.91]) by webmail2.ukgrid.net (Horde Framework) with HTTP; Tue, 17 May 2011 21:54:12 +0100 Message-ID: <20110517215412.879621won3gxj4v4@webmail2.ukgrid.net> Date: Tue, 17 May 2011 21:54:12 +0100 From: a.smith@ukgrid.net To: Andriy Gapon References: <20110517200932.33075laonl99lx4w@webmail2.ukgrid.net> <4DD2D942.9030600@FreeBSD.org> In-Reply-To: <4DD2D942.9030600@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: 7bit User-Agent: Internet Messaging Program (IMP) H3 (4.3.9) / FreeBSD-8.1 Cc: freebsd-fs@FreeBSD.org Subject: Re: zfs get all command hung X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 17 May 
2011 20:54:13 -0000 Quoting Andriy Gapon : > > procstat -kk > # procstat -kk 37975 PID TID COMM TDNAME KSTACK 37975 100669 zfs - mi_switch+0x176 sleepq_catch_signals+0x29e sleepq_wait_sig+0x16 _sleep+0x269 clnt_vc_create+0x153 clnt_reconnect_call+0x64d nfs_request+0x215 nfs_statfs+0x194 __vfs_statfs+0x28 kern_getfsstat+0x3fc syscallenter+0x1e5 syscall+0x4b Xfast_syscall+0xe2 And actually I was thinking: as all the zfs get commands are hanging, I can run others and truss them, of course. Here is the tail of a truss: NAME PROPERTY VALUE SOURCE write(1,"NAME PROPERTY VALU"...,58) = 58 (0x3a) mx1 type filesystem - write(1,"mx1 type file"...,53) = 53 (0x35) mx1 creation Mon Jan 17 12:08 2011 - write(1,"mx1 creation Mon "...,53) = 53 (0x35) mx1 used 78.2G - write(1,"mx1 used 78.2"...,53) = 53 (0x35) mx1 available 195G - write(1,"mx1 available 195G"...,53) = 53 (0x35) mx1 referenced 22K - write(1,"mx1 referenced 22K "...,53) = 53 (0x35) mx1 compressratio 1.27x - write(1,"mx1 compressratio 1.27"...,53) = 53 (0x35) fstat(4,{ mode=crw-rw-rw- ,inode=32,size=0,blksize=4096 }) = 0 (0x0) ioctl(4,TIOCGETA,0xffffc8c0) ERR#19 'Operation not supported by device' lseek(4,0x0,SEEK_SET) = 0 (0x0) lseek(4,0x0,SEEK_CUR) = 0 (0x0) getfsstat(0x0,0x0,0x1,0x0,0x80,0xa008) = 443 (0x1bb) Andy. 
From owner-freebsd-fs@FreeBSD.ORG Tue May 17 21:17:17 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4FC1F1065672; Tue, 17 May 2011 21:17:17 +0000 (UTC) (envelope-from gpalmer@freebsd.org) Received: from noop.in-addr.com (mail.in-addr.com [IPv6:2001:470:8:162::1]) by mx1.freebsd.org (Postfix) with ESMTP id 1FFD48FC1A; Tue, 17 May 2011 21:17:17 +0000 (UTC) Received: from gjp by noop.in-addr.com with local (Exim 4.76 (FreeBSD)) (envelope-from ) id 1QMRdk-000K3p-4E; Tue, 17 May 2011 17:17:16 -0400 Date: Tue, 17 May 2011 17:17:16 -0400 From: Gary Palmer To: a.smith@ukgrid.net Message-ID: <20110517211716.GD37035@in-addr.com> References: <20110517200932.33075laonl99lx4w@webmail2.ukgrid.net> <4DD2D942.9030600@FreeBSD.org> <20110517215412.879621won3gxj4v4@webmail2.ukgrid.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20110517215412.879621won3gxj4v4@webmail2.ukgrid.net> X-SA-Exim-Connect-IP: X-SA-Exim-Mail-From: gpalmer@freebsd.org X-SA-Exim-Scanned: No (on noop.in-addr.com); SAEximRunCond expanded to false Cc: freebsd-fs@FreeBSD.org, Andriy Gapon Subject: Re: zfs get all command hung X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 17 May 2011 21:17:17 -0000 On Tue, May 17, 2011 at 09:54:12PM +0100, a.smith@ukgrid.net wrote: > Quoting Andriy Gapon : > > > >procstat -kk > > > > # procstat -kk 37975 > PID TID COMM TDNAME KSTACK > 37975 100669 zfs - mi_switch+0x176 > sleepq_catch_signals+0x29e sleepq_wait_sig+0x16 _sleep+0x269 > clnt_vc_create+0x153 clnt_reconnect_call+0x64d nfs_request+0x215 > nfs_statfs+0x194 __vfs_statfs+0x28 kern_getfsstat+0x3fc > syscallenter+0x1e5 syscall+0x4b Xfast_syscall+0xe2 > > And actually I was thinking, as the all 
zfs get commands are hanging, > I can run others and truss them of course. Here is the tail of a truss: > > > NAME PROPERTY VALUE SOURCE > write(1,"NAME PROPERTY VALU"...,58) = 58 (0x3a) > mx1 type filesystem - > write(1,"mx1 type file"...,53) = 53 (0x35) > mx1 creation Mon Jan 17 12:08 2011 - > write(1,"mx1 creation Mon "...,53) = 53 (0x35) > mx1 used 78.2G - > write(1,"mx1 used 78.2"...,53) = 53 (0x35) > mx1 available 195G - > write(1,"mx1 available 195G"...,53) = 53 (0x35) > mx1 referenced 22K - > write(1,"mx1 referenced 22K "...,53) = 53 (0x35) > mx1 compressratio 1.27x - > write(1,"mx1 compressratio 1.27"...,53) = 53 (0x35) > fstat(4,{ mode=crw-rw-rw- ,inode=32,size=0,blksize=4096 }) = 0 (0x0) > ioctl(4,TIOCGETA,0xffffc8c0) ERR#19 'Operation not > supported by device' > lseek(4,0x0,SEEK_SET) = 0 (0x0) > lseek(4,0x0,SEEK_CUR) = 0 (0x0) > getfsstat(0x0,0x0,0x1,0x0,0x80,0xa008) = 443 (0x1bb) I'm no expert, but it looks more like you have a NFS filesystem mounted on the system and for some reason system calls to list the mounted filesystems are hanging due to the NFS mount. Is there a NFS filesystem mounted on that box and is the NFS server available and responding to NFS requests? 
Gary From owner-freebsd-fs@FreeBSD.ORG Tue May 17 21:33:06 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 471D1106564A; Tue, 17 May 2011 21:33:06 +0000 (UTC) (envelope-from a.smith@ukgrid.net) Received: from mx1.ukgrid.net (mx1.ukgrid.net [89.107.22.36]) by mx1.freebsd.org (Postfix) with ESMTP id 0A7F48FC12; Tue, 17 May 2011 21:33:05 +0000 (UTC) Received: from [89.21.28.38] (port=51540 helo=omicron.ukgrid.net) by mx1.ukgrid.net with esmtp (Exim 4.74; FreeBSD) envelope-from a.smith@ukgrid.net id 1QMRt3-000GnQ-3Y; Tue, 17 May 2011 22:33:05 +0100 Received: from 81.60.137.91.dyn.user.ono.com (81.60.137.91.dyn.user.ono.com [81.60.137.91]) by webmail2.ukgrid.net (Horde Framework) with HTTP; Tue, 17 May 2011 22:33:04 +0100 Message-ID: <20110517223304.10337hhl7w2hz4g8@webmail2.ukgrid.net> Date: Tue, 17 May 2011 22:33:04 +0100 From: a.smith@ukgrid.net To: Gary Palmer References: <20110517200932.33075laonl99lx4w@webmail2.ukgrid.net> <4DD2D942.9030600@FreeBSD.org> <20110517215412.879621won3gxj4v4@webmail2.ukgrid.net> <20110517211716.GD37035@in-addr.com> In-Reply-To: <20110517211716.GD37035@in-addr.com> MIME-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1; DelSp="Yes"; format="flowed" Content-Disposition: inline Content-Transfer-Encoding: 7bit User-Agent: Internet Messaging Program (IMP) H3 (4.3.9) / FreeBSD-8.1 Cc: freebsd-fs@FreeBSD.org, Andriy Gapon Subject: Re: zfs get all command hung X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 17 May 2011 21:33:06 -0000 Quoting Gary Palmer : > > I'm no expert, but it looks more like you have a NFS filesystem mounted > on the system and for some reason system calls to list the mounted > filesystems are hanging due to the NFS mount. 
Is there a NFS filesystem > mounted on that box and is the NFS server available and responding to > NFS requests? > Hi Gary, yeah think you're spot on there! There is an NFS mount used for some backups, looks like our network guys have broken something today though, seems to be blocked on the firewall! thanks for the comment, Andy. From owner-freebsd-fs@FreeBSD.ORG Wed May 18 00:37:18 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F2F8C106564A for ; Wed, 18 May 2011 00:37:18 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-jnhn.mail.uoguelph.ca (esa-jnhn.mail.uoguelph.ca [131.104.91.44]) by mx1.freebsd.org (Postfix) with ESMTP id 8E80D8FC12 for ; Wed, 18 May 2011 00:37:18 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApwEAJET002DaFvO/2dsb2JhbACEWaI0iHCtWpB/hRKBBwSQEYcrh2Y X-IronPort-AV: E=Sophos;i="4.65,228,1304308800"; d="scan'208";a="125034784" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-jnhn-pri.mail.uoguelph.ca with ESMTP; 17 May 2011 20:37:17 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 74164B3F28; Tue, 17 May 2011 20:37:17 -0400 (EDT) Date: Tue, 17 May 2011 20:37:17 -0400 (EDT) From: Rick Macklem To: Sergey Kandaurov Message-ID: <713535812.490291.1305679037413.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----=_Part_490290_2136409836.1305679037410" X-Originating-IP: [172.17.91.202] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - IE7 (Win)/6.0.10_GA_2692) Cc: freebsd-fs@freebsd.org Subject: Re: [old nfsclient] different nmount() args passed from mount vs. 
mount_nfs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 18 May 2011 00:37:19 -0000 ------=_Part_490290_2136409836.1305679037410 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit > Hi. > > First, sorry for the long mail. I just tried to describe in full > details. > > When mounting nfs with some options, I found that /sbin/mount and > /sbin/mount_nfs pass options to nmount() differently, which results > in bad things (TM). I traced the options and here they are: > > From mount(8) -> mount_nfs(8): > "rw" -> "" > "addr" -> {something valid } > "fh" -> 5 > "sec" -> "sys" > "nfsv3" -> 0x0 => NFSMNT_NFSV3 > "hostname" -> "dev2.mail:/home/svn/freebsd/head" > "fstype" -> "oldnfs" > "fspath" -> "/usr/src" > "errmsg" -> "" > (nil) > > From pre-r221124 mount(8): > = "fstype" -> "oldnfs" > "hostname" -> "dev2.mail" > = "fspath" -> "/usr/src" > "from" -> "dev2.mail:/home/svn/freebsd/head" > = "errmsg" -> "" > (nil) > > Note, that pre-r221124 mount(8) knows nothing about oldnfs. > > 1. "hostname" option is passed differently from mount(8) and > mount_nfs(8). > When I force to mount oldnfs file system with mount(8) directly (to > not > bypass the nmount(2) call to mount_nfs(8)), I get this error: > ./mount -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src > mount: dev2.mail:/home/svn/freebsd/head Invalid hostname: Invalid > argument > > Hmm.. this may be because mount(8) passes value in $hostname:$path > format > (see the traces above). It might be due to different old nfsclient way > to parse > args, but I am not sure, I can be wrong. Anyway, it does not matter > now. > > The actual problem manifests when running the command with pre-r221124 > mount(8) binary. It knows nothing about "oldnfs" and (attention!) 
> calls nmount(2) > directly instead of bypassing the call to the mount_nfs(8) binary as > usually done, > and this is the place where the "unsanitized nmount(2) args" problem > is hidden. > [New mount knows about "oldnfs" and passes the call to mount_oldnfs(8) > that > prepares all the nmount(2) args to correctly hide the problem.] > > To prove it, that is how old and new mount(8) work differently: > 1) new mount(8) as of current > mount -d -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src > exec: mount_oldnfs dev2.mail:/home/svn/freebsd/head /usr/src > 2) old mount(8) as of pre-r221124 > ./mount -d -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src > mount -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src > > > Ok, back to the first paragraph: a different "hostname" mount option. > When I first faced with this, I tried to specify value for "hostname" > explicitly. Here it comes: > ./mount -t oldnfs -o hostname=dev2.mail > dev2.mail:/home/svn/freebsd/head /usr/src > [CABOOM!] > It just crashed. 
Do not do this :) > > Fatal trap 12: page fault while in kernel mode > cpuid = 0; apic id = 00 > fault virtual address = 0x1 > fault code = supervisor read data, page not present > instruction pointer = 0x20:0xffffffff805da299 > stack pointer = 0x28:0xffffff807bef6240 > frame pointer = 0x28:0xffffff807bef62a0 > code segment = base 0x0, limit 0xfffff, type 0x1b > = DPL 0, pres 1, long 1, def32 0, gran 1 > processor eflags = interrupt enabled, resume, IOPL = 0 > current process = 2541 (mount) > db> bt > Tracing pid 2541 tid 100076 td 0xfffffe0001ace460 > nfs_connect() at 0xffffffff805da299 = nfs_connect+0x79 > nfs_request() at 0xffffffff805da978 = nfs_request+0x398 > nfs_getattr() at 0xffffffff805e2a6c = nfs_getattr+0x2bc > VOP_GETATTR_APV() at 0xffffffff806f4283 = VOP_GETATTR_APV+0xd3 > mountnfs() at 0xffffffff805de739 = mountnfs+0x329 > nfs_mount() at 0xffffffff805dffc7 = nfs_mount+0xcf7 > vfs_donmount() at 0xffffffff804d46ff = vfs_donmount+0x82f > nmount() at 0xffffffff804d54f3 = nmount+0x63 > syscallenter() at 0xffffffff804861cb = syscallenter+0x1cb > syscall() at 0xffffffff806ae710 = syscall+0x60 > Xfast_syscall() at 0xffffffff8069922d = Xfast_syscall+0xdd > --- syscall (378, FreeBSD ELF64, nmount), rip = 0x800ab444c, rsp = > 0x7fffffffca48, rbp = 0x801009058 --- > > > As you might see from above nmount(2) args traces, mount(8) itself > doesn't > pass the "addr" option to the nmount(2) syscall while nfs_mount() > expects to > receive it, which is the problem. 
> Later deep in nmount(2) in /sys/nfsclient/nfs_krpc.c it tries to > dereference > addr value and page faults here in nfs_connect() : > > vers = NFS_VER3; > else if (nmp->nm_flag & NFSMNT_NFSV4) > vers = NFS_VER4; > XXX saddr is NULL, the next line will crash > if (saddr->sa_family == AF_INET) > if (nmp->nm_sotype == SOCK_DGRAM) > nconf = getnetconfigent("udp"); > > I think that nfsclient, probably in > sys/nfsclient/nfs_vfsops.c:mount_nfs(), > should handle a missing value for "addr" and/or "fh" mount options. > It doesn't check it currently: > Yes, at least for the case of "addr". I'm not sure if a zero length fh is considered ok for the old client or not. (It is valid for the new one.) I've attached a patch that does the check for the "addr=" option for both clients. You can test that if you'd like. It should avoid the crash. Since "oldnfs" didn't exist as a file system type pre-r221124, I don't think you can expect a pre-r221124 mount(8) to be able to mount it. (It will work for the default "nfs", it will just use the new NFS client.) > % static int > % nfs_mount(struct mount *mp) > % { > % struct nfs_args args = { > % [...] > % .addr = NULL, > % }; > % int error, ret, has_nfs_args_opt; > % int has_addr_opt, has_fh_opt, has_hostname_opt; > % struct sockaddr *nam; > > addr is initialized with NULL. num used later as a pointer to > args.addr value. > > % if ((mp->mnt_flag & (MNT_ROOTFS | MNT_UPDATE)) == MNT_ROOTFS) { > % error = nfs_mountroot(mp); > % goto out; > % } > > We do not try to mount root, this is not ours. > > % if (vfs_getopt(mp->mnt_optnew, "nfs_args", NULL, NULL) == 0) { > [...] > % has_nfs_args_opt = 1; > % } > > We do not use old mount(2) interface, not ours. > > % if (vfs_getopt(mp->mnt_optnew, "nfsv3", NULL, NULL) == 0) > % args.flags |= NFSMNT_NFSV3; > > mount(8) doesn't pass nfsv3 option, so NFSMNT_NFSV3 isn't set. 
> > % if (vfs_getopt(mp->mnt_optnew, "addr", (void **)&args.addr, > % &args.addrlen) == 0) { > % has_addr_opt = 1; > % if (args.addrlen > SOCK_MAXADDRLEN) { > % error = ENAMETOOLONG; > % goto out; > % } > % nam = malloc(args.addrlen, M_SONAME, > % M_WAITOK); > % bcopy(args.addr, nam, args.addrlen); > % nam->sa_len = args.addrlen; > % } > > mount(8) doesn't pass addr option, so args.addr isn't set, hence > struct sockaddr *nam is also NULL, has_addr_opt is 0. > > % if (vfs_getopt(mp->mnt_optnew, "hostname", (void **)&args.hostname, > % NULL) == 0) { > % has_hostname_opt = 1; > % } > % if (args.hostname == NULL) { > % vfs_mount_error(mp, "Invalid hostname"); > % error = EINVAL; > % goto out; > % } > > I don't know why I got here the error. I didn't analyze it deep > though. > "mount: dev2.mail:/home/svn/freebsd/head Invalid hostname: Invalid > argument" You'll get this if there is no hostname="xxx" argument specified, which I believe is correct. ------=_Part_490290_2136409836.1305679037410 Content-Type: text/x-patch; name=nfsmnt.patch Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename=nfsmnt.patch LS0tIG5mc2NsaWVudC9uZnNfdmZzb3BzLmMuc2F2CTIwMTEtMDUtMTcgMTk6NDg6MTUuMDAwMDAw MDAwIC0wNDAwCisrKyBuZnNjbGllbnQvbmZzX3Zmc29wcy5jCTIwMTEtMDUtMTcgMjA6MDA6NDYu MDAwMDAwMDAwIC0wNDAwCkBAIC0xMTQ5LDYgKzExNDksMTAgQEAgbmZzX21vdW50KHN0cnVjdCBt b3VudCAqbXApCiAJCQkJZ290byBvdXQ7CiAJCQl9CiAJCX0KKwl9IGVsc2UgaWYgKGhhc19hZGRy X29wdCA9PSAwKSB7CisJCXZmc19tb3VudF9lcnJvcihtcCwgIk5vIHNlcnZlciBhZGRyZXNzIik7 CisJCWVycm9yID0gRUlOVkFMOworCQlnb3RvIG91dDsKIAl9CiAJZXJyb3IgPSBtb3VudG5mcygm YXJncywgbXAsIG5hbSwgYXJncy5ob3N0bmFtZSwgJnZwLAogCSAgICBjdXJ0aHJlYWQtPnRkX3Vj cmVkLCBuZWduYW1ldGltZW8pOwotLS0gZnMvbmZzY2xpZW50L25mc19jbHZmc29wcy5jLnNhdgky MDExLTA1LTE3IDE4OjU2OjQ3LjAwMDAwMDAwMCAtMDQwMAorKysgZnMvbmZzY2xpZW50L25mc19j bHZmc29wcy5jCTIwMTEtMDUtMTcgMjA6MTA6NDcuMDAwMDAwMDAwIC0wNDAwCkBAIC0xMDc5LDE1 ICsxMDc5LDIxIEBAIG5mc19tb3VudChzdHJ1Y3QgbW91bnQgKm1wKQogCQlkaXJwYXRoWzBdID0g 
J1wwJzsKIAlkaXJsZW4gPSBzdHJsZW4oZGlycGF0aCk7CiAKLQlpZiAoaGFzX25mc19hcmdzX29w dCA9PSAwICYmIHZmc19nZXRvcHQobXAtPm1udF9vcHRuZXcsICJhZGRyIiwKLQkgICAgKHZvaWQg KiopJmFyZ3MuYWRkciwgJmFyZ3MuYWRkcmxlbikgPT0gMCkgewotCQlpZiAoYXJncy5hZGRybGVu ID4gU09DS19NQVhBRERSTEVOKSB7Ci0JCQllcnJvciA9IEVOQU1FVE9PTE9ORzsKKwlpZiAoaGFz X25mc19hcmdzX29wdCA9PSAwKSB7CisJCWlmICh2ZnNfZ2V0b3B0KG1wLT5tbnRfb3B0bmV3LCAi YWRkciIsCisJCSAgICAodm9pZCAqKikmYXJncy5hZGRyLCAmYXJncy5hZGRybGVuKSA9PSAwKSB7 CisJCQlpZiAoYXJncy5hZGRybGVuID4gU09DS19NQVhBRERSTEVOKSB7CisJCQkJZXJyb3IgPSBF TkFNRVRPT0xPTkc7CisJCQkJZ290byBvdXQ7CisJCQl9CisJCQluYW0gPSBtYWxsb2MoYXJncy5h ZGRybGVuLCBNX1NPTkFNRSwgTV9XQUlUT0spOworCQkJYmNvcHkoYXJncy5hZGRyLCBuYW0sIGFy Z3MuYWRkcmxlbik7CisJCQluYW0tPnNhX2xlbiA9IGFyZ3MuYWRkcmxlbjsKKwkJfSBlbHNlIHsK KwkJCXZmc19tb3VudF9lcnJvcihtcCwgIk5vIHNlcnZlciBhZGRyZXNzIik7CisJCQllcnJvciA9 IEVJTlZBTDsKIAkJCWdvdG8gb3V0OwogCQl9Ci0JCW5hbSA9IG1hbGxvYyhhcmdzLmFkZHJsZW4s IE1fU09OQU1FLCBNX1dBSVRPSyk7Ci0JCWJjb3B5KGFyZ3MuYWRkciwgbmFtLCBhcmdzLmFkZHJs ZW4pOwotCQluYW0tPnNhX2xlbiA9IGFyZ3MuYWRkcmxlbjsKIAl9CiAKIAlhcmdzLmZoID0gbmZo Owo= ------=_Part_490290_2136409836.1305679037410-- From owner-freebsd-fs@FreeBSD.ORG Wed May 18 06:29:51 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0DD0D106564A for ; Wed, 18 May 2011 06:29:50 +0000 (UTC) (envelope-from pvz@itassistans.se) Received: from zcs1.itassistans.net (zcs1.itassistans.net [212.112.191.37]) by mx1.freebsd.org (Postfix) with ESMTP id 5F2928FC0A for ; Wed, 18 May 2011 06:29:50 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by zcs1.itassistans.net (Postfix) with ESMTP id 12B8BC01C5 for ; Wed, 18 May 2011 08:13:15 +0200 (CEST) X-Virus-Scanned: amavisd-new at zcs1.itassistans.net Received: from zcs1.itassistans.net ([127.0.0.1]) by localhost (zcs1.itassistans.net [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ell1upkySAWh; Wed, 18 May 2011 
08:13:14 +0200 (CEST) Received: from [192.168.1.239] (c213-89-160-61.bredband.comhem.se [213.89.160.61]) by zcs1.itassistans.net (Postfix) with ESMTPSA id 5033DC01B4; Wed, 18 May 2011 08:13:14 +0200 (CEST) From: Per von Zweigbergk Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable Date: Wed, 18 May 2011 08:13:13 +0200 To: freebsd-fs@freebsd.org Message-Id: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> Mime-Version: 1.0 (Apple Message framework v1082) X-Mailer: Apple Mail (2.1082) Subject: HAST + ZFS self healing? Hot spares? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 18 May 2011 06:29:51 -0000 I've been investigating HAST as a possibility in adding synchronous replication and failover to a set of two NFS servers backed by ZFS. The servers themselves contain quite a few disks. 20 of them (7200 RPM SAS disks), to be exact. (If I didn't lose count again...) Plus two quick but small SSDs for ZIL and two not-as-quick but larger SSDs for L2ARC. These machines weren't originally designed with synchronous replication in mind - they were designed to be NFS file servers (used as VMware data stores) backed by ZFS. They contain LSI MegaRaid 9260 controllers (as an aside, these were perhaps not the best choice for ZFS since they lack a true JBOD mode; I have worked around this by making single-disk RAID-0 arrays, and then using those single-disk arrays to make up the zpool). Now, I've been considering making an active/passive (or, possibly, active/passive + passive/active) synchronously replicated pair of servers out of these, and my eyes fall on HAST. Initially, my thoughts land on simply creating HAST resources for the corresponding pairs of disks and SSDs in servers A and B, and then using these HAST resources to make up the ZFS pool. 
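[Editor's note] For reference, the "one HAST resource per disk pair" layout described above would look roughly like the following in hast.conf(5) terms; the hostnames (nfs-a, nfs-b) and device path are made up for illustration:

```
resource disk0 {
	on nfs-a {
		local /dev/mfid0
		remote nfs-b
	}
	on nfs-b {
		local /dev/mfid0
		remote nfs-a
	}
}
# ...one such resource block per disk/SSD pair; the zpool is then
# built from the provider devices /dev/hast/disk0, /dev/hast/disk1, ...
```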
But this raises two questions:

---

1. Hardware failure management. In case of a hardware failure, I'm not exactly sure what will happen, but I suspect the single-disk RAID-0 array containing the failed disk will simply fail. I assume it will still exist, but refuse to be read or written. In this situation I understand HAST will handle this by routing all I/O to the secondary server, in case the disk on the primary side dies, or simply by cutting off replication if the disk on the secondary server fails.

I have not seen any "hot spare" mechanism in HAST, but I would think that I could edit the cluster configuration file to manually configure a hot spare in case I receive an alert. Would I have to restart all of hastd to do this, though? Or is it sufficient to bring the resource into init and back into secondary using hastctl?

Of course it may be infinitely simpler just to configure spares on the ZFS level, keep entire spare HAST resources, and just do a zfs replace, replacing an entire array of two disks whenever one of the disks in an array fails. Still, it would be good to know what I can reconfigure on-the-fly with HAST itself.

---

2. ZFS self-healing. As far as I understand it, ZFS does self-healing, in that all data is checksummed, and if one disk in a mirror happens to contain corrupted data, ZFS will re-read the same data from the other disk in the ZFS mirror. I don't see any way this could work in a configuration where ZFS is not mirroring itself, but rather, running on top of HAST, currently. Am I wrong about this? Or is there any way to achieve this same self-healing effect except with HAST?

---

So, what is it, do I have to give up ZFS's self-healing (one of the really neat features in ZFS) if I go for HAST? Of course, I could mirror the drives first with HAST, and then mirror the two HAST mirrors using a ZFS mirror, but that would be wasteful and a little silly.
I might even be able to get away with using "copies=2" in this scenario. Or I could use raid-z on top of the mirrors, wasting less disk, but causing a performance hit.

I mean, ideally, ZFS would have a really neat synchronous replication feature built into it. Or ZFS could be HAST-aware, and know how to ask HAST to bring it a copy of a block of data on the remote block device in a HAST mirror in case the checksum on the local block device doesn't match. Or HAST would itself have some kind of block-level checksums, and do self-healing itself. (This would probably be the easiest to implement. The secondary site could even continually be reading the same data as the primary site is, merely to check the checksums on disk, not to send it over the wire. It's not like it's doing anything else useful with that untapped read performance.)

So, what's the current state of solving this problem? Is there any work being done in this area? Have I overlooked some technology I might use to achieve this goal?

From owner-freebsd-fs@FreeBSD.ORG Wed May 18 07:59:48 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D8DCC106564A for ; Wed, 18 May 2011 07:59:48 +0000 (UTC) (envelope-from daniel@digsys.bg) Received: from smtp-sofia.digsys.bg (smtp-sofia.digsys.bg [193.68.3.230]) by mx1.freebsd.org (Postfix) with ESMTP id 77E498FC12 for ; Wed, 18 May 2011 07:59:48 +0000 (UTC) Received: from dcave.digsys.bg (dcave.digsys.bg [192.92.129.5]) (authenticated bits=0) by smtp-sofia.digsys.bg (8.14.4/8.14.4) with ESMTP id p4I7xbB7038788 (version=TLSv1/SSLv3 cipher=DHE-RSA-CAMELLIA256-SHA bits=256 verify=NO) for ; Wed, 18 May 2011 10:59:42 +0300 (EEST) (envelope-from daniel@digsys.bg) Message-ID: <4DD37C69.5020005@digsys.bg> Date: Wed, 18 May 2011 10:59:37 +0300 From: Daniel Kalchev User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.15)
Gecko/20110307 Thunderbird/3.1.9 MIME-Version: 1.0 To: freebsd-fs@freebsd.org References: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> In-Reply-To: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: HAST + ZFS self healing? Hot spares? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 18 May 2011 07:59:48 -0000

On 18.05.11 09:13, Per von Zweigbergk wrote:
> I've been investigating HAST as a possibility for adding synchronous replication and failover to a set of two NFS servers backed by ZFS. The servers themselves contain quite a few disks. 20 of them (7200 RPM SAS disks), to be exact. (If I didn't lose count again...) Plus two quick but small SSDs for ZIL and two not-as-quick but larger SSDs for L2ARC.

Your idea is to have a hot standby server to replace the primary, should the primary fail (hardware-wise)? You will probably need CARP in addition to HAST in order to maintain the same shared IP address.

> Initially, my thoughts land on simply creating HAST resources for the corresponding pairs of disks and SSDs in servers A and B, and then using these HAST resources to make up the ZFS pool.

This would be the most natural decision, especially if you have identical hardware on both servers. Let's call this variant 1.

Variant 2 would be to create local ZFS pools (as you already have) and then create ZVOLs there that are managed by HAST. Then you will use the HAST provider for whatever storage needs you have. Performance might not be what you expect, and you need to trust HAST for the checksumming.

> 1. Hardware failure management. In case of a hardware failure, I'm not exactly sure what will happen, but I suspect the single-disk RAID-0 array containing the failed disk will simply fail.
> I assume it will still exist, but refuse to be read or written. In this situation I understand HAST will handle this by routing all I/O to the secondary server, in case the disk on the primary side dies, or simply by cutting off replication if the disk on the secondary server fails.

Having local ZFS makes hardware management easier, as ZFS is designed for this. This is variant 2. In your case, with variant 1 you will have several issues:

- You have to handle disk failure and array management at the controller level. You need to check whether this will work - you may end up with a new array name and thus have to edit config files.
- There is no hot spare mechanism in HAST, and I do not believe you can switch to secondary easily. Switching to secondary will certainly make the HAST device node disappear on the primary server and reappear on the secondary server. Maybe someone can suggest a proper way to handle this.

> 2. ZFS self-healing. As far as I understand it, ZFS does self-healing, in that all data is checksummed, and if one disk in a mirror happens to contain corrupted data, ZFS will re-read the same data from the other disk in the ZFS mirror. I don't see any way this could work in a configuration where ZFS is not mirroring itself, but rather, running on top of HAST, currently. Am I wrong about this? Or is there any way to achieve this same self-healing effect except with HAST?

HAST is a simple mirror. It only makes sure the blocks on the local and remote drives contain the same data. I do not believe it has strong enough checksumming to compare with ZFS. Therefore, your best bet is to use ZFS on top of HAST for the best data protection.

In your example, you will need to create 20 HAST resources, one out of each disk. Then create ZFS on top of these HAST resources. ZFS will then be able to heal itself in case there are inconsistencies in the data on the HAST resources (for whatever reason).

Some reported they used HAST for the SLOG as well.
I do not know if using HAST for the L2ARC makes any sense. On failure you will import the pool on the slave node, and this will wipe the L2ARC anyway.

> I mean, ideally, ZFS would have a really neat synchronous replication feature built into it. Or ZFS could be HAST-aware, and know how to ask HAST to bring it a copy of a block of data on the remote block device in a HAST mirror in case the checksum on the local block device doesn't match. Or HAST would itself have some kind of block-level checksums, and do self-healing itself. (This would probably be the easiest to implement. The secondary site could even continually be reading the same data as the primary site is, merely to check the checksums on disk, not to send it over the wire. It's not like it's doing anything else useful with that untapped read performance.)

With HAST, no (hast) storage providers exist on the secondary node. Therefore, you cannot do any I/O on the secondary node until it becomes primary.

I, too, would be interested in the failure management scenario with HAST+ZFS, as I am currently experimenting with a similar system.
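The layout discussed in this thread - one HAST resource per underlying disk, with ZFS doing the mirroring on top - can be sketched in /etc/hast.conf. Everything below is illustrative: the host names (filera/filerb), IP addresses, and da* device names are placeholders invented for the sketch, not taken from the thread.

```
# Hypothetical hast.conf fragment: one resource per physical disk.
# Host names, addresses and device paths are placeholders.
resource disk0 {
	on filera {
		local /dev/da0
		remote 10.0.0.2
	}
	on filerb {
		local /dev/da0
		remote 10.0.0.1
	}
}

resource disk1 {
	on filera {
		local /dev/da1
		remote 10.0.0.2
	}
	on filerb {
		local /dev/da1
		remote 10.0.0.1
	}
}

# ...and so on for the remaining disks. On whichever node is primary,
# the resources appear as /dev/hast/disk0, /dev/hast/disk1, ..., and
# the pool would be built from those, e.g.:
#   zpool create tank mirror hast/disk0 hast/disk1 ...
```

On the secondary node the /dev/hast/* device nodes do not exist at all, which matches the point that no I/O is possible there until the node is promoted to primary.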
Daniel From owner-freebsd-fs@FreeBSD.ORG Wed May 18 08:37:58 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E3E63106564A for ; Wed, 18 May 2011 08:37:58 +0000 (UTC) (envelope-from pvz@itassistans.se) Received: from zcs1.itassistans.net (zcs1.itassistans.net [212.112.191.37]) by mx1.freebsd.org (Postfix) with ESMTP id 7D82F8FC14 for ; Wed, 18 May 2011 08:37:58 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by zcs1.itassistans.net (Postfix) with ESMTP id D48A5C01C6 for ; Wed, 18 May 2011 10:37:56 +0200 (CEST) X-Virus-Scanned: amavisd-new at zcs1.itassistans.net Received: from zcs1.itassistans.net ([127.0.0.1]) by localhost (zcs1.itassistans.net [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Y2dn0jPVpUAT for ; Wed, 18 May 2011 10:37:56 +0200 (CEST) Received: from [10.0.10.11] (unknown [212.112.191.49]) by zcs1.itassistans.net (Postfix) with ESMTPSA id 36805C01C5 for ; Wed, 18 May 2011 10:37:56 +0200 (CEST) Message-ID: <4DD3855E.8020802@itassistans.se> Date: Wed, 18 May 2011 10:37:50 +0200 From: Per von Zweigbergk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.17) Gecko/20110414 Lightning/1.0b2 Thunderbird/3.1.10 MIME-Version: 1.0 To: freebsd-fs@freebsd.org References: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> <4DD37C69.5020005@digsys.bg> In-Reply-To: <4DD37C69.5020005@digsys.bg> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: HAST + ZFS self healing? Hot spares? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 18 May 2011 08:37:59 -0000 On 2011-05-18 09:59, Daniel Kalchev wrote: > Your idea is to have hot standby server, to replace the primary, > should the primary fail (hardware-wise)? 
> You will probably need CARP in addition to HAST in order to maintain the same shared IP address.

Yes, CARP would be required to handle the actual failover.

>> Initially, my thoughts land on simply creating HAST resources for the corresponding pairs of disks and SSDs in servers A and B, and then using these HAST resources to make up the ZFS pool.

> This would be the most natural decision, especially if you have identical hardware on both servers. Let's call this variant 1.
>
> Variant 2 would be to create local ZFS pools (as you already have) and then create ZVOLs there that are managed by HAST. Then you will use the HAST provider for whatever storage needs you have. Performance might not be what you expect, and you need to trust HAST for the checksumming.

This is a really neat idea, and it is going to be a ton easier to configure than anything else. This would mean that you'd be running a stack looking like:

- ZFS, on top of:
  - One HAST resource, on top of:
    - Two ZVOLs, each on top of:
      - ZFS, on top of:
        - Local storage (mirrored by ZFS)

This still means data will be mirrored twice - stored on 4 HDDs - but the configuration will be a ton cleaner than managing a 20-resource HAST configuration monstrosity.

It would be an option to run VMFS on top, exporting it over iSCSI, rather than running ZFS on top, exporting it over NFS. I have a feeling that might be less overhead in the end, although it's less convenient from a management point of view (unless FreeBSD has gained the ability to mount VMFS while I wasn't looking).

>> 2. ZFS self-healing. As far as I understand it, ZFS does self-healing, in that all data is checksummed, and if one disk in a mirror happens to contain corrupted data, ZFS will re-read the same data from the other disk in the ZFS mirror. I don't see any way this could work in a configuration where ZFS is not mirroring itself, but rather, running on top of HAST, currently. Am I wrong about this?
>> Or is there any way to achieve this same self-healing effect except with HAST?

> HAST is a simple mirror. It only makes sure the blocks on the local and remote drives contain the same data. I do not believe it has strong enough checksumming to compare with ZFS. Therefore, your best bet is to use ZFS on top of HAST for the best data protection.

Does it actually make sure the blocks on the local and remote drives contain the same data, though? I don't remember reading anything about a cross-check between the two drives in case of data corruption, like ZFS does. Although in your described "variant 2" this won't be a problem.

> In your example, you will need to create 20 HAST resources, one out of each disk. Then create ZFS on top of these HAST resources. ZFS will then be able to heal itself in case there are inconsistencies in the data on the HAST resources (for whatever reason).
>
> Some reported they used HAST for the SLOG as well. I do not know if using HAST for the L2ARC makes any sense. On failure you will import the pool on the slave node, and this will wipe the L2ARC anyway.

Yes, running HAST on the L2ARC doesn't make much sense. I'd have to run HAST on the ZIL, though, if I opted for variant 1 (which I don't think I will).

> I mean, ideally, ZFS would have a really neat synchronous replication feature built into it. Or ZFS could be HAST-aware, and know how to ask HAST to bring it a copy of a block of data on the remote block device in a HAST mirror in case the checksum on the local block device doesn't match. Or HAST would itself have some kind of block-level checksums, and do self-healing itself. (This would probably be the easiest to implement. The secondary site could even continually be reading the same data as the primary site is, merely to check the checksums on disk, not to send it over the wire. It's not like it's doing anything else useful with that untapped read performance.)
> With HAST, no (hast) storage providers exist on the secondary node. > Therefore, you cannot do any I/O on the secondary node, until it > becomes primary. I did not mean accessing any of the storage on the secondary node itself, I meant accessing the blocks *as stored on the secondary node* on the primary node. HAST will already do this in case of a read error on the primary node. From owner-freebsd-fs@FreeBSD.ORG Wed May 18 08:53:13 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5E25E106566B for ; Wed, 18 May 2011 08:53:13 +0000 (UTC) (envelope-from pluknet@gmail.com) Received: from mail-qw0-f54.google.com (mail-qw0-f54.google.com [209.85.216.54]) by mx1.freebsd.org (Postfix) with ESMTP id 157EB8FC08 for ; Wed, 18 May 2011 08:53:12 +0000 (UTC) Received: by qwc9 with SMTP id 9so909422qwc.13 for ; Wed, 18 May 2011 01:53:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=EHXkU5MwRxgzAsiJfoNRUuzn5b0xlVphlzMGPJoqd1M=; b=vfuITiamL5oRe7jF94bkDeleoqelSabVAGnEQZY6uAVzcus8B8lXSp2LfdwOLoGVDi N0ku/nUOECmKxRo8M2hMNA7ZnMPjWPQUSSiWgvyjyQ1su2IfIWLINYWj+pGY2mx/t5Kq r+2rju2Iom0g8A6GNfZLavuYxX9JHaLzPoVRw= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=mxmLYf4hiR1OpP7Ui09EYJQgJM9Q2wEQLYDtXxoCb8RXYELs+A1RudA6G3kWxwmJfp fTnpgzFa51d8S56hCUR3+bAXO76rVDFNVR2RWsJdPK0AJVYynBTA1rRBrDTSnowPd9Rg dUH8TfPjILkBH0yiq0jVBcFSw2FFCRqRZb2o8= MIME-Version: 1.0 Received: by 10.229.67.142 with SMTP id r14mr1205257qci.209.1305708792220; Wed, 18 May 2011 01:53:12 -0700 (PDT) Received: by 10.229.111.218 with HTTP; Wed, 18 May 2011 01:53:12 -0700 (PDT) In-Reply-To: <713535812.490291.1305679037413.JavaMail.root@erie.cs.uoguelph.ca> 
References: <713535812.490291.1305679037413.JavaMail.root@erie.cs.uoguelph.ca> Date: Wed, 18 May 2011 12:53:12 +0400 Message-ID: From: Sergey Kandaurov To: Rick Macklem Content-Type: text/plain; charset=ISO-8859-1 Cc: freebsd-fs@freebsd.org Subject: Re: [old nfsclient] different nmount() args passed from mount vs. mount_nfs X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 18 May 2011 08:53:13 -0000

On 18 May 2011 04:37, Rick Macklem wrote:
>> Hi.
>>
>> First, sorry for the long mail. I just tried to describe it in full detail.
>>
>> When mounting nfs with some options, I found that /sbin/mount and /sbin/mount_nfs pass options to nmount() differently, which results in bad things (TM). I traced the options and here they are:
>>
>> From mount(8) -> mount_nfs(8):
>> "rw" -> ""
>> "addr" -> {something valid }
>> "fh" -> 5
>> "sec" -> "sys"
>> "nfsv3" -> 0x0 => NFSMNT_NFSV3
>> "hostname" -> "dev2.mail:/home/svn/freebsd/head"
>> "fstype" -> "oldnfs"
>> "fspath" -> "/usr/src"
>> "errmsg" -> ""
>> (nil)
>>
>> From pre-r221124 mount(8):
>> = "fstype" -> "oldnfs"
>> "hostname" -> "dev2.mail"
>> = "fspath" -> "/usr/src"
>> "from" -> "dev2.mail:/home/svn/freebsd/head"
>> = "errmsg" -> ""
>> (nil)
>>
>> Note that pre-r221124 mount(8) knows nothing about oldnfs.
>>
>> 1. The "hostname" option is passed differently from mount(8) and mount_nfs(8). When I force a mount of an oldnfs file system with mount(8) directly (to not bypass the nmount(2) call to mount_nfs(8)), I get this error:
>> ./mount -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src
>> mount: dev2.mail:/home/svn/freebsd/head Invalid hostname: Invalid argument
>>
>> Hmm.. this may be because mount(8) passes the value in $hostname:$path format (see the traces above).
>> It might be due to the different way the old nfsclient parses args, but I am not sure; I could be wrong. Anyway, it does not matter now.
>>
>> The actual problem manifests when running the command with a pre-r221124 mount(8) binary. It knows nothing about "oldnfs" and (attention!) calls nmount(2) directly instead of bypassing the call to the mount_nfs(8) binary as usually done, and this is where the "unsanitized nmount(2) args" problem is hidden. [New mount knows about "oldnfs" and passes the call to mount_oldnfs(8), which prepares all the nmount(2) args and so correctly hides the problem.]
>>
>> To prove it, this is how old and new mount(8) work differently:
>> 1) new mount(8) as of current
>> mount -d -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src
>> exec: mount_oldnfs dev2.mail:/home/svn/freebsd/head /usr/src
>> 2) old mount(8) as of pre-r221124
>> ./mount -d -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src
>> mount -t oldnfs dev2.mail:/home/svn/freebsd/head /usr/src
>>
>> Ok, back to the first paragraph: a different "hostname" mount option. When I first faced this, I tried to specify the value for "hostname" explicitly. Here it comes:
>> ./mount -t oldnfs -o hostname=dev2.mail dev2.mail:/home/svn/freebsd/head /usr/src
>> [CABOOM!]
>> It just crashed.
>> Do not do this :)
>>
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 0; apic id = 00
>> fault virtual address = 0x1
>> fault code = supervisor read data, page not present
>> instruction pointer = 0x20:0xffffffff805da299
>> stack pointer = 0x28:0xffffff807bef6240
>> frame pointer = 0x28:0xffffff807bef62a0
>> code segment = base 0x0, limit 0xfffff, type 0x1b
>> = DPL 0, pres 1, long 1, def32 0, gran 1
>> processor eflags = interrupt enabled, resume, IOPL = 0
>> current process = 2541 (mount)
>> db> bt
>> Tracing pid 2541 tid 100076 td 0xfffffe0001ace460
>> nfs_connect() at 0xffffffff805da299 = nfs_connect+0x79
>> nfs_request() at 0xffffffff805da978 = nfs_request+0x398
>> nfs_getattr() at 0xffffffff805e2a6c = nfs_getattr+0x2bc
>> VOP_GETATTR_APV() at 0xffffffff806f4283 = VOP_GETATTR_APV+0xd3
>> mountnfs() at 0xffffffff805de739 = mountnfs+0x329
>> nfs_mount() at 0xffffffff805dffc7 = nfs_mount+0xcf7
>> vfs_donmount() at 0xffffffff804d46ff = vfs_donmount+0x82f
>> nmount() at 0xffffffff804d54f3 = nmount+0x63
>> syscallenter() at 0xffffffff804861cb = syscallenter+0x1cb
>> syscall() at 0xffffffff806ae710 = syscall+0x60
>> Xfast_syscall() at 0xffffffff8069922d = Xfast_syscall+0xdd
>> --- syscall (378, FreeBSD ELF64, nmount), rip = 0x800ab444c, rsp = 0x7fffffffca48, rbp = 0x801009058 ---
>>
>> As you might see from the nmount(2) args traces above, mount(8) itself doesn't pass the "addr" option to the nmount(2) syscall, while nfs_mount() expects to receive it, which is the problem.
>> Later, deep in nmount(2) in /sys/nfsclient/nfs_krpc.c, it tries to dereference the addr value and page faults here in nfs_connect():
>>
>> 		vers = NFS_VER3;
>> 	else if (nmp->nm_flag & NFSMNT_NFSV4)
>> 		vers = NFS_VER4;
>> 	/* XXX saddr is NULL, the next line will crash */
>> 	if (saddr->sa_family == AF_INET)
>> 		if (nmp->nm_sotype == SOCK_DGRAM)
>> 			nconf = getnetconfigent("udp");
>>
>> I think that nfsclient, probably in sys/nfsclient/nfs_vfsops.c:mount_nfs(), should handle a missing value for the "addr" and/or "fh" mount options. It doesn't check it currently:
>>
> Yes, at least for the case of "addr". I'm not sure if a zero length fh is considered ok for the old client or not. (It is valid for the new one.)
>
> I've attached a patch that does the check for the "addr=" option for both clients. You can test that if you'd like. It should avoid the crash.

Thank you very much. After the patch is applied, at least the old nfsclient works as expected. (I didn't test the new nfsclient.)
./mount -t oldnfs -o hostname=dev2.mail dev2.mail:/home/svn/freebsd/head /usr/src
mount: dev2.mail:/home/svn/freebsd/head No server address: Invalid argument

Can you commit the patch?

> Since "oldnfs" didn't exist as a file system type pre-r221124, I don't think you can expect a pre-r221124 mount to be able to be done for it.

I see. My only concern was a crash.

> (It will work for the default "nfs", it will just use the new NFS client.)

>> % static int
>> % nfs_mount(struct mount *mp)
>> % {
>> % 	struct nfs_args args = {
>> % 	[...]
>> % 	    .addr = NULL,
>> % 	};
>> % 	int error, ret, has_nfs_args_opt;
>> % 	int has_addr_opt, has_fh_opt, has_hostname_opt;
>> % 	struct sockaddr *nam;
>>
>> addr is initialized with NULL. nam is used later as a pointer to the args.addr value.
>>
>> % 	if ((mp->mnt_flag & (MNT_ROOTFS | MNT_UPDATE)) == MNT_ROOTFS) {
>> % 		error = nfs_mountroot(mp);
>> % 		goto out;
>> % 	}
>>
>> We do not try to mount root, so this is not ours.
>>
>> % 	if (vfs_getopt(mp->mnt_optnew, "nfs_args", NULL, NULL) == 0) {
>> % 	[...]
>> % 		has_nfs_args_opt = 1;
>> % 	}
>>
>> We do not use the old mount(2) interface, so this is not ours.
>>
>> % 	if (vfs_getopt(mp->mnt_optnew, "nfsv3", NULL, NULL) == 0)
>> % 		args.flags |= NFSMNT_NFSV3;
>>
>> mount(8) doesn't pass the nfsv3 option, so NFSMNT_NFSV3 isn't set.
>>
>> % 	if (vfs_getopt(mp->mnt_optnew, "addr", (void **)&args.addr,
>> % 	    &args.addrlen) == 0) {
>> % 		has_addr_opt = 1;
>> % 		if (args.addrlen > SOCK_MAXADDRLEN) {
>> % 			error = ENAMETOOLONG;
>> % 			goto out;
>> % 		}
>> % 		nam = malloc(args.addrlen, M_SONAME,
>> % 		    M_WAITOK);
>> % 		bcopy(args.addr, nam, args.addrlen);
>> % 		nam->sa_len = args.addrlen;
>> % 	}
>>
>> mount(8) doesn't pass the addr option, so args.addr isn't set; hence struct sockaddr *nam is also NULL, and has_addr_opt is 0.
>>
>> % 	if (vfs_getopt(mp->mnt_optnew, "hostname", (void **)&args.hostname,
>> % 	    NULL) == 0) {
>> % 		has_hostname_opt = 1;
>> % 	}
>> % 	if (args.hostname == NULL) {
>> % 		vfs_mount_error(mp, "Invalid hostname");
>> % 		error = EINVAL;
>> % 		goto out;
>> % 	}
>>
>> I don't know why I got the error here. I didn't analyze it deeply, though.
>> "mount: dev2.mail:/home/svn/freebsd/head Invalid hostname: Invalid argument"
>
> You'll get this if there is no hostname="xxx" argument specified, which I believe is correct.

Yes, that's true. mount(8) doesn't specify a "hostname" option itself.
-- wbr, pluknet From owner-freebsd-fs@FreeBSD.ORG Wed May 18 20:37:40 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CEF02106566B for ; Wed, 18 May 2011 20:37:40 +0000 (UTC) (envelope-from rmacklem@uoguelph.ca) Received: from esa-annu.mail.uoguelph.ca (esa-annu.mail.uoguelph.ca [131.104.91.36]) by mx1.freebsd.org (Postfix) with ESMTP id 911C98FC14 for ; Wed, 18 May 2011 20:37:40 +0000 (UTC) X-IronPort-Anti-Spam-Filtered: true X-IronPort-Anti-Spam-Result: ApwEAM8s1E2DaFvO/2dsb2JhbACEWaI6iHCtB5B9gSuBbIF7gQcEkBGHK4dm X-IronPort-AV: E=Sophos;i="4.65,233,1304308800"; d="scan'208";a="121148675" Received: from erie.cs.uoguelph.ca (HELO zcs3.mail.uoguelph.ca) ([131.104.91.206]) by esa-annu-pri.mail.uoguelph.ca with ESMTP; 18 May 2011 16:37:39 -0400 Received: from zcs3.mail.uoguelph.ca (localhost.localdomain [127.0.0.1]) by zcs3.mail.uoguelph.ca (Postfix) with ESMTP id 79361B3F53; Wed, 18 May 2011 16:37:39 -0400 (EDT) Date: Wed, 18 May 2011 16:37:39 -0400 (EDT) From: Rick Macklem To: FreeBSD FS Message-ID: <5718691.545130.1305751059426.JavaMail.root@erie.cs.uoguelph.ca> In-Reply-To: <20110517092011.GK48734@deviant.kiev.zoral.com.ua> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-Originating-IP: [172.17.91.201] X-Mailer: Zimbra 6.0.10_GA_2692 (ZimbraWebClient - FF3.0 (Win)/6.0.10_GA_2692) Cc: Subject: Re: RFC: adding a lock flags argument to VFS_FHTOVP() for FreeBSD9 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 18 May 2011 20:37:40 -0000 > Yes, the flag to specify the locking mode does only specify the > minimal > locking requirements, and filesystem is allowed to upgrade it to the > more strict lock type. E.g. 
UFS would only return a shared lock if the > vnode was found in the hash, AFAIR. If not told otherwise, getnewvnode(9) > forces lockmgr to convert all lock requests into exclusive.

That's exactly what UFS does, but I did notice some inconsistencies w.r.t. the various file systems.

For VFS_VGET(), ffs/cd9660/udf do basically the following:

1	error = vfs_hash_get(mp, ino, flags, curthread, vpp, NULL, NULL);
	...
2	if ((flags & LK_TYPE_MASK) == LK_SHARED) {
		flags &= ~LK_TYPE_MASK;
		flags |= LK_EXCLUSIVE;
	}
	...
3	lockmgr(vp->v_vnlock, LK_EXCLUSIVE, NULL);
	...
4	error = vfs_hash_insert(vp, ino, flags, curthread, vpp, NULL, NULL);

but hpfs/ext2fs do something similar to the above, except they omit step #2. (i.e., they would do #4 with LK_SHARED, if that was what was passed in as flags.)

Looking at vfs_hash_insert(), the "flags" argument is just used for vget(), so it isn't obvious to me whether it needs to be LK_EXCLUSIVE or not.

So, does anyone know if this depends on the file system, or are hpfs/ext2fs broken?

Thanks in advance for any help with this, rick
ps: Fortunately, for my patch, I can just ignore the "flags" argument for VFS_FHTOVP() for the file systems I'm not sure about, so they'll just return LK_EXCLUSIVE locked vnodes.
From owner-freebsd-fs@FreeBSD.ORG Wed May 18 23:24:30 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 93F00106566B for ; Wed, 18 May 2011 23:24:30 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from mail.zoral.com.ua (mx0.zoral.com.ua [91.193.166.200]) by mx1.freebsd.org (Postfix) with ESMTP id E5F228FC14 for ; Wed, 18 May 2011 23:24:29 +0000 (UTC) Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by mail.zoral.com.ua (8.14.2/8.14.2) with ESMTP id p4INOQ19046011 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Thu, 19 May 2011 02:24:26 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4) with ESMTP id p4INOPhW004185; Thu, 19 May 2011 02:24:25 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.4/8.14.4/Submit) id p4INOPUw004184; Thu, 19 May 2011 02:24:25 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Thu, 19 May 2011 02:24:25 +0300 From: Kostik Belousov To: Rick Macklem Message-ID: <20110518232425.GX48734@deviant.kiev.zoral.com.ua> References: <20110517092011.GK48734@deviant.kiev.zoral.com.ua> <5718691.545130.1305751059426.JavaMail.root@erie.cs.uoguelph.ca> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="P33LUqzLXAslwFyJ" Content-Disposition: inline In-Reply-To: <5718691.545130.1305751059426.JavaMail.root@erie.cs.uoguelph.ca> User-Agent: Mutt/1.4.2.3i X-Virus-Scanned: clamav-milter 0.95.2 at skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-3.4 required=5.0 tests=ALL_TRUSTED,AWL,BAYES_00, 
DNS_FROM_OPENWHOIS autolearn=no version=3.2.5 X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on skuns.kiev.zoral.com.ua Cc: FreeBSD FS Subject: Re: RFC: adding a lock flags argument to VFS_FHTOVP() for FreeBSD9 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 18 May 2011 23:24:30 -0000 --P33LUqzLXAslwFyJ Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Wed, May 18, 2011 at 04:37:39PM -0400, Rick Macklem wrote:
> > Yes, the flag to specify the locking mode does only specify the minimal locking requirements, and filesystem is allowed to upgrade it to the more strict lock type. E.g. UFS would only return shared lock if the vnode was found in hash, AFAIR. If not told otherwise, getnewvnode(9) forces lockmgr to convert all lock requests into exclusive.
>
> That's exactly what UFS does, but I did notice some inconsistencies w.r.t. the various file systems.
>
> For VFS_VGET(), ffs/cd9660/udf do basically the following:
> 1	error = vfs_hash_get(mp, ino, flags, curthread, vpp, NULL, NULL);
> 	...
> 2	if ((flags & LK_TYPE_MASK) == LK_SHARED) {
> 		flags &= ~LK_TYPE_MASK;
> 		flags |= LK_EXCLUSIVE;
> 	}
> 	...
> 3	lockmgr(vp->v_vnlock, LK_EXCLUSIVE, NULL);
> 	...
> 4	error = vfs_hash_insert(vp, ino, flags, curthread, vpp, NULL, NULL);
>
> but hpfs/ext2fs do something similar to the above, except they omit step #2. (ie. They would do #4 with LK_SHARED, if that was what flags is passed in as.)
>
> Looking at vfs_hash_insert(), the "flags" argument is just used for vget(), so it isn't obvious to me if it needs to be LK_EXCLUSIVE or not.

I would say that what ext2fs and hpfs are trying to do is legitimate, since the caller expects to get only the lock specified in the flags.
But, in fact, all locks for ext2fs and hpfs are exclusive since, as I said in the previous message, getnewvnode() initializes the vnode lock for automatic shared->exclusive conversion, and ext2fs/hpfs do not override this.

> So, does anyone know if this depends on the file system, or are
> hpfs/ext2fs broken?
>
> Thanks in advance for any help with this, rick
> ps: Fortunately, for my patch, I can just ignore the "flags"
>     argument for VFS_FHTOVP() for the file systems I'm not
>     sure about, so they'll just return LK_EXCLUSIVE locked
>     vnodes.

--P33LUqzLXAslwFyJ Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (FreeBSD) iEYEARECAAYFAk3UVSkACgkQC3+MBN1Mb4j9PgCgpdZeYsOjTmCr7j9Bj87nTtKl /aAAoOdggCkJAm/feMdoMhOwIfifOefi =yn0L -----END PGP SIGNATURE----- --P33LUqzLXAslwFyJ-- From owner-freebsd-fs@FreeBSD.ORG Thu May 19 01:09:52 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F20F8106566B for ; Thu, 19 May 2011 01:09:52 +0000 (UTC) (envelope-from zack.kirsch@isilon.com) Received: from seaxch10.isilon.com (seaxch10.isilon.com [74.85.160.26]) by mx1.freebsd.org (Postfix) with ESMTP id D67D08FC12 for ; Thu, 19 May 2011 01:09:52 +0000 (UTC) X-MimeOLE: Produced By Microsoft Exchange V6.5 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: base64 Date: Wed, 18 May 2011 18:09:50 -0700 Message-ID: <476FC2247D6C7843A4814ED64344560C03EC9A5E@seaxch10.desktop.isilon.com> In-Reply-To: <256284561.428250.1305590315172.JavaMail.root@erie.cs.uoguelph.ca> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: adding a lock flags argument to VFS_FHTOVP() for FreeBSD9 Thread-Index: AcwUJVR+y7oTlZiSQ+2b21kpzuDndwBm+p/g References: <256284561.428250.1305590315172.JavaMail.root@erie.cs.uoguelph.ca> From: "Zack Kirsch"
To: "Rick Macklem" , "FreeBSD FS" Cc: Subject: RE: adding a lock flags argument to VFS_FHTOVP() for FreeBSD9 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 01:09:53 -0000 QnR3LCB3ZSd2ZSBpbXBsZW1lbnRlZCBleGFjdGx5IHRoaXMgYXQgSXNpbG9uIGFuZCBkbyB0YWtl IFNIQVJFRCBsb2NrcyBpbnN0ZWFkIG9mIEVYQ0xVU0lWRSBmb3IgbWFueSBvcGVyYXRpb25zLiBJ J20gZGVmaW5pdGVseSBpbiBzdXBwb3J0IG9mIHRoZSBpZGVhLg0KIA0KWmFjaw0KDQotLS0tLU9y aWdpbmFsIE1lc3NhZ2UtLS0tLQ0KRnJvbTogb3duZXItZnJlZWJzZC1mc0BmcmVlYnNkLm9yZyBb bWFpbHRvOm93bmVyLWZyZWVic2QtZnNAZnJlZWJzZC5vcmddIE9uIEJlaGFsZiBPZiBSaWNrIE1h Y2tsZW0NClNlbnQ6IE1vbmRheSwgTWF5IDE2LCAyMDExIDQ6NTkgUE0NClRvOiBGcmVlQlNEIEZT DQpTdWJqZWN0OiBSRkM6IGFkZGluZyBhIGxvY2sgZmxhZ3MgYXJndW1lbnQgdG8gVkZTX0ZIVE9W UCgpIGZvciBGcmVlQlNEOQ0KDQpIaSwNCg0KRG93biB0aGUgcm9hZCwgSSB3b3VsZCBsaWtlIHRo ZSBORlMgc2VydmVyIHRvIGJlIGFibGUgdG8gZG8gYQ0KICBWRlNfRkhUT1ZQKG1wLCAmZmhwLT5m aF9maWQsIExLX1NIQVJFRCwgdnBwKTsNCnNpbWlsYXIgdG8gd2hhdCBpcyBhbHJlYWR5IHN1cHBv cnRlZCBmb3IgVkZTX1ZHRVQoKS4gVGhlIHJlYXNvbg0KaXMgdGhhdCwgY3VycmVudGx5LCB3aGVu IGEgY2xpZW50IGRvZXMgcmVhZC1haGVhZHMsIHRoZXNlIHJlYWRzIGFyZQ0KYmFzaWNhbGx5IHNl cmlhbGl6ZWQgYmVjYXVzZSB0aGUgVkZTX0ZIVE9WUCgpIGdldHMgYW4gTEtfRVhDTFVTSVZFDQps b2NrZWQgdm5vZGUgZm9yIGVhY2ggUlBDIG9uIHRoZSBzZXJ2ZXIuDQoNCkxpa2UgVkZTX1ZHRVQo KSwgdGhlIHVuZGVybHlpbmcgZmlsZSBzeXN0ZW0gY2FuIHN0aWxsIGNob29zZSB0bw0KcmV0dXJu IGEgTEtfRVhDTFVTSVZFIGxvY2tlZCB2bm9kZSBldmVuIHdoZW4gTEtfU0hBUkVEIGlzIHNwZWNp ZmllZC4NCihTb21lIGZpbGUgc3lzdGVtcywgc3VjaCBhcyBGRlMsIGp1c3QgY2FsbCBWRlNfVkdF VCgpIGluIFZGU19GSFRPVlAoKSwNCiBzbyBhbGwgdGhhdCBoYXBwZW5zIGlzIHRoYXQgdGhlIGZs YWcgaXMgcGFzc2VkIHRocm91Z2ggdG8gVkZTX1ZHRVQoKQ0KIGZvciB0aG9zZSBvbmVzLikNCg0K VG8gbWluaW1pemUgdGhlIHJpc2sgb2YgdGhlIHBhdGNoIGJyZWFraW5nIHNvbWV0aGluZywgSSBo YXZlIGl0IHNldHRpbmcNCkxLX0VYQ0xVU0lWRSBmb3IgYWxsIFZGU19GSFRPVlAoKSBjYWxscyBz 
byB0aGF0IHRoZSBzZW1hbnRpY3MgZG9uJ3QNCmFjdHVhbGx5IGNoYW5nZS4gKENoYW5naW5nIHRo ZSBORlMgc2VydmVyIHRvIHVzZSBMS19TSEFSRUQgaXMgYSB0cml2aWFsDQpwYXRjaCwgYnV0IHdp bGwgbmVlZCBleHRlbnNpdmUgdGVzdGluZywgc28gSSdtIG5vdCBwbGFubmluZyBvbiB0aGF0DQpj aGFuZ2UgZm9yIDkuMC4pDQoNCklmIHlvdSBhcmUgaW50ZXJlc3RlZCwgbXkgY3VycmVudCBwYXRj aCBpcyBhdDoNCiAgaHR0cDovL3Blb3BsZS5mcmVlYnNkLm9yZy9+cm1hY2tsZW0vZmh0b3ZwLnBh dGNoDQoNClNvLCBkb2VzIHRoaXMgc291bmQgbGlrZSBhIHJlYXNvbmFibGUgdGhpbmcgdG8gY29t bWl0LCBvbmNlIHRoZSBwYXRjaA0KaXMgcmV2aWV3ZWQ/DQoNCnJpY2sNCl9fX19fX19fX19fX19f X19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fDQpmcmVlYnNkLWZzQGZyZWVic2Qub3Jn IG1haWxpbmcgbGlzdA0KaHR0cDovL2xpc3RzLmZyZWVic2Qub3JnL21haWxtYW4vbGlzdGluZm8v ZnJlZWJzZC1mcw0KVG8gdW5zdWJzY3JpYmUsIHNlbmQgYW55IG1haWwgdG8gImZyZWVic2QtZnMt dW5zdWJzY3JpYmVAZnJlZWJzZC5vcmciDQo= From owner-freebsd-fs@FreeBSD.ORG Thu May 19 08:49:50 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C6559106566C for ; Thu, 19 May 2011 08:49:50 +0000 (UTC) (envelope-from grarpamp@gmail.com) Received: from mail-pw0-f54.google.com (mail-pw0-f54.google.com [209.85.160.54]) by mx1.freebsd.org (Postfix) with ESMTP id 999028FC14 for ; Thu, 19 May 2011 08:49:50 +0000 (UTC) Received: by pwj8 with SMTP id 8so1467229pwj.13 for ; Thu, 19 May 2011 01:49:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:date:message-id:subject:from:to:cc :content-type; bh=W90XvgAHYqw9Plm0gVBUGFKVuA5et0IbX7ACZcYRPbo=; b=QbH3uZsx5swpINGN4UCeslcNQ52gyogkIzbhedHSKmkY69wrYDUtZ7Q/eOAaJHiVCe C0zGp87VhdLXaR7qY1G4BQ+ou8q69wNtY+fNCyDjiw+RVhHR5fNgwtmLe8t2CDnMzyd9 Zb5v7fQ3qgJOEQNMYzdsBIJEodmysNbburxdI= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:cc:content-type; b=HWQz9izMPdql1M+ptryHLsY7JjL81FQFFkjitAEuGdiadLYinoXvuH5TBATgq1Dvt9 
MHvSrxowvtGVq0cndWMtFs/cBrifc2uJyhrQNQy1fEdoG0qZ7yoh8qRjaXV9+aDLtN+d C9C1eBGK9GJETscnLXXJscToMIwMz3zRBY/wY= MIME-Version: 1.0 Received: by 10.142.121.41 with SMTP id t41mr1641681wfc.358.1305779762948; Wed, 18 May 2011 21:36:02 -0700 (PDT) Received: by 10.142.157.2 with HTTP; Wed, 18 May 2011 21:36:02 -0700 (PDT) Date: Thu, 19 May 2011 00:36:02 -0400 Message-ID: From: grarpamp To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=UTF-8 Cc: freebsd-questions@freebsd.org Subject: UDF and DVD's X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 08:49:50 -0000

Greetings... :)

The first filesystem DVD... other than a movie DVD (DVD-VIDEO?), and the FreeBSD make release DVD's (iso9660)... that I've ever tried to mount, well... don't. It is:
Windows 7 Ultimate with Service Pack 1 (x64) - DVD (English) 5/12/2011
You can find the SHA-1 hash here: http://msdn.microsoft.com/en-us/subscriptions/downloads/default.aspx and a sample image, if needed for reference purposes, via any search engine.

Anyways, after a little research, does FreeBSD not, in fact, support this UDF version? (I don't yet know how to supply the version of this image for you?)

Can the FreeBSD team implement it? Perhaps by porting from NetBSD 5.1's seemingly near-complete implementation?
http://en.wikipedia.org/wiki/Universal_Disk_Format
http://www.osta.org/specs/index.htm
As perhaps even a GSOC or Foundation project? Because reading retail optical filesystem formats would seem to be a rather expected capability?

I'm guessing the current state within FreeBSD means that I can neither read, create, nor write readable (compatible) images at this, or any given, UDF level?

As I've no other DVD's to test with... what UDF versions are most DVD data ROM's published in?

Is this a blocker for FreeBSD?
For me, at least, minimally, that seems to be the case... as I now have no way to rip, mount and add the files to this DVD that I would like to add. Except to use Windows, which I consider to be unreliable at best. Thoughts? Thanks :) From owner-freebsd-fs@FreeBSD.ORG Thu May 19 09:14:36 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 15D2E10656AC for ; Thu, 19 May 2011 09:14:36 +0000 (UTC) (envelope-from jdc@koitsu.dyndns.org) Received: from qmta10.emeryville.ca.mail.comcast.net (qmta10.emeryville.ca.mail.comcast.net [76.96.30.17]) by mx1.freebsd.org (Postfix) with ESMTP id F09508FC13 for ; Thu, 19 May 2011 09:14:35 +0000 (UTC) Received: from omta23.emeryville.ca.mail.comcast.net ([76.96.30.90]) by qmta10.emeryville.ca.mail.comcast.net with comcast id lMBN1g0011wfjNsAAMEaAB; Thu, 19 May 2011 09:14:34 +0000 Received: from koitsu.dyndns.org ([67.180.84.87]) by omta23.emeryville.ca.mail.comcast.net with comcast id lMEZ1g00V1t3BNj8jMEagm; Thu, 19 May 2011 09:14:34 +0000 Received: by icarus.home.lan (Postfix, from userid 1000) id 6FCEA102C19; Thu, 19 May 2011 02:14:33 -0700 (PDT) Date: Thu, 19 May 2011 02:14:33 -0700 From: Jeremy Chadwick To: grarpamp Message-ID: <20110519091433.GA94053@icarus.home.lan> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Cc: freebsd-fs@freebsd.org, freebsd-questions@freebsd.org Subject: Re: UDF and DVD's X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 09:14:36 -0000 On Thu, May 19, 2011 at 12:36:02AM -0400, grarpamp wrote: > Greetings... :) > > The first filesystem DVD... 
other than a movie DVD (DVD-VIDEO?), > and the FreeBSD make release DVD's (iso9660)... that I've ever tried > to mount, well... don't. It is: > Windows 7 Ultimate with Service Pack 1 (x64) - DVD (English) 5/12/2011 > You can find the SHA-1 hash here: > http://msdn.microsoft.com/en-us/subscriptions/downloads/default.aspx > and a sample image, if needed for reference purposes, via any search > engine. > > Anyways, after a little reasearch, does FreeBSD not, in fact, support > this UDF version? (I don't yet know how to supply the version of > this image for you?) > > Can the FreeBSD team implement it? Perhaps by porting from NetBSD > 5.1's seemingly near complete implementation? > http://en.wikipedia.org/wiki/Universal_Disk_Format > http://www.osta.org/specs/index.htm > As perhaps even a GSOC or Foundation project? Because reading retail > optical filesystem formats would seem to be a rather expected > capability? > > I'm guessing the current state within FreeBSD means that I can > neither read, nor create, or write, readable (compatible) images > at this, or any given, UDF level? > > As I've no other DVD's to test with... what UDF versions are most > DVD data ROM's published in? > > Is this a blocker for FreeBSD? > > For me, at least, minimally, that seems to be the case... as I now > have no way to rip, mount and add the files to this DVD that I would > like to add. Except to use Windows, which I consider to be unreliable > at best. > > Thoughts? Thanks :) Thoughts: please provide commands, full output, etc. that show how you're trying to mount the disc, as well as relevant /dev entries pertaining to your DVD drive. dmesg might also be helpful. And I assume you have looked at mount_udf(8)? -- | Jeremy Chadwick jdc@parodius.com | | Parodius Networking http://www.parodius.com/ | | UNIX Systems Administrator Mountain View, CA, USA | | Making life hard for others since 1977. 
PGP 4BD6C0CB | From owner-freebsd-fs@FreeBSD.ORG Thu May 19 09:53:33 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: by hub.freebsd.org (Postfix, from userid 1233) id 0895C106566B; Thu, 19 May 2011 09:53:33 +0000 (UTC) Date: Thu, 19 May 2011 09:53:33 +0000 From: Alexander Best To: Jeremy Chadwick Message-ID: <20110519095333.GA43066@freebsd.org> References: <20110519091433.GA94053@icarus.home.lan> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20110519091433.GA94053@icarus.home.lan> Cc: freebsd-fs@freebsd.org, grarpamp , freebsd-questions@freebsd.org Subject: Re: UDF and DVD's X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 09:53:33 -0000 On Thu May 19 11, Jeremy Chadwick wrote: > On Thu, May 19, 2011 at 12:36:02AM -0400, grarpamp wrote: > > Greetings... :) > > > > The first filesystem DVD... other than a movie DVD (DVD-VIDEO?), > > and the FreeBSD make release DVD's (iso9660)... that I've ever tried > > to mount, well... don't. It is: > > Windows 7 Ultimate with Service Pack 1 (x64) - DVD (English) 5/12/2011 > > You can find the SHA-1 hash here: > > http://msdn.microsoft.com/en-us/subscriptions/downloads/default.aspx > > and a sample image, if needed for reference purposes, via any search > > engine. > > > > Anyways, after a little reasearch, does FreeBSD not, in fact, support > > this UDF version? (I don't yet know how to supply the version of > > this image for you?) > > > > Can the FreeBSD team implement it? Perhaps by porting from NetBSD > > 5.1's seemingly near complete implementation? > > http://en.wikipedia.org/wiki/Universal_Disk_Format > > http://www.osta.org/specs/index.htm > > As perhaps even a GSOC or Foundation project? 
Because reading retail > > optical filesystem formats would seem to be a rather expected > > capability? > > > > I'm guessing the current state within FreeBSD means that I can > > neither read, nor create, or write, readable (compatible) images > > at this, or any given, UDF level? > > > > As I've no other DVD's to test with... what UDF versions are most > > DVD data ROM's published in? > > > > Is this a blocker for FreeBSD? > > > > For me, at least, minimally, that seems to be the case... as I now > > have no way to rip, mount and add the files to this DVD that I would > > like to add. Except to use Windows, which I consider to be unreliable > > at best. > > > > Thoughts? Thanks :) freebsd as of now has two problems: 1) it only supports UDF 1.x and *not* UDF 2.x. 2) it does not properly support iso9660 with files > 4gb via multiple extents. whenever you mount such a dvd, you see each 4gb file twice. cheers. alex ps: for 2) see http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/95222 > > Thoughts: please provide commands, full output, etc. that show how > you're trying to mount the disc, as well as relevant /dev entries > pertaining to your DVD drive. dmesg might also be helpful. And I > assume you have looked at mount_udf(8)? > > -- > | Jeremy Chadwick jdc@parodius.com | > | Parodius Networking http://www.parodius.com/ | > | UNIX Systems Administrator Mountain View, CA, USA | > | Making life hard for others since 1977. 
PGP 4BD6C0CB | > -- a13x From owner-freebsd-fs@FreeBSD.ORG Thu May 19 14:55:48 2011 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 44E76106564A; Thu, 19 May 2011 14:55:48 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 1C2868FC0A; Thu, 19 May 2011 14:55:48 +0000 (UTC) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id p4JEtlVB074177; Thu, 19 May 2011 14:55:47 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.4/8.14.4/Submit) id p4JEtlvA074173; Thu, 19 May 2011 14:55:47 GMT (envelope-from linimon) Date: Thu, 19 May 2011 14:55:47 GMT Message-Id: <201105191455.p4JEtlvA074173@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/157179: [zfs] zfs/dbuf.c: panic: solaris assert: arc_buf_remove_ref(db->db_buf, db) == 0 X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 14:55:48 -0000 Old Synopsis: zfs/dbuf.c: panic: solaris assert: arc_buf_remove_ref(db->db_buf, db) == 0 New Synopsis: [zfs] zfs/dbuf.c: panic: solaris assert: arc_buf_remove_ref(db->db_buf, db) == 0 Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Thu May 19 14:55:35 UTC 2011 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=157179 From owner-freebsd-fs@FreeBSD.ORG Thu May 19 14:55:55 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4EE2C106564A; Thu, 19 May 2011 14:55:55 +0000 (UTC) (envelope-from jwd@SlowBlink.Com) Received: from nmail.slowblink.com (rrcs-24-199-145-34.midsouth.biz.rr.com [24.199.145.34]) by mx1.freebsd.org (Postfix) with ESMTP id 10FC08FC14; Thu, 19 May 2011 14:55:54 +0000 (UTC) Received: from nmail.slowblink.com (localhost [127.0.0.1]) by nmail.slowblink.com (8.14.3/8.14.3) with ESMTP id p4JEdZEQ083204; Thu, 19 May 2011 10:39:35 -0400 (EDT) (envelope-from jwd@nmail.slowblink.com) Received: (from jwd@localhost) by nmail.slowblink.com (8.14.3/8.14.3/Submit) id p4JEdZmd083203; Thu, 19 May 2011 10:39:35 -0400 (EDT) (envelope-from jwd) Date: Thu, 19 May 2011 10:39:35 -0400 From: John D To: freebsd-fs@freebsd.org, freebsd-geom@freebsd.org Message-ID: <20110519143935.GA83122@slowblink.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.3i Cc: Subject: LSI 9200-8e/gmultipath/ZFS cable pull kernel crash X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 14:55:55 -0000 Hi Folks, Looking for a bit of help to debug a sas interconnect, gmultipath, and zfs filesystem crash when the 2nd cable is pulled. Apologies for the cross post to geom & fs, hoped I would catch the right folks. In general, I have two systems each with an LSI 9200-8e sas hba installed. Each adapter has two cables going to shelf 1, then to shelf 2, then to the second system. 
System1 <---> Shelf1 <---> Shelf2 <---> System2
System1 <---> Shelf1 <---> Shelf2 <---> System2

Typical stuff: system1 & 2 are carp'd together; if one system goes down, the second zfs imports the pools and takes over. If I do a test pull of one of the cables, multipath removes the failed providers correctly with no interruption to the filesystem. Reinstalling the cable causes the providers to be re-integrated. This cable can be pulled/reinstalled multiple times with no problem. However, pulling the 2nd cable causes a kernel crash. I have the configuration/logs/screen shots here: http://people.freebsd.org/~jwd/lsi_gmultipath_zfs.html I can replicate the problem on demand. Any help debugging this problem is appreciated. Thanks! john From owner-freebsd-fs@FreeBSD.ORG Thu May 19 16:56:56 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 03139106564A for ; Thu, 19 May 2011 16:56:56 +0000 (UTC) (envelope-from piotr.kucharski@42.pl) Received: from mail-ew0-f54.google.com (mail-ew0-f54.google.com [209.85.215.54]) by mx1.freebsd.org (Postfix) with ESMTP id 8BBBB8FC14 for ; Thu, 19 May 2011 16:56:55 +0000 (UTC) Received: by ewy1 with SMTP id 1so1284501ewy.13 for ; Thu, 19 May 2011 09:56:54 -0700 (PDT) Received: by 10.204.73.206 with SMTP id r14mr1284954bkj.181.1305822609153; Thu, 19 May 2011 09:30:09 -0700 (PDT) MIME-Version: 1.0 Received: by 10.204.38.137 with HTTP; Thu, 19 May 2011 09:29:29 -0700 (PDT) X-Originating-IP: [224.9.88.219] In-Reply-To: References: From: Piotr Kucharski Date: Thu, 19 May 2011 18:29:29 +0200 Message-ID: To: Adam Vande More Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: very slow zfs scrub X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
X-List-Received-Date: Thu, 19 May 2011 16:56:56 -0000

On Thu, Feb 24, 2011 at 21:00, Adam Vande More wrote:
>> Wow! What does scrub do that it slows ggate drive almost to halt?
>>
>> What can I do to fix it?
>
> I think network latency is going to have huge impact on performance here.
> Have you tried any ggate or nic tuning? Would HAST be an option for you?
> I think it has more performance thought put into it.

Well, the network seems rather idle; host and client share the same 1Gb LAN (not sure if the same switch, though) with <0.2ms rtt for 1k packets in ping. When not scrubbing, sequential reads are satisfactory. I'm inclined to think it is some read or write pattern of the scrub that is causing ggate to suck immensely. :/

From owner-freebsd-fs@FreeBSD.ORG Thu May 19 17:08:08 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BFDC61065673 for ; Thu, 19 May 2011 17:08:08 +0000 (UTC) (envelope-from pawel@dawidek.net) Received: from mail.garage.freebsd.pl (60.wheelsystems.com [83.12.187.60]) by mx1.freebsd.org (Postfix) with ESMTP id 645158FC1B for ; Thu, 19 May 2011 17:08:07 +0000 (UTC) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 93FB145C9F; Thu, 19 May 2011 19:08:06 +0200 (CEST) Received: from localhost (89-73-195-149.dynamic.chello.pl [89.73.195.149]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 5363145683; Thu, 19 May 2011 19:08:01 +0200 (CEST) Date: Thu, 19 May 2011 19:07:40 +0200 From: Pawel Jakub Dawidek To: Per von Zweigbergk Message-ID: <20110519170740.GA2100@garage.freebsd.pl> References: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> <4DD37C69.5020005@digsys.bg> <4DD3855E.8020802@itassistans.se> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature";
boundary="yrj/dFKFPuw6o+aM" Content-Disposition: inline In-Reply-To: <4DD3855E.8020802@itassistans.se> X-OS: FreeBSD 9.0-CURRENT amd64 User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-0.6 required=4.5 tests=BAYES_00,RCVD_IN_SORBS_DUL autolearn=no version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: HAST + ZFS self healing? Hot spares? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 17:08:08 -0000 --yrj/dFKFPuw6o+aM Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, May 18, 2011 at 10:37:50AM +0200, Per von Zweigbergk wrote: [...] > This would mean that you'd be running a stack looking like: > - ZFS on top of: > - One HAST resource on top of: > - Two ZVOLs, each on top of: > - ZFS on top of: > - Local storage (mirrored by zfs) Having recursive ZFS pools is bad idea and most likely it was cause deadocks. You also pay all the costs with checksumming, ARC cache, etc. twice. Very bad idea. > >Some reported they used HAST for the SLOG as well. I do not know > >if using HAST for the L2ARC makes any sense. On failure you will > >import the pool on the slave node and this will wipe the L2ARC > >anyway. > Yes, running HAST on L2ARC doesn't make much sense, I'd have to run > HAST on the ZIL though if I opted for Variant 1 (which I don't think > I will). Using HAST for L2ARC devices might make no sense, but they are part of the pool. So if you import the pool on another machine L2ARC device will be failed. You may experiment with importing the pool, removing current L2ARC devices and attaching machine-local L2ARC devices. This way you avoid HAST for L2ARC, but not sure how reliable can that be. 
-- 
Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! http://yomoli.com --yrj/dFKFPuw6o+aM Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (FreeBSD) iEYEARECAAYFAk3VTlwACgkQForvXbEpPzTwtQCg83O//7AdOSAZDbscZT+WTliT YK0An0DKUe1/1hqtY2ZyjqqzJ5kO6ftD =bJn8 -----END PGP SIGNATURE----- --yrj/dFKFPuw6o+aM-- From owner-freebsd-fs@FreeBSD.ORG Thu May 19 17:16:05 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id CA3AB1065673 for ; Thu, 19 May 2011 17:16:05 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id 5C98E8FC1A for ; Thu, 19 May 2011 17:16:05 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:c0e1:7989:b1b9:78c3]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id BC3214AC1C for ; Thu, 19 May 2011 21:16:03 +0400 (MSD) Date: Thu, 19 May 2011 21:15:59 +0400 From: Lev Serebryakov Organization: FreeBSD X-Priority: 3 (Normal) Message-ID: <1409064431.20110519211559@serebryakov.spb.ru> To: freebsd-fs@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=windows-1251 Content-Transfer-Encoding: quoted-printable Subject: Snapshots fail on large FFS2 volumes regularly -- how to backup /usr/home?! X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 17:16:05 -0000

Hello, Freebsd-fs.

I have a /usr/home partition on my new server which is 400GiB (only 17GiB is used). It is UFS2, SoftUpdates are enabled.
I want to back it up on the live system, but 4 times out of 5 I got (after 10-12 minutes of waiting! Oh my, 10 minutes to create a snapshot!):

mksnap_ffs: Cannot create snapshot /usr/home/.snap/dump_snapshot: Resource temporarily unavailable
dump: Cannot create /usr/home/.snap/dump_snapshot: No such file or directory

It is FreeBSD 8.2-STABLE/amd64, 8GiB of memory. I've never encountered such a problem on the previous server, which has about 80GiB (with 20GiB used).

-- 
// Black Lion AKA Lev Serebryakov

From owner-freebsd-fs@FreeBSD.ORG Thu May 19 17:18:02 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 44F5D106564A for ; Thu, 19 May 2011 17:18:02 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id 035728FC1B for ; Thu, 19 May 2011 17:18:02 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:c0e1:7989:b1b9:78c3]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id 4306F4AC1C for ; Thu, 19 May 2011 21:17:59 +0400 (MSD) Date: Thu, 19 May 2011 21:17:55 +0400 From: Lev Serebryakov Organization: FreeBSD X-Priority: 3 (Normal) Message-ID: <1606289061.20110519211755@serebryakov.spb.ru> To: freebsd-fs@freebsd.org In-Reply-To: <1409064431.20110519211559@serebryakov.spb.ru> References: <1409064431.20110519211559@serebryakov.spb.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=windows-1251 Content-Transfer-Encoding: quoted-printable Subject: Re: Snapshots fail on large FFS2 volumes regularly -- how to backup /usr/home?!
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 17:18:02 -0000

Hello, Freebsd-fs.

You wrote on 19 May 2011, 21:15:59:
> I have /usr/home partition on my new server which is 400GiB (only
> 17GiB is used). It is UFS2, SoftUpdates are enabled.
> I want to backup it on live system, but 4 times out of 5 I got
> (after 10-12 minutes of wait! Oh my, 10 minutes to create snapshot!):
And the server is almost unusable for these 10 minutes.

> I've never encounter such problem on previous server, which has
> about 80GiB (with 20GiB used).
It takes about 30 seconds on that FS...

-- 
// Black Lion AKA Lev Serebryakov

From owner-freebsd-fs@FreeBSD.ORG Thu May 19 18:15:06 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 432BB106564A for ; Thu, 19 May 2011 18:15:06 +0000 (UTC) (envelope-from pawel@dawidek.net) Received: from mail.garage.freebsd.pl (60.wheelsystems.com [83.12.187.60]) by mx1.freebsd.org (Postfix) with ESMTP id 87FF08FC19 for ; Thu, 19 May 2011 18:15:04 +0000 (UTC) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 30DF845CAC; Thu, 19 May 2011 20:15:03 +0200 (CEST) Received: from localhost (89-73-195-149.dynamic.chello.pl [89.73.195.149]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 06C9545CDC; Thu, 19 May 2011 20:14:56 +0200 (CEST) Date: Thu, 19 May 2011 20:14:36 +0200 From: Pawel Jakub Dawidek To: Per von Zweigbergk Message-ID: <20110519181436.GB2100@garage.freebsd.pl> References: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature";
boundary="oLBj+sq0vYjzfsbl" Content-Disposition: inline In-Reply-To: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> X-OS: FreeBSD 9.0-CURRENT amd64 User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-0.6 required=4.5 tests=BAYES_00,RCVD_IN_SORBS_DUL autolearn=no version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: HAST + ZFS self healing? Hot spares? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 18:15:06 -0000 --oLBj+sq0vYjzfsbl Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, May 18, 2011 at 08:13:13AM +0200, Per von Zweigbergk wrote: > I've been investigating HAST as a possibility in adding synchronous repli= cation and failover to a set of two NFS servers backed by NFS. The servers = themselves contain quite a few disks. 20 of them (7200 RPM SAS disks), to b= e exact. (If I didn't lose count again...) Plus two quick but small SSD's f= or ZIL and two not-as-quick but larger SSD's for L2ARC. [...] The configuration you should try first is to connect each disks pair using HAST and create ZFS pool on top of those HAST devices. Let's assume you have 4 data disks (da0-da3), 2 SSD disks for ZIL (da4-da5) and 2 SSD disks for L2ARC (da6-da7). 
Then you create the following HAST devices: /dev/hast/data0 =3D MachineA(da0) + MachineB(da0) /dev/hast/data1 =3D MachineA(da1) + MachineB(da1) /dev/hast/data2 =3D MachineA(da2) + MachineB(da2) /dev/hast/data3 =3D MachineA(da3) + MachineB(da3) /dev/hast/slog0 =3D MachineA(da4) + MachineB(da4) /dev/hast/slog1 =3D MachineA(da5) + MachineB(da5) /dev/hast/cache0 =3D MachineA(da6) + MachineB(da6) /dev/hast/cache1 =3D MachineA(da7) + MachineB(da7) And then you create ZFS pool of your choice. Here you specify redundancy, so if there is any you will have ZFS self-healing: zpool create tank raidz1 hast/data{0,1,2,3} log mirror hast/slog{0,1} cache= hast/cache{0,1} > 1. Hardware failure management. In case of a hardware failure, I'm not ex= actly sure what will happen, but I suspect the single-disk RAID-0 array con= taining the failed disk will simply fail. I assume it will still exist, but= refuse to be read or written. In this situation I understand HAST will han= dle this by routing all I/O to the secondary server, in case the disk on th= e primary side dies, or simply by cutting off replication if the disk on th= e secondary server fails. HAST sends all write requests to both nodes (if secondary is present) and read requests only to primary node. In some cases reads can be send to secondary node, for example when synchronization is in progress and secondary has more recent data or reading from local disk failed (either because of single EIO or entire disk went bad). In other words HAST itself can handle one of the mirrored disk failure. If entire hast/ dies for some reason (eg. secondary is down and local disk dies) then ZFS redundancy kicks in. --=20 Pawel Jakub Dawidek http://www.wheelsystems.com FreeBSD committer http://www.FreeBSD.org Am I Evil? Yes, I Am! 
http://yomoli.com --oLBj+sq0vYjzfsbl Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (FreeBSD) iEYEARECAAYFAk3VXgsACgkQForvXbEpPzQU1QCfbfpiBAKH71tOMJMKfUSIwp7Y WjMAn2R6hjssqi1y5oImzrgc0KrzAovY =lZEY -----END PGP SIGNATURE----- --oLBj+sq0vYjzfsbl-- From owner-freebsd-fs@FreeBSD.ORG Thu May 19 22:31:00 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 930E11065674; Thu, 19 May 2011 22:31:00 +0000 (UTC) (envelope-from pvz@itassistans.se) Received: from zcs1.itassistans.net (zcs1.itassistans.net [212.112.191.37]) by mx1.freebsd.org (Postfix) with ESMTP id 410668FC1E; Thu, 19 May 2011 22:30:58 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by zcs1.itassistans.net (Postfix) with ESMTP id 64F5BC01CE; Fri, 20 May 2011 00:30:57 +0200 (CEST) X-Virus-Scanned: amavisd-new at zcs1.itassistans.net Received: from zcs1.itassistans.net ([127.0.0.1]) by localhost (zcs1.itassistans.net [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id cvlyTlR+jlFf; Fri, 20 May 2011 00:30:56 +0200 (CEST) Received: from [10.0.10.11] (unknown [212.112.191.49]) by zcs1.itassistans.net (Postfix) with ESMTPSA id DADEBC0181; Fri, 20 May 2011 00:30:56 +0200 (CEST) Message-ID: <4DD59A1D.7010406@itassistans.se> Date: Fri, 20 May 2011 00:30:53 +0200 From: Per von Zweigbergk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.17) Gecko/20110414 Lightning/1.0b2 Thunderbird/3.1.10 MIME-Version: 1.0 To: Pawel Jakub Dawidek References: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> <4DD37C69.5020005@digsys.bg> <4DD3855E.8020802@itassistans.se> <20110519170740.GA2100@garage.freebsd.pl> In-Reply-To: <20110519170740.GA2100@garage.freebsd.pl> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: HAST + ZFS self healing? Hot spares? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 22:31:00 -0000

On 2011-05-19 19:07, Pawel Jakub Dawidek wrote:
> Having recursive ZFS pools is a bad idea and most likely it was the cause
> of the deadlocks. You also pay all the costs with checksumming, ARC cache,
> etc. twice. Very bad idea.

I've considered this. Checksumming can be disabled in ZFS at the filesystem level (so I guess you could easily disable it for an entire pool). The ARC cannot be disabled at the filesystem or pool level though, as far as I can tell, only for the entire machine, which seems like a bad idea. Just the fact that there would be ARC duplication would be enough to make me seriously reconsider this.

>>> Some reported they used HAST for the SLOG as well. I do not know
>>> if using HAST for the L2ARC makes any sense. On failure you will
>>> import the pool on the slave node and this will wipe the L2ARC
>>> anyway.
>> Yes, running HAST on L2ARC doesn't make much sense, I'd have to run
>> HAST on the ZIL though if I opted for Variant 1 (which I don't think
>> I will).
> Using HAST for L2ARC devices might make no sense, but they are part of
> the pool. So if you import the pool on another machine, the L2ARC device
> will be failed. You may experiment with importing the pool, removing the
> current L2ARC devices and attaching machine-local L2ARC devices. This way
> you avoid HAST for L2ARC, but I'm not sure how reliable that can be.

The KISS way to solve this would be to simply add both of the local L2ARC devices. So no matter on which node you import the pool, you're going to get one L2ARC imported, and the other in a failed status because the node can't find it.
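The swap that Pawel suggests above (import the pool, drop the old node's cache devices, attach machine-local ones) could be scripted roughly as follows. This is a sketch, not from the thread: the pool name `tank` and the cache device names `da6`/`da7` are assumptions from the earlier example, and `RUN` defaults to `echo` so the sequence only prints what it would do (the real commands need root and an actual pool).

```shell
#!/bin/sh
# Sketch: swap in machine-local L2ARC devices after a failover import.
# RUN defaults to echo (dry run); set RUN= to actually execute.
RUN=${RUN:-echo}
POOL=tank

swap_cache() {
    $RUN zpool import -f "$POOL"          # import the pool on the new primary
    $RUN zpool remove "$POOL" da6 da7     # drop the old node's cache devices
    $RUN zpool add "$POOL" cache da6 da7  # attach this node's local SSDs
}

swap_cache
```

Run from an rc or failover script, this avoids putting the L2ARC behind HAST at all, at the cost Per notes next: the pool reports a degraded/faulted cache vdev until the swap runs.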
You'd have to live with the pool status being reported as degraded even though there is no real problem, which would make me inclined to simply script adding the local L2ARC device when the pool is imported (and removing the other cache devices), if that were the avenue I was pursuing.

From owner-freebsd-fs@FreeBSD.ORG Thu May 19 23:03:52 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id AB224106566C; Thu, 19 May 2011 23:03:52 +0000 (UTC) (envelope-from pvz@itassistans.se) Received: from zcs1.itassistans.net (zcs1.itassistans.net [212.112.191.37]) by mx1.freebsd.org (Postfix) with ESMTP id 3B1E78FC14; Thu, 19 May 2011 23:03:52 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by zcs1.itassistans.net (Postfix) with ESMTP id 11C19C01CE; Fri, 20 May 2011 01:03:51 +0200 (CEST) X-Virus-Scanned: amavisd-new at zcs1.itassistans.net Received: from zcs1.itassistans.net ([127.0.0.1]) by localhost (zcs1.itassistans.net [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 2QojWAKE3ypu; Fri, 20 May 2011 01:03:47 +0200 (CEST) Received: from [10.0.10.11] (unknown [212.112.191.49]) by zcs1.itassistans.net (Postfix) with ESMTPSA id 07252C0181; Fri, 20 May 2011 01:03:47 +0200 (CEST) Message-ID: <4DD5A1CF.70807@itassistans.se> Date: Fri, 20 May 2011 01:03:43 +0200 From: Per von Zweigbergk User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.17) Gecko/20110414 Lightning/1.0b2 Thunderbird/3.1.10 MIME-Version: 1.0 To: Pawel Jakub Dawidek References: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> <20110519181436.GB2100@garage.freebsd.pl> In-Reply-To: <20110519181436.GB2100@garage.freebsd.pl> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: HAST + ZFS self healing? Hot spares?
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 23:03:52 -0000 On 2011-05-19 20:14, Pawel Jakub Dawidek wrote: > On Wed, May 18, 2011 at 08:13:13AM +0200, Per von Zweigbergk wrote: >> I've been investigating HAST as a possibility in adding synchronous replication and failover to a set of two NFS servers backed by NFS. The servers themselves contain quite a few disks. 20 of them (7200 RPM SAS disks), to be exact. (If I didn't lose count again...) Plus two quick but small SSD's for ZIL and two not-as-quick but larger SSD's for L2ARC. > [...] > > The configuration you should try first is to connect each disks pair > using HAST and create ZFS pool on top of those HAST devices. > > Let's assume you have 4 data disks (da0-da3), 2 SSD disks for ZIL > (da4-da5) and 2 SSD disks for L2ARC (da6-da7). > > Then you create the following HAST devices: > > /dev/hast/data0 = MachineA(da0) + MachineB(da0) > /dev/hast/data1 = MachineA(da1) + MachineB(da1) > /dev/hast/data2 = MachineA(da2) + MachineB(da2) > /dev/hast/data3 = MachineA(da3) + MachineB(da3) > > /dev/hast/slog0 = MachineA(da4) + MachineB(da4) > /dev/hast/slog1 = MachineA(da5) + MachineB(da5) > > /dev/hast/cache0 = MachineA(da6) + MachineB(da6) > /dev/hast/cache1 = MachineA(da7) + MachineB(da7) > > And then you create ZFS pool of your choice. Here you specify > redundancy, so if there is any you will have ZFS self-healing: > > zpool create tank raidz1 hast/data{0,1,2,3} log mirror hast/slog{0,1} cache hast/cache{0,1} Raidz on top of hast is one possibility, although raidz does add overhead to the equation. I'll have to find out how much. It's also possible to just mirror twice as well, although that would essentially mean that every write would go over the wire twice. 
Raidz might be the better bargain here; that would only increase the number of writes on the wire by a factor of 1/n, where n is the number of data drives, at the cost of CPU to calculate parity. Testing will tell.

>> 1. Hardware failure management. In case of a hardware failure, I'm not exactly sure what will happen, but I suspect the single-disk RAID-0 array containing the failed disk will simply fail. I assume it will still exist, but refuse to be read or written. In this situation I understand HAST will handle this by routing all I/O to the secondary server, in case the disk on the primary side dies, or simply by cutting off replication if the disk on the secondary server fails.
> HAST sends all write requests to both nodes (if the secondary is present)
> and read requests only to the primary node. In some cases reads can be sent
> to the secondary node, for example when synchronization is in progress and
> the secondary has more recent data, or reading from the local disk failed
> (either because of a single EIO or the entire disk went bad).
>
> In other words, HAST itself can handle the failure of one of the mirrored disks.
>
> If an entire hast/ device dies for some reason (e.g. the secondary is down
> and the local disk dies), then ZFS redundancy kicks in.

Very well, that is how failures are handled. But how do we *recover* from a disk failure? Without taking the entire server down, that is.

I already know how to deal with my HBA to hot-add and hot-remove devices. And dealing with hardware failures on the *secondary* node seems fairly straightforward; after all, it doesn't really matter if the mirroring becomes degraded for a few seconds while I futz around with restarting hastd and such. The primary sees the secondary disappear for a few seconds; when it comes back, it will just truck all of the dirty data over. Big deal.

But what if the drive fails on the primary side? On the primary server I can't just restart hastd at my leisure; the underlying filesystem relies on it not going away.
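Per's estimate above (double mirroring doubles the wire traffic; raidz1 only adds 1/n for n data drives) can be sanity-checked with quick arithmetic. A sketch; the disk counts are taken from Pawel's example of raidz1 over four HAST devices (3 data + 1 parity):

```shell
#!/bin/sh
# Rough wire-traffic check: with HAST under every vdev, replication
# traffic is proportional to the raw bytes ZFS writes to the vdevs.

# Mirror of two HAST devices: every block is written twice -> 200%.
mirror_pct=200

# raidz1 over d+1 HAST devices writes (d+1)/d times the data.
raidz_pct() {
    d=$1
    echo $(( (d + 1) * 100 / d ))
}

echo "double mirror:          ${mirror_pct}%"
echo "raidz1, 3 data disks:   $(raidz_pct 3)%"   # an increase of 1/n, n=3
```

So the 4-disk raidz1 ships roughly 133% of the logical write volume over the wire, versus 200% for mirrored pairs of HAST mirrors, before counting parity CPU cost.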
Ideally I'd want to just be able to tell hast that "hey, there's a new drive you can use, just suck over all the data from the secondary onto this drive, and route I/O from the secondary in the meantime" - without restarting hastd. Is this possible?

Of course I could just avoid the problem by failing over the entire server whenever I want to replace hardware on the primary, making it the secondary. But causing a 20 second (just guessing about the actual failover time here) I/O hiccup in my virtualization environment just because I want to swap a hard drive seems unreasonable.

These unresolved questions are why I would feel safer simply running ZFS on the metal and running HAST on Zvols. :-) If running ZFS on top of a Zvol is a bad idea, there is always the option of simply exporting the HAST resource backed by Zvols as an iSCSI target and running VMFS on the drives. But that does mean losing some of the cooler features of ZFS, which is a shame.

From owner-freebsd-fs@FreeBSD.ORG Thu May 19 23:09:51 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 28E2D1065673 for ; Thu, 19 May 2011 23:09:51 +0000 (UTC) (envelope-from pawel@dawidek.net) Received: from mail.garage.freebsd.pl (60.wheelsystems.com [83.12.187.60]) by mx1.freebsd.org (Postfix) with ESMTP id C610E8FC18 for ; Thu, 19 May 2011 23:09:50 +0000 (UTC) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 7F10D45E86; Fri, 20 May 2011 01:09:48 +0200 (CEST) Received: from localhost (89-73-195-149.dynamic.chello.pl [89.73.195.149]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 4B3D145CDC; Fri, 20 May 2011 01:09:43 +0200 (CEST) Date: Fri, 20 May 2011 01:09:21 +0200 From: Pawel Jakub Dawidek To: Per von Zweigbergk Message-ID: <20110519230921.GF2100@garage.freebsd.pl> References:
<85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> <20110519181436.GB2100@garage.freebsd.pl> <4DD5A1CF.70807@itassistans.se> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="OZkY3AIuv2LYvjdk" Content-Disposition: inline In-Reply-To: <4DD5A1CF.70807@itassistans.se> X-OS: FreeBSD 9.0-CURRENT amd64 User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-0.6 required=4.5 tests=BAYES_00,RCVD_IN_SORBS_DUL autolearn=no version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: HAST + ZFS self healing? Hot spares? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 23:09:51 -0000 --OZkY3AIuv2LYvjdk Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Fri, May 20, 2011 at 01:03:43AM +0200, Per von Zweigbergk wrote:
> Very well, that is how failures are handled. But how do we *recover*
> from a disk failure? Without taking the entire server down that is.

HAST opens the local disk only when changing role to primary, or when changing role to secondary and accepting a connection from the primary. If your disk fails, switch to init for that HAST device, replace your disk, call 'hastctl create ' and switch back to primary or secondary.

> I already know how to deal with my HBA to hot-add and hot-remove
> devices. And how to deal with hardware failures on the *secondary*
> node seems fairly straightforward, after all, it doesn't really
> matter if the mirroring becomes degraded for a few seconds while I
> futz around with restarting hastd and such. The primary sees the
> secondary disappear a few seconds, when it comes back, it will just
> truck all of the dirty data over. Big deal.
You don't need to restart hastd or stop the secondary. Just use hastctl to change the role to init for the failing resource.

> But what if the drive fails on the primary side? On the primary
> server I can't just restart hastd at my leisure, the underlying
> filesystem relies on it not going away. Ideally I'd want to just be
> able to tell hast that "hey, there's a new drive you can use, just
> suck over all the data from the secondary onto this drive, and route
> I/O from the secondary in the meantime" - without restarting hastd.
> Is this possible?

Yes.

> These unresolved questions is why I would feel safer in simply
> running ZFS on the metal and running HAST on Zvols. :-) If running
> ZFS on top of a Zvol is a bad idea, there is always the option of
> simply exporting the HAST resource backed by Zvols as an iSCSI
> target and run VMFS on the drives. But that does mean losing some of
> the cooler features of ZFS which is a shame.

I'd suggest testing the configuration that seems best in theory and then seeing whether it works for you or not. If not, then we can wonder what to do next.

--
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!
http://yomoli.com --OZkY3AIuv2LYvjdk Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (FreeBSD) iEYEARECAAYFAk3VoyEACgkQForvXbEpPzRFBQCeJxsKlLR3h7/8+X9nHfVmKpXO EzIAoJWya2Cp6o58JnOENxViv3QRFkPX =tjW/ -----END PGP SIGNATURE----- --OZkY3AIuv2LYvjdk-- From owner-freebsd-fs@FreeBSD.ORG Thu May 19 23:22:59 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id DB83E1065670; Thu, 19 May 2011 23:22:58 +0000 (UTC) (envelope-from fjwcash@gmail.com) Received: from mail-yx0-f182.google.com (mail-yx0-f182.google.com [209.85.213.182]) by mx1.freebsd.org (Postfix) with ESMTP id 6C8CE8FC12; Thu, 19 May 2011 23:22:58 +0000 (UTC) Received: by yxl31 with SMTP id 31so1427409yxl.13 for ; Thu, 19 May 2011 16:22:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=WT7LfSUGZjGBQDHfJxmgmjseHDgVbSiZMZ4g+ma8V9A=; b=TtRMCQb8hEOhu3hSkGj+LoSJxSoKD2oUkZibBE8/AUIltro85SdoJ6g9Tv38mdF8vV Tu6hjLkiityPbfsSdJ5eo1nK/0/d1+aQOl1xH00yTxtevy9ycoNAkuFrU4tG6CjQ0kYf TGgWXotBWdS2oK4Mg6ffOM/59swPdiNFAiZIw= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=D04O2Ce5UiWVQXWXkz0mOk8VyO4uOu4VZfHNLuTdofMxxURYT9emGjUfYWCeVBT6Li I3T43uJzqDAPCXAqTnyYPMh0JJvnLPD5CPOXn6hYejD9UtpYcR9A9185hbuZlQQu7DiP MAzHHglLQqnGgXaAlgPN1p5el/iIqEar//GLM= MIME-Version: 1.0 Received: by 10.90.147.18 with SMTP id u18mr251911agd.95.1305847377816; Thu, 19 May 2011 16:22:57 -0700 (PDT) Received: by 10.90.138.17 with HTTP; Thu, 19 May 2011 16:22:57 -0700 (PDT) In-Reply-To: <20110519230921.GF2100@garage.freebsd.pl> References: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> <20110519181436.GB2100@garage.freebsd.pl> 
<4DD5A1CF.70807@itassistans.se> <20110519230921.GF2100@garage.freebsd.pl> Date: Thu, 19 May 2011 16:22:57 -0700 Message-ID: From: Freddie Cash To: Pawel Jakub Dawidek Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: HAST + ZFS self healing? Hot spares? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 23:22:59 -0000 On Thu, May 19, 2011 at 4:09 PM, Pawel Jakub Dawidek wrote: > On Fri, May 20, 2011 at 01:03:43AM +0200, Per von Zweigbergk wrote: >> Very well, that is how failures are handled. But how do we *recover* >> from a disk failure? Without taking the entire server down that is. > > HAST opens local disk only when changing role to primary or changing > role to secondary and accepting connection from primary. > If your disk fails, switch to init for that HAST device, replace you > disk, call 'hastctl create ' and switch back to primary or > secondary. > >> I already know how to deal with my HBA to hot-add and hot-remove >> devices. And how to deal with hardware failures on the *secondary* >> node seems fairly straightforward, after all, it doesn't really >> matter if the mirroring becomes degraded for a few seconds while I >> futz around with restarting hastd and such. The primary sees the >> secondary disappear a few seconds, when it comes back, it will just >> truck all of the dirty data over. Big deal. > > You don't need to restart hastd or stop secondary. Just use hastctl to > change role to init for the failing resource. This process works exceedingly well. Just went through it a week or so ago. 
You just need to think in layers, the way GEOM works, comparing the non-HAST setup with the HAST setup.

The non-HAST process for replacing a disk in a ZFS pool is:
- zpool offline poolname diskname
- remove dead disk
- insert new disk
- partition, label, etc as needed
- zpool replace poolname olddisk newdisk
- wait for resilver to complete

With HAST, there's only a couple of small changes needed:
- zpool offline poolname diskname        <-- removes the /dev/hast node from the pool
- hastctl role init diskname             <-- removes the /dev/hast node
- remove dead disk
- insert new disk
- partition, label, etc as needed
- hastctl create diskname                <-- creates the hast resource
- hastctl role primary diskname          <-- creates the new /dev/hast node
- zpool replace poolname olddisk newdisk <-- adds the /dev/hast node to pool
- wait for resilver to complete

The downside to this setup is that the data on the disk in the secondary node is lost, as the resilver of the disk on the primary node recreates all the data on the secondary node. But, at least then you know the data is good on both disks in the HAST resource.
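The HAST steps above can be collected into a single runbook script. A sketch only: the pool name `tank` and resource name `data0` are examples, `RUN` defaults to `echo` so it just prints the sequence (clear it to execute for real, as root), and it uses `hastctl create` for re-initializing the resource metadata, per Pawel's earlier note.

```shell
#!/bin/sh
# Dry-run sketch of the HAST disk-replacement sequence.
# RUN defaults to echo; set RUN= to actually run the commands.
RUN=${RUN:-echo}
POOL=tank
RES=data0                                  # HAST resource backing the failed disk

replace_disk() {
    $RUN zpool offline "$POOL" "hast/$RES" # drop the /dev/hast node from the pool
    $RUN hastctl role init "$RES"          # tears down /dev/hast/$RES
    # ...physically swap the disk, partition/label as needed, here...
    $RUN hastctl create "$RES"             # re-initialize HAST metadata on the new disk
    $RUN hastctl role primary "$RES"       # /dev/hast/$RES reappears
    $RUN zpool replace "$POOL" "hast/$RES" # resilver onto the new device
}

replace_disk
```

As Pawel points out in his reply, the final `zpool replace` may be avoidable: HAST can resynchronize the new primary disk from the secondary, after which `zpool online $POOL hast/$RES` should suffice.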
-- Freddie Cash fjwcash@gmail.com From owner-freebsd-fs@FreeBSD.ORG Thu May 19 23:26:19 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 966481065674 for ; Thu, 19 May 2011 23:26:19 +0000 (UTC) (envelope-from pawel@dawidek.net) Received: from mail.garage.freebsd.pl (60.wheelsystems.com [83.12.187.60]) by mx1.freebsd.org (Postfix) with ESMTP id 3F2478FC0C for ; Thu, 19 May 2011 23:26:19 +0000 (UTC) Received: by mail.garage.freebsd.pl (Postfix, from userid 65534) id 0B9C045685; Fri, 20 May 2011 01:26:18 +0200 (CEST) Received: from localhost (89-73-195-149.dynamic.chello.pl [89.73.195.149]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by mail.garage.freebsd.pl (Postfix) with ESMTP id 0255A45EA4; Fri, 20 May 2011 01:26:12 +0200 (CEST) Date: Fri, 20 May 2011 01:25:51 +0200 From: Pawel Jakub Dawidek To: Freddie Cash Message-ID: <20110519232551.GG2100@garage.freebsd.pl> References: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> <20110519181436.GB2100@garage.freebsd.pl> <4DD5A1CF.70807@itassistans.se> <20110519230921.GF2100@garage.freebsd.pl> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="yRA+Bmk8aPhU85Qt" Content-Disposition: inline In-Reply-To: X-OS: FreeBSD 9.0-CURRENT amd64 User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Checker-Version: SpamAssassin 3.0.4 (2005-06-05) on mail.garage.freebsd.pl X-Spam-Level: X-Spam-Status: No, score=-0.6 required=4.5 tests=BAYES_00,RCVD_IN_SORBS_DUL autolearn=no version=3.0.4 Cc: freebsd-fs@freebsd.org Subject: Re: HAST + ZFS self healing? Hot spares? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 23:26:19 -0000 --yRA+Bmk8aPhU85Qt Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

On Thu, May 19, 2011 at 04:22:57PM -0700, Freddie Cash wrote:
> With HAST, there's only a couple of small changes needed:
> - zpool offline poolname diskname        <-- removes the /dev/hast node from the pool
> - hastctl role init diskname             <-- removes the /dev/hast node
> - remove dead disk
> - insert new disk
> - partition, label, etc as needed
> - hastctl create diskname                <-- creates the hast resource
> - hastctl role primary diskname          <-- creates the new /dev/hast node
> - zpool replace poolname olddisk newdisk <-- adds the /dev/hast node to pool
> - wait for resilver to complete
>
> The downside to this setup is that the data on the disk in the secondary
> node is lost, as the resilver of the disk on the primary node recreates all
> the data on the secondary node. But, at least then you know the data is
> good on both disks in the HAST resource.

It shouldn't be the case. The primary HAST node should synchronize data from the secondary HAST node, as the primary has the new disk. This should allow you to simply 'zpool online poolname disk' instead of replacing it.

It doesn't work that way for you?

--
Pawel Jakub Dawidek                       http://www.wheelsystems.com
FreeBSD committer                         http://www.FreeBSD.org
Am I Evil? Yes, I Am!
http://yomoli.com --yRA+Bmk8aPhU85Qt Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (FreeBSD) iEYEARECAAYFAk3Vpv8ACgkQForvXbEpPzRzYgCg0c70YunwrcHfbE9BGx7QvDAz pl8AnRZlWsk6AINDg6wREmHSWwyd/jNm =XkmN -----END PGP SIGNATURE----- --yRA+Bmk8aPhU85Qt-- From owner-freebsd-fs@FreeBSD.ORG Thu May 19 23:27:36 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5290C106566B; Thu, 19 May 2011 23:27:36 +0000 (UTC) (envelope-from pvz@itassistans.se) Received: from zcs1.itassistans.net (zcs1.itassistans.net [212.112.191.37]) by mx1.freebsd.org (Postfix) with ESMTP id 0123C8FC19; Thu, 19 May 2011 23:27:35 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by zcs1.itassistans.net (Postfix) with ESMTP id C6B23C01CE; Fri, 20 May 2011 01:27:34 +0200 (CEST) X-Virus-Scanned: amavisd-new at zcs1.itassistans.net Received: from zcs1.itassistans.net ([127.0.0.1]) by localhost (zcs1.itassistans.net [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id D7ZCZRyCgKYj; Fri, 20 May 2011 01:27:34 +0200 (CEST) Received: from [192.168.1.239] (c213-89-160-61.bredband.comhem.se [213.89.160.61]) by zcs1.itassistans.net (Postfix) with ESMTPSA id 306CCC0181; Fri, 20 May 2011 01:27:34 +0200 (CEST) Mime-Version: 1.0 (Apple Message framework v1084) From: Per von Zweigbergk In-Reply-To: Date: Fri, 20 May 2011 01:27:32 +0200 Message-Id: <5B27EAAB-5D23-4844-B7C7-F83289BCABE7@itassistans.se> References: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> <20110519181436.GB2100@garage.freebsd.pl> <4DD5A1CF.70807@itassistans.se> <20110519230921.GF2100@garage.freebsd.pl> To: Freddie Cash X-Mailer: Apple Mail (2.1084) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org, Pawel Jakub Dawidek Subject: Re: HAST + ZFS self healing? 
Hot spares? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 23:27:36 -0000

On 20 May 2011, at 01:22, Freddie Cash wrote:
> With HAST, there's only a couple of small changes needed:
> - zpool offline poolname diskname        <-- removes the /dev/hast node from the pool

What you're describing here is not what I asked about, which was activating a hot spare drive without bringing down the HAST resource.

You're describing taking the entire array offline while you perform work on it. Which is fine in a lot of cases, but not exactly what I'd call HA. :-)

From owner-freebsd-fs@FreeBSD.ORG Thu May 19 23:28:07 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BF1E31065673; Thu, 19 May 2011 23:28:07 +0000 (UTC) (envelope-from fjwcash@gmail.com) Received: from mail-yi0-f54.google.com (mail-yi0-f54.google.com [209.85.218.54]) by mx1.freebsd.org (Postfix) with ESMTP id 1F6748FC24; Thu, 19 May 2011 23:28:07 +0000 (UTC) Received: by yie12 with SMTP id 12so1426295yie.13 for ; Thu, 19 May 2011 16:28:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=ym8u8IGtfSN/rRKs6yU+hYsBIJeYFC6VGW1jWFWxQu0=; b=uKWjgXfFLCWGiek7IZHTk9TYnrr0j5vbLWGJrn45yAzMqkuFkuA45q1euPaQzJKdwK CyuMHtX+pV8Ly7G6mDHR1a54sTH6HqoKOT01qwGSrWG7CbUObxJ7nnGQK6OLEzqia3IF wafelRF/sylFZtgTEvQKy3sUTynJmuCaDR1KQ= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=Ox9r4SnS2COjOVXzPfC8EyqmNFqYaE5qZlYCT+GxuuKe33U0dntH0GP0+0NhzkyoCw CMapFNcm708yq3aSBHN93rwthMMLQ1eBmd52NItcrYhhvUijY/dK4z4lNRDqkH9903t/
Yss1tOzoHgda7OFiq3vnG/AB8vaQMqzDJAZdI= MIME-Version: 1.0 Received: by 10.90.147.18 with SMTP id u18mr257479agd.95.1305847686607; Thu, 19 May 2011 16:28:06 -0700 (PDT) Received: by 10.90.138.17 with HTTP; Thu, 19 May 2011 16:28:06 -0700 (PDT) In-Reply-To: <20110519232551.GG2100@garage.freebsd.pl> References: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> <20110519181436.GB2100@garage.freebsd.pl> <4DD5A1CF.70807@itassistans.se> <20110519230921.GF2100@garage.freebsd.pl> <20110519232551.GG2100@garage.freebsd.pl> Date: Thu, 19 May 2011 16:28:06 -0700 Message-ID: From: Freddie Cash To: Pawel Jakub Dawidek Content-Type: text/plain; charset=UTF-8 X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org Subject: Re: HAST + ZFS self healing? Hot spares? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 23:28:07 -0000 On Thu, May 19, 2011 at 4:25 PM, Pawel Jakub Dawidek wrote: > On Thu, May 19, 2011 at 04:22:57PM -0700, Freddie Cash wrote: > > With HAST, there's only a couple of small changes needed: > > - zpool offline poolname diskname <-- removes the /dev/hast node > > from the pool > > - hastctl role init diskname <-- removes the /dev/hast node > > - remove dead disk > > - insert new disk > > - partition, label, etc as needed > > - hastctl role create diskname <-- creates the hast resource > > - hastctl role primary diskname <-- creates the new /dev/hast > node > > - zpool replace poolname olddisk newdisk <-- adds the /dev/hast node to > > pool > > - wait for resilver to complete > > > > The downside to this setup is that the data on the disk in the secondary > > node is lost, as the resilver of the disk on the primary node recreates > all > > the data on the secondary node. But, at least then you know the data is > > good on both disks in the HAST resource. > > It shouldn't be the case. 
Primary HAST node should synchronize data from > secondary HAST node, as primary has new disk. This should allow you to > simply 'zpool online poolname disk' instead of replacing it. > It doesn't work that way for you? > Oh? Never thought to try that. But, I guess that does make sense ... and is the point of having the redundant data in the other server ... Also, in my tests, I was running a degraded HAST setup (only 1 server), so it wouldn't have been possible to do. Will have to remember that for the next time I'm playing with HAST (the box is currently a non-HAST setup). -- Freddie Cash fjwcash@gmail.com From owner-freebsd-fs@FreeBSD.ORG Thu May 19 23:31:49 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2AFC8106564A; Thu, 19 May 2011 23:31:49 +0000 (UTC) (envelope-from pvz@itassistans.se) Received: from zcs1.itassistans.net (zcs1.itassistans.net [212.112.191.37]) by mx1.freebsd.org (Postfix) with ESMTP id D20768FC1C; Thu, 19 May 2011 23:31:48 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by zcs1.itassistans.net (Postfix) with ESMTP id 82310C01CE; Fri, 20 May 2011 01:31:47 +0200 (CEST) X-Virus-Scanned: amavisd-new at zcs1.itassistans.net Received: from zcs1.itassistans.net ([127.0.0.1]) by localhost (zcs1.itassistans.net [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id upmAsX0955-T; Fri, 20 May 2011 01:31:47 +0200 (CEST) Received: from [192.168.1.239] (c213-89-160-61.bredband.comhem.se [213.89.160.61]) by zcs1.itassistans.net (Postfix) with ESMTPSA id 07771C0181; Fri, 20 May 2011 01:31:47 +0200 (CEST) Mime-Version: 1.0 (Apple Message framework v1084) Content-Type: text/plain; charset=us-ascii From: Per von Zweigbergk In-Reply-To: <20110519230921.GF2100@garage.freebsd.pl> Date: Fri, 20 May 2011 01:31:46 +0200 Content-Transfer-Encoding: 7bit Message-Id: References: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> 
<20110519181436.GB2100@garage.freebsd.pl> <4DD5A1CF.70807@itassistans.se> <20110519230921.GF2100@garage.freebsd.pl> To: Pawel Jakub Dawidek X-Mailer: Apple Mail (2.1084) Cc: freebsd-fs@freebsd.org Subject: Re: HAST + ZFS self healing? Hot spares? X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 19 May 2011 23:31:49 -0000

On 20 May 2011, at 01:09, Pawel Jakub Dawidek wrote:
> On Fri, May 20, 2011 at 01:03:43AM +0200, Per von Zweigbergk wrote:
>> Very well, that is how failures are handled. But how do we *recover*
>> from a disk failure? Without taking the entire server down that is.
>
> HAST opens local disk only when changing role to primary or changing
> role to secondary and accepting connection from primary.
> If your disk fails, switch to init for that HAST device, replace you
> disk, call 'hastctl create ' and switch back to primary or
> secondary.

If I were to do 'hastctl role init foo' to switch from primary->init, /dev/hast/foo would go away, and this would degrade whatever file system or volume manager you're running on top of HAST. (I just tried this in my HAST lab environment.) The scenario I was describing was a primary disk failure: I want to keep being able to access /dev/hast/foo while I replace the primary disk.

I still don't see how it's possible to hot-replace a failed drive in the server that's primary at the time; there just doesn't seem to be any way of bringing in a new disk on the primary side without bringing down the HAST resource.
From owner-freebsd-fs@FreeBSD.ORG Fri May 20 00:11:12 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id A94D11065673; Fri, 20 May 2011 00:11:12 +0000 (UTC) (envelope-from pvz@itassistans.se) Received: from zcs1.itassistans.net (zcs1.itassistans.net [212.112.191.37]) by mx1.freebsd.org (Postfix) with ESMTP id 3535A8FC1A; Fri, 20 May 2011 00:11:12 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by zcs1.itassistans.net (Postfix) with ESMTP id 2E5A2C01CE; Fri, 20 May 2011 02:11:11 +0200 (CEST) X-Virus-Scanned: amavisd-new at zcs1.itassistans.net Received: from zcs1.itassistans.net ([127.0.0.1]) by localhost (zcs1.itassistans.net [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id j359LKl7CxRR; Fri, 20 May 2011 02:11:07 +0200 (CEST) Received: from [192.168.1.239] (c213-89-160-61.bredband.comhem.se [213.89.160.61]) by zcs1.itassistans.net (Postfix) with ESMTPSA id 445E6C0181; Fri, 20 May 2011 02:11:07 +0200 (CEST) Mime-Version: 1.0 (Apple Message framework v1084) From: Per von Zweigbergk In-Reply-To: <5B27EAAB-5D23-4844-B7C7-F83289BCABE7@itassistans.se> Date: Fri, 20 May 2011 02:11:06 +0200 Message-Id: <61D2B7A3-1778-4A42-8983-8C325D2F849E@itassistans.se> References: <85EC77D3-116E-43B0-BFF1-AE1BD71B5CE9@itassistans.se> <20110519181436.GB2100@garage.freebsd.pl> <4DD5A1CF.70807@itassistans.se> <20110519230921.GF2100@garage.freebsd.pl> <5B27EAAB-5D23-4844-B7C7-F83289BCABE7@itassistans.se> To: Per von Zweigbergk X-Mailer: Apple Mail (2.1084) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: freebsd-fs@freebsd.org, Pawel Jakub Dawidek Subject: Re: HAST + ZFS self healing? Hot spares? 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2011 00:11:12 -0000

On 20 May 2011, at 01.27, Per von Zweigbergk wrote:
> You're describing taking the entire array offline while you perform work on it.

My apologies, I was a bit too quick reading what you (Freddie Cash) wrote.

What you're describing is relying on ZFS's own redundancy while you replace the failed disk, bringing down the entire HAST resource just so you can replace one of the two failed disks. The only reason the ZFS array continues to function is because it's redundant in ZFS itself.

Ideally, the HAST resource could continue to remain operational while the failed disk was replaced. After all, it can remain operational while the primary disk has failed, and it can remain operational while the data is being resynchronized, so why would the resource need to be brought down just to transition between these two states? I guess it's because HAST isn't quite "finished" yet feature-wise, and that particular feature does not yet exist.

Still, I suppose this is good enough; it shows that raidz:ing together a bunch of HAST mirrors solves one and a half of my operational problems - replacing failed drives (by momentarily downing the whole HAST resource while work is being done) and providing checksumming capability (although not self-healing).

The setup described (a bunch of HAST mirrors in a raidz) will not self-heal entirely. Imagine if a bit error occurred while writing to one of the secondary disks. Since that data is never read by ZFS or HAST, the error would remain undetected.
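A bit error sitting unread on the secondary could, in principle, be caught by scrubbing from that side after a switchover. A rough, untested sketch, with "tank" and "foo" as placeholder pool and resource names:

```
# On the current primary: verify the locally readable copy first.
zpool scrub tank
# (wait for 'zpool status tank' to report the scrub as finished)
# Then fail over so the other node's disks can be read and checked:
zpool export tank
hastctl role secondary foo
# On the other node: hastctl role primary foo; zpool import tank;
# zpool scrub tank -- now the former secondary's blocks are verified too.
```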
To ensure data integrity on both the primary and secondary servers, you'd have to fail over the servers once every N days/weeks/months (depending on your operational requirements) and perform a zfs scrub on "both sides" of the HAST resource, as part of regular maintenance. It'd probably even be scriptable, assuming you can live with a few seconds of scheduled downtime during the switchover.

From owner-freebsd-fs@FreeBSD.ORG Fri May 20 03:17:23 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 41877106564A for ; Fri, 20 May 2011 03:17:23 +0000 (UTC) (envelope-from mckusick@mckusick.com) Received: from chez.mckusick.com (chez.mckusick.com [64.81.247.49]) by mx1.freebsd.org (Postfix) with ESMTP id 1C3778FC16 for ; Fri, 20 May 2011 03:17:23 +0000 (UTC) Received: from chez.mckusick.com (localhost [127.0.0.1]) by chez.mckusick.com (8.14.3/8.14.3) with ESMTP id p4K3G6EU039569; Thu, 19 May 2011 20:16:06 -0700 (PDT) (envelope-from mckusick@chez.mckusick.com) Message-Id: <201105200316.p4K3G6EU039569@chez.mckusick.com> To: lev@freebsd.org In-reply-to: <1606289061.20110519211755@serebryakov.spb.ru> Date: Thu, 19 May 2011 20:16:06 -0700 From: Kirk McKusick X-Spam-Status: No, score=1.3 required=5.0 tests=MISSING_MID,PLING_QUERY, UNPARSEABLE_RELAY autolearn=no version=3.2.5 X-Spam-Level: * X-Spam-Checker-Version: SpamAssassin 3.2.5 (2008-06-10) on chez.mckusick.com Cc: freebsd-fs@freebsd.org Subject: Re: Snapshots fail on large FFS2 volumes regulary -- how to backup /usr/home?! X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2011 03:17:23 -0000

> Date: Thu, 19 May 2011 21:17:55 +0400
> From: Lev Serebryakov
> To: freebsd-fs@freebsd.org
>
> Hello, Freebsd-fs.
>
> I have a /usr/home partition on my new server which is 400GiB (only
> 17GiB is used). It is UFS2, with SoftUpdates enabled.
>
> I want to back it up on the live system, but 4 times out of 5 I get
> (after 10-12 minutes of waiting! Oh my, 10 minutes to create a snapshot!):
>
> mksnap_ffs: Cannot create snapshot /usr/home/.snap/dump_snapshot: Resource temporarily unavailable
> dump: Cannot create /usr/home/.snap/dump_snapshot: No such file or directory
>
> It is FreeBSD 8.2-STABLE/amd64, 8GiB of memory.
>
> I've never encountered such a problem on the previous server, which has
> about 80GiB (with 20GiB used).
>
> --
> // Black Lion AKA Lev Serebryakov
>
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

Given the size of your storage, you should consider using ZFS, which is better able to handle such large filesystems.

My second suggestion is that you try building UFS2 with 32K blocks and 4K fragments. That will reduce the number of resources needed to take the snapshot.
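Kirk's second suggestion corresponds to re-creating the filesystem with newfs along these lines. The device name is a placeholder, and newfs destroys the existing data, so this is only a sketch of the invocation; -U enables soft updates, which Lev already has on.

```
# 32K blocks and 4K fragments instead of the 16K/2K defaults
newfs -U -b 32768 -f 4096 /dev/ada0p7
```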
Kirk McKusick From owner-freebsd-fs@FreeBSD.ORG Fri May 20 05:54:38 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id BEBD11065672 for ; Fri, 20 May 2011 05:54:38 +0000 (UTC) (envelope-from bf1783@googlemail.com) Received: from mail-vx0-f182.google.com (mail-vx0-f182.google.com [209.85.220.182]) by mx1.freebsd.org (Postfix) with ESMTP id 78CD08FC12 for ; Fri, 20 May 2011 05:54:38 +0000 (UTC) Received: by vxc34 with SMTP id 34so3378580vxc.13 for ; Thu, 19 May 2011 22:54:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=googlemail.com; s=gamma; h=domainkey-signature:mime-version:reply-to:date:message-id:subject :from:to:cc:content-type; bh=IRvRePyx14hJhHOFdTqcHERjoeNNtdFkDt5yPxbErDk=; b=RAe6kGuDu4YvhlKIFOHb74blzXHvHoZEI3neQFHum3RnGMAjQog4Zkdc+ny+0Tm71x 1lMq9A6zHy6tXT2ED/dnNfnIFVHesyG+h07OL2UT6o6LEVcfoNA36SQlAaDZLLPIcswP 8+YLXjd0KnOENsqn+xqjvv+jUN0jLE1OdCMM4= DomainKey-Signature: a=rsa-sha1; c=nofws; d=googlemail.com; s=gamma; h=mime-version:reply-to:date:message-id:subject:from:to:cc :content-type; b=nN+XvlGtbw4Gfzs06Qo/j96zTL3DwB7h0n9h8yKMWPl2AIaYRbr6W7+Uwy4ZlwJYE0 M+QGFFKqxWm+9LnNbdN9xT8rqcto/bOruVcDZVcH+zoSXHw5tDl9G6I9XQg5cLOyMF7+ 8dMHRJh1Xqal9hQUgI1ygp5/4ML1rE0G5K5Lw= MIME-Version: 1.0 Received: by 10.52.97.7 with SMTP id dw7mr5762671vdb.109.1305869527287; Thu, 19 May 2011 22:32:07 -0700 (PDT) Received: by 10.52.110.231 with HTTP; Thu, 19 May 2011 22:32:07 -0700 (PDT) Date: Fri, 20 May 2011 01:32:07 -0400 Message-ID: From: "b. f." 
To: freebsd-questions@FreeBSD.org, freebsd-fs@FreeBSD.org Content-Type: text/plain; charset=ISO-8859-1 Cc: grarpamp Subject: Re: UDF and DVD's X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: bf1783@gmail.com List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2011 05:54:38 -0000 grarpamp wrote: ... > I'm guessing the current state within FreeBSD means that I can > neither read, nor create, or write, readable (compatible) images > at this, or any given, UDF level? ... > > Is this a blocker for FreeBSD? > > For me, at least, minimally, that seems to be the case... as I now > have no way to rip, mount and add the files to this DVD that I would > like to add. Except to use Windows, which I consider to be unreliable > at best. Obviously, the base system UDF support is minimal and needs some work. But you may find that ports like sysutils/cdrtools[-devel] or sysutils/udfclient will allow you to do much of what you want to do. b. 
From owner-freebsd-fs@FreeBSD.ORG Fri May 20 08:29:39 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 64C3C106566C for ; Fri, 20 May 2011 08:29:39 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id 020378FC1A for ; Fri, 20 May 2011 08:29:39 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:c0e1:7989:b1b9:78c3]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id 25E924AC1C; Fri, 20 May 2011 12:29:37 +0400 (MSD) Date: Fri, 20 May 2011 12:29:33 +0400 From: Lev Serebryakov Organization: FreeBSD Project X-Priority: 3 (Normal) Message-ID: <795474996.20110520122933@serebryakov.spb.ru> To: Kirk McKusick In-Reply-To: <201105200316.p4K3G6EU039569@chez.mckusick.com> References: <1606289061.20110519211755@serebryakov.spb.ru> <201105200316.p4K3G6EU039569@chez.mckusick.com> MIME-Version: 1.0 Content-Type: text/plain; charset=windows-1251 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@freebsd.org Subject: Re: Snapshots fail on large FFS2 volumes regulary -- how to backup /usr/home?! X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2011 08:29:39 -0000

Hello, Kirk.
You wrote 20 May 2011, 7:16:06:
> Given the size of your storage, you should consider using ZFS,
> which is better able to handle such large filesystems.

Yes, I know that everybody loves ZFS now, but it doesn't have two characteristics which are important for my installation:

(1) The nodump flag, or any other way to mark directories and files as not important for backup. "zfs send" is an all-or-nothing solution, and right now my users use "nodump" to reduce backup sizes greatly.

(2) Incremental backups with little local state (zfs send can send the difference between snapshots, but the system needs to keep the old snapshot around for this).

The second one is not so important yet, because there is a lot of free space, but "zfs send" can not do anything about (1) :( All other backup solutions don't store full FS information, as they work on the file level, not the FS level :(

> My second suggestion is that you try building UFS2 with 32K
> blocks and 4K fragments. That will reduce the number of resources
> needed to take the snapshot.

I'll try this. But I remember that some time ago (around 7.1-STABLE) there was a deadlock in the kernel memory allocator when different UFS filesystems on the system used different block sizes...
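For reference, the incremental-send mechanics behind objection (2) look roughly like this; the dataset and host names are invented examples, not anything from this thread:

```
zfs snapshot tank/home@monday
zfs send tank/home@monday | ssh backuphost zfs receive backup/home
# ...later: the incremental send needs @monday to still exist locally,
# which is the "store the old snapshot" cost mentioned above.
zfs snapshot tank/home@tuesday
zfs send -i tank/home@monday tank/home@tuesday | ssh backuphost zfs receive backup/home
```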
--=20 // Black Lion AKA Lev Serebryakov From owner-freebsd-fs@FreeBSD.ORG Fri May 20 08:42:54 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3C84A106564A; Fri, 20 May 2011 08:42:54 +0000 (UTC) (envelope-from grarpamp@gmail.com) Received: from mail-pw0-f54.google.com (mail-pw0-f54.google.com [209.85.160.54]) by mx1.freebsd.org (Postfix) with ESMTP id 0CD598FC16; Fri, 20 May 2011 08:42:53 +0000 (UTC) Received: by pwj8 with SMTP id 8so2076366pwj.13 for ; Fri, 20 May 2011 01:42:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=3COreA6r/gy3USqKbuADG1H/LNZh2ZVZkMbijgM34xs=; b=TofCb5dlSkRqQMxsHuYbSiesC60WzYyPrV5XvCRCY8gfCCp1hLu4XS/WcKnJA1tM4X IBgLepZdp8/dwNjIOcChs4N4anCU3015t99cWmsd2C0OHeWq22usXVu9W0yTT6S97teG +BV11DfuPezATJqfY2ERhlXJuz1i01blme6UU= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=IhAVzTnh13UQmoDotpBw05RrYazhN3iB94n4MN9aU7Y0uZSytdZD21L0pcqxGuv/wD kjbGR6Hepu6gNW+P6K5dRu6eC21ZQMUT0O+kEYzZgu3HzyGSVLCRZSdQNzR79zw48jyh y5GPLdZrJS4rUcpHobrEj5cDLIjRKA7g2q02Q= MIME-Version: 1.0 Received: by 10.142.230.6 with SMTP id c6mr2585560wfh.415.1305880973697; Fri, 20 May 2011 01:42:53 -0700 (PDT) Received: by 10.142.157.2 with HTTP; Fri, 20 May 2011 01:42:53 -0700 (PDT) In-Reply-To: <20110519091433.GA94053@icarus.home.lan> References: <20110519091433.GA94053@icarus.home.lan> Date: Fri, 20 May 2011 04:42:53 -0400 Message-ID: From: grarpamp To: Jeremy Chadwick Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs@freebsd.org, freebsd-questions@freebsd.org Subject: Re: UDF and DVD's X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , 
List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2011 08:42:54 -0000

> Thoughts: please provide commands, full output, etc. that show how
> you're trying to mount the disc, as well as relevant /dev entries
> pertaining to your DVD drive. dmesg might also be helpful. And I
> assume you have looked at mount_udf(8)?

Apologies, it is late. However I used only the obvious. Hopefully obviously, my DVD drive is irrelevant in this case...

mdconfig -f -o readonly
mount_cd9660 -v -o ro /mnt
ls -alR /mnt
[*not* 2.5GiB of files, but...]
cat /mnt/readme.txt
This disc contains a "UDF" file system and requires an operating system that supports the ISO-13346 "UDF" file system specification.
umount -v /mnt
mount_udf -v -o ro /mnt
mount_udf: /dev/md[n]: Invalid argument
[md dev is not mounted on /mnt]

I think it's related to the UDF version of the image, as anyone can verify using the images I mentioned, which can be found on the internet. Perhaps it begs for a NetBSD port? I tested with: RELENG_8 i386.

BTW, mdconfig is also broken in that it should take arguments regardless of position, but it does not. IE: try transposing -d and -u, or -o = failure to execute.
From owner-freebsd-fs@FreeBSD.ORG Fri May 20 08:47:18 2011 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C3B081065670; Fri, 20 May 2011 08:47:18 +0000 (UTC) (envelope-from grarpamp@gmail.com) Received: from mail-px0-f176.google.com (mail-px0-f176.google.com [209.85.212.176]) by mx1.freebsd.org (Postfix) with ESMTP id 94B118FC0A; Fri, 20 May 2011 08:47:18 +0000 (UTC) Received: by pxi11 with SMTP id 11so2855633pxi.7 for ; Fri, 20 May 2011 01:47:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:in-reply-to:references:date :message-id:subject:from:to:cc:content-type; bh=blirRTQCZDD6c1UGFgvOxQW4gvPoZVsdDCV4Gsm7AgU=; b=tWUFD3NBDGpWpf0hCcX75uN2DFFBOP5kv1rc1xIPhDX3sLcUM4JNQq8Ssghkm+zYcj 5oR8qH9GVrjTlBu3k5YaCmOPdBa3o1fHxIxJosEyql3gIiKXLa9KVrkEntum8y3AHSEe LvAkHP1hsRzv7W3pNAvzxJcJYy7W69Xo+7j1E= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; b=oXER/+4mOQC/cqQe85TJ7XrgW2hH9k44xDAi9elKhF8EXRKqZRyO/b+l+GSR1MIKxz gmB5kLQtdO1bQyjjjwqlTpWm4lbefEE0hc9prT+cDVLNiE3rJCaHzBAw6hqqBU8zn2Fm 1f0yxWrG/0z31OL9fxc0xKYw4Ub1ND649qxM4= MIME-Version: 1.0 Received: by 10.142.249.34 with SMTP id w34mr2435874wfh.301.1305881237993; Fri, 20 May 2011 01:47:17 -0700 (PDT) Received: by 10.142.157.2 with HTTP; Fri, 20 May 2011 01:47:17 -0700 (PDT) In-Reply-To: References: Date: Fri, 20 May 2011 04:47:17 -0400 Message-ID: From: grarpamp To: bf1783@gmail.com Content-Type: text/plain; charset=UTF-8 Cc: freebsd-fs@freebsd.org, freebsd-questions@freebsd.org Subject: Re: UDF and DVD's X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2011 08:47:18 -0000 > 
Obviously, the base system UDF support is minimal and needs some work. > But you may find that ports like sysutils/cdrtools[-devel] or > sysutils/udfclient will allow you to do much of what you want to do. Hmm. perhaps I may be able to create and burn [both modes occurring in userland] with cdrtools. But certainly not to read or write in kernel mode yet AFAICT. I'll investigate udfclient, that is new to me as a userland tool. I was hoping for kernel level compatibility. As are, I suspect, we all :) From owner-freebsd-fs@FreeBSD.ORG Fri May 20 13:09:38 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 26C1C106566C; Fri, 20 May 2011 13:09:38 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 3AE9A8FC20; Fri, 20 May 2011 13:09:36 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id QAA00712; Fri, 20 May 2011 16:09:34 +0300 (EEST) (envelope-from avg@FreeBSD.org) Message-ID: <4DD6680E.9040006@FreeBSD.org> Date: Fri, 20 May 2011 16:09:34 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.17) Gecko/20110504 Lightning/1.0b2 Thunderbird/3.1.10 MIME-Version: 1.0 To: lev@FreeBSD.org References: <1606289061.20110519211755@serebryakov.spb.ru> <201105200316.p4K3G6EU039569@chez.mckusick.com> <795474996.20110520122933@serebryakov.spb.ru> In-Reply-To: <795474996.20110520122933@serebryakov.spb.ru> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=windows-1251 Content-Transfer-Encoding: 8bit Cc: Kirk McKusick , freebsd-fs@FreeBSD.org Subject: Re: Snapshots fail on large FFS2 volumes regulary -- how to backup /usr/home?! 
X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2011 13:09:38 -0000

on 20/05/2011 11:29 Lev Serebryakov said the following:
> Hello, Kirk.
> You wrote 20 May 2011, 7:16:06:
>
>> Given the size of your storage, you should consider using ZFS,
>> which is better able to handle such large filesystems.
> Yes, I know that everybody loves ZFS now, but it doesn't have two
> characteristics which are important for my installation:
>
> (1) The nodump flag, or any other way to mark directories and files as
> not important for backup. "zfs send" is an all-or-nothing solution, and
> right now my users use "nodump" to reduce backup sizes greatly.

Two options:
a) you don't have to zfs send all filesystems, just the ones that you really need; and you can easily create many filesystems with ZFS; you can tag filesystems that you do not want to back up with user properties.
b) you can use something else for backups

Besides, zfs send / receive works best for replicating data. Storing the results of zfs send for later restoration is not a good idea, IMO.

> (2) Incremental backups with little local state (zfs send
> can send the difference between snapshots, but the system needs to keep
> the old snapshot around for this).
> The second one is not so important yet, because there is a lot of free space,
> but "zfs send" can not do anything about (1) :(
>
> All other backup solutions don't store full FS information, as
> they work on the file level, not the FS level :(

This sounds more like a theoretical than a practical objection. If you don't lose any information that you actually need, then a solution works. Take a look at e.g. archivers/star.

>> My second suggestion is that you try building UFS2 with 32K
>> blocks and 4K fragments. That will reduce the number of resources
>> needed to take the snapshot.
> I'll try this.
But I remember, that some time ago (about 7.1-STABLE) > there was deadlock in kernel memory allocator when different UFSes > on system uses different block sizes... > -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Fri May 20 16:45:55 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id F27A61065670; Fri, 20 May 2011 16:45:55 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id 903A88FC1B; Fri, 20 May 2011 16:45:55 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:c0e1:7989:b1b9:78c3]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id 8A0264AC1C; Fri, 20 May 2011 20:45:53 +0400 (MSD) Date: Fri, 20 May 2011 20:45:49 +0400 From: Lev Serebryakov Organization: FreeBSD Project X-Priority: 3 (Normal) Message-ID: <1408884696.20110520204549@serebryakov.spb.ru> To: Andriy Gapon In-Reply-To: <4DD6680E.9040006@FreeBSD.org> References: <1606289061.20110519211755@serebryakov.spb.ru> <201105200316.p4K3G6EU039569@chez.mckusick.com> <795474996.20110520122933@serebryakov.spb.ru> <4DD6680E.9040006@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=windows-1251 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@FreeBSD.org Subject: Re: Snapshots fail on large FFS2 volumes regulary -- how to backup /usr/home?! X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2011 16:45:56 -0000 Hello, Andriy. 
You wrote 20 May 2011, 17:09:34:

>>> Given the size of your storage, you should consider using ZFS,
>>> which is better able to handle such large filesystems.
>> Yes, I know that everybody loves ZFS now, but it doesn't have two
>> characteristics which are important for my installation:
>>
>> (1) The nodump flag, or any other way to mark directories and files as
>> not important for backup. "zfs send" is an all-or-nothing solution, and
>> right now my users use "nodump" to reduce backup sizes greatly.
> Two options:
> a) you don't have to zfs send all filesystems, just the ones that you really need;
> and you can easily create many filesystems with ZFS; you can tag filesystems that
> you do not want to back up with user properties.

Yes, _I_ can create many FSes. My users cannot. If a user wants to mark some part of his site as non-important (for example, because it is the cache of an image gallery which stores thumbnails, which can take a lot of space but is re-creatable on demand), I would need to create yet another FS at his request. That is not an option.

> Take a look at e.g. archivers/star.

I'll take a look. If it can skip some directories marked with a special file (like gtar can), it could be a solution.
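For what it's worth, the marker-file scheme Lev is after exists in GNU tar as the exclude-tag options (FreeBSD's GNU tar port installs the binary as gtar). A small sketch; the marker name .nobackup is an invented example:

```shell
# --exclude-tag-all=FILE makes GNU tar skip, wholesale, any directory
# that contains FILE -- so users can opt directories out themselves.
work=$(mktemp -d) && cd "$work"
mkdir -p home/site/cache home/site/data
touch home/site/cache/.nobackup       # user-created "don't back me up" marker
echo important > home/site/data/file
tar -cf backup.tar --exclude-tag-all=.nobackup home
tar -tf backup.tar                    # the cache directory is absent
```

With --exclude-tag (instead of --exclude-tag-all), the directory and the marker file itself are kept while the rest of the contents are skipped.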
--
// Black Lion AKA Lev Serebryakov

From owner-freebsd-fs@FreeBSD.ORG Fri May 20 16:54:15 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 3DE74106566B; Fri, 20 May 2011 16:54:15 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id 3B34D8FC15; Fri, 20 May 2011 16:54:13 +0000 (UTC) Received: from odyssey.starpoint.kiev.ua (alpha-e.starpoint.kiev.ua [212.40.38.101]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id TAA03196; Fri, 20 May 2011 19:54:11 +0300 (EEST) (envelope-from avg@FreeBSD.org) Message-ID: <4DD69CB3.2050601@FreeBSD.org> Date: Fri, 20 May 2011 19:54:11 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.17) Gecko/20110504 Lightning/1.0b2 Thunderbird/3.1.10 MIME-Version: 1.0 To: lev@FreeBSD.org References: <1606289061.20110519211755@serebryakov.spb.ru> <201105200316.p4K3G6EU039569@chez.mckusick.com> <795474996.20110520122933@serebryakov.spb.ru> <4DD6680E.9040006@FreeBSD.org> <1408884696.20110520204549@serebryakov.spb.ru> In-Reply-To: <1408884696.20110520204549@serebryakov.spb.ru> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=windows-1251 Content-Transfer-Encoding: 8bit Cc: freebsd-fs@FreeBSD.org Subject: Re: Snapshots fail on large FFS2 volumes regulary -- how to backup /usr/home?! X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2011 16:54:15 -0000

on 20/05/2011 19:45 Lev Serebryakov said the following:
> Hello, Andriy.
> You wrote 20 May 2011, 17:09:34:
>> Take a look at e.g. archivers/star.
> I'll take a look.
If it could skip some directories, marked with > special file (like gtar could), it could be a solution. I think that it understands FreeBSD flags and supports nodump flag. -- Andriy Gapon From owner-freebsd-fs@FreeBSD.ORG Fri May 20 18:19:18 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 2B4FA106566C; Fri, 20 May 2011 18:19:18 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id BBC2B8FC15; Fri, 20 May 2011 18:19:17 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:c0e1:7989:b1b9:78c3]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id E31474AC1C; Fri, 20 May 2011 22:19:15 +0400 (MSD) Date: Fri, 20 May 2011 22:19:11 +0400 From: Lev Serebryakov Organization: FreeBSD Project X-Priority: 3 (Normal) Message-ID: <1491112642.20110520221911@serebryakov.spb.ru> To: Andriy Gapon In-Reply-To: <4DD69CB3.2050601@FreeBSD.org> References: <1606289061.20110519211755@serebryakov.spb.ru> <201105200316.p4K3G6EU039569@chez.mckusick.com> <795474996.20110520122933@serebryakov.spb.ru> <4DD6680E.9040006@FreeBSD.org> <1408884696.20110520204549@serebryakov.spb.ru> <4DD69CB3.2050601@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=windows-1251 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@FreeBSD.org Subject: Re: Snapshots fail on large FFS2 volumes regulary -- how to backup /usr/home?! X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2011 18:19:18 -0000 Hello, Andriy. 
You wrote 20 May 2011, 20:54:11:

>> You wrote 20 May 2011, 17:09:34:
>>> Take a look at e.g. archivers/star.
>> I'll take a look. If it can skip some directories marked with a
>> special file (like gtar can), it could be a solution.
> I think that it understands FreeBSD flags and supports the nodump flag.

I don't need star for FSes with a "nodump" flag. Besides, running star on FFS2 without a snapshot is not a very good solution in any case, IMHO.

So, I need some tar variant or other solution with a non-FS-specific "nodump" indication for ZFS, OR working snapshots on FFS.

--
// Black Lion AKA Lev Serebryakov

From owner-freebsd-fs@FreeBSD.ORG Fri May 20 18:52:03 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C1974106566C; Fri, 20 May 2011 18:52:03 +0000 (UTC) (envelope-from avg@FreeBSD.org) Received: from citadel.icyb.net.ua (citadel.icyb.net.ua [212.40.38.140]) by mx1.freebsd.org (Postfix) with ESMTP id D26378FC1C; Fri, 20 May 2011 18:52:02 +0000 (UTC) Received: from porto.topspin.kiev.ua (porto-e.starpoint.kiev.ua [212.40.38.100]) by citadel.icyb.net.ua (8.8.8p3/ICyb-2.3exp) with ESMTP id VAA04050; Fri, 20 May 2011 21:52:00 +0300 (EEST) (envelope-from avg@FreeBSD.org) Received: from localhost.topspin.kiev.ua ([127.0.0.1]) by porto.topspin.kiev.ua with esmtp (Exim 4.34 (FreeBSD)) id 1QNUno-000HkV-HN; Fri, 20 May 2011 21:52:00 +0300 Message-ID: <4DD6B84F.20706@FreeBSD.org> Date: Fri, 20 May 2011 21:51:59 +0300 From: Andriy Gapon User-Agent: Mozilla/5.0 (X11; U; FreeBSD amd64; en-US; rv:1.9.2.17) Gecko/20110503 Lightning/1.0b2 Thunderbird/3.1.10 MIME-Version: 1.0 To: lev@FreeBSD.org References: <1606289061.20110519211755@serebryakov.spb.ru> <201105200316.p4K3G6EU039569@chez.mckusick.com> <795474996.20110520122933@serebryakov.spb.ru> <4DD6680E.9040006@FreeBSD.org> <1408884696.20110520204549@serebryakov.spb.ru> <4DD69CB3.2050601@FreeBSD.org>
<1491112642.20110520221911@serebryakov.spb.ru> In-Reply-To: <1491112642.20110520221911@serebryakov.spb.ru> X-Enigmail-Version: 1.1.2 Content-Type: text/plain; charset=windows-1251 Content-Transfer-Encoding: 8bit Cc: freebsd-fs@FreeBSD.org Subject: Re: Snapshots fail on large FFS2 volumes regulary -- how to backup /usr/home?! X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2011 18:52:03 -0000

on 20/05/2011 21:19 Lev Serebryakov said the following:
> Hello, Andriy.
> You wrote 20 May 2011, 20:54:11:
>
>>> You wrote 20 May 2011, 17:09:34:
>>>> Take a look at e.g. archivers/star.
>>> I'll take a look. If it can skip some directories marked with a
>>> special file (like gtar can), it could be a solution.
>> I think that it understands FreeBSD flags and supports the nodump flag.
> I don't need star for FSes with a "nodump" flag. Running star on FFS2
> without a snapshot is not a very good solution in any case, IMHO.
>
> So, I need some tar variant or other solution with a non-FS-specific "nodump"
> indication for ZFS, OR working snapshots on FFS.

$ chflags nodump work
$ ls -ldo work
drwxr-xr-x  2 avg  staff  nodump 4 18 May  2008 work
$ df -T .
Filesystem            Type   1K-blocks    Used     Avail Capacity  Mounted on
pond/usr/home/avg/tmp zfs    114487866 7629783 106858083     7%    /usr/home/avg/tmp

Does this look good enough for you?
-- Andriy Gapon

From owner-freebsd-fs@FreeBSD.ORG Fri May 20 18:56:01 2011 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D15AB1065673; Fri, 20 May 2011 18:56:01 +0000 (UTC) (envelope-from lev@FreeBSD.org) Received: from onlyone.friendlyhosting.spb.ru (onlyone.friendlyhosting.spb.ru [IPv6:2a01:4f8:131:60a2::2]) by mx1.freebsd.org (Postfix) with ESMTP id 93D9A8FC13; Fri, 20 May 2011 18:56:01 +0000 (UTC) Received: from lion.home.serebryakov.spb.ru (unknown [IPv6:2001:470:923f:1:c0e1:7989:b1b9:78c3]) (Authenticated sender: lev@serebryakov.spb.ru) by onlyone.friendlyhosting.spb.ru (Postfix) with ESMTPA id D6A064AC1C; Fri, 20 May 2011 22:55:59 +0400 (MSD) Date: Fri, 20 May 2011 22:55:55 +0400 From: Lev Serebryakov Organization: FreeBSD Project X-Priority: 3 (Normal) Message-ID: <1248410630.20110520225555@serebryakov.spb.ru> To: Andriy Gapon In-Reply-To: <4DD6B84F.20706@FreeBSD.org> References: <1606289061.20110519211755@serebryakov.spb.ru> <201105200316.p4K3G6EU039569@chez.mckusick.com> <795474996.20110520122933@serebryakov.spb.ru> <4DD6680E.9040006@FreeBSD.org> <1408884696.20110520204549@serebryakov.spb.ru> <4DD69CB3.2050601@FreeBSD.org> <1491112642.20110520221911@serebryakov.spb.ru> <4DD6B84F.20706@FreeBSD.org> MIME-Version: 1.0 Content-Type: text/plain; charset=windows-1251 Content-Transfer-Encoding: quoted-printable Cc: freebsd-fs@FreeBSD.org Subject: Re: Snapshots fail on large FFS2 volumes regulary -- how to backup /usr/home?! X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: lev@FreeBSD.org List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 May 2011 18:56:01 -0000

Hello, Andriy.
You wrote 20 May 2011, 22:51:59:
> $ chflags nodump work
> $ ls -ldo work
> drwxr-xr-x 2 avg staff nodump 4 18 May 2008 work
> $ df -T .
> Filesystem            Type 1K-blocks    Used     Avail Capacity  Mounted on
> pond/usr/home/avg/tmp zfs  114487866 7629783 106858083     7%   /usr/home/avg/tmp
> Does this look good enough for you?
Oops, I missed when flags were added to ZFS. Sorry :)

-- 
// Black Lion AKA Lev Serebryakov

From owner-freebsd-fs@FreeBSD.ORG Sat May 21 06:39:59 2011
Date: Sat, 21 May 2011 02:39:59 -0400
From: grarpamp <grarpamp@gmail.com>
To: freebsd-fs@freebsd.org
Subject: Write reallocator
I've got a disk that I'd like to exercise in order to see whether it
will reallocate marginal sectors when they are written to. Normally I'd
just zero the thing, destroy it, and toss it, but I feel like playing
more. Because the data is still semi-valuable, I want to read and write
back every block of the disk. Are there any tools that will do this,
besides dd and shell math?

Also, as with SCSI drives and camcontrol, is there a decent ATA mode
page editor out there? Even one for Windows. Maybe this is more of a
question for the hardware list?

From owner-freebsd-fs@FreeBSD.ORG Sat May 21 07:32:10 2011
From: "Poul-Henning Kamp" <phk@critter.freebsd.dk>
To: grarpamp
In-Reply-To: Your message of "Sat, 21 May 2011 02:39:59 -0400."
Date: Sat, 21 May 2011 07:13:35 +0000
Message-ID: <6806.1305962015@critter.freebsd.dk>
Subject: Re: Write reallocator

In message , grarpamp writes:

>I've got a disk that I'd like to exercise in order to
>see whether it will reallocate marginal sectors when
>they are written to. Normally I'd just zero the thing,
>destroy it, and toss it, but I feel like playing more.
>Because the data is still semi-valuable, I want to
>read and write back every block of the disk.
>Are there any tools that will do this?

recoverdisk(1)?

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
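The "dd and shell math" approach grarpamp mentions can be sketched as a plain sh loop: read each block into a temp file, then write the same bytes back to the same offset, giving the drive a chance to reallocate weak sectors on write. The scratch-file name and block size below are placeholders, not anything from the thread; on a real disk you would point DISK at the device node instead. Note that recoverdisk(1) does the same job with retry logic for unreadable blocks, which this naive loop lacks.

```shell
#!/bin/sh
# Read every block of DISK and write it back in place.
# DISK and BS are assumptions: here DISK is a scratch file standing in
# for the real device so the sketch can run anywhere.
DISK=./scratch.img
BS=65536

# Make a small scratch image to run against (stands in for the disk).
dd if=/dev/urandom of="$DISK" bs="$BS" count=4 2>/dev/null

SIZE=$(wc -c < "$DISK")
BLOCKS=$(( (SIZE + BS - 1) / BS ))    # round up to cover a short tail
BEFORE=$(cksum < "$DISK")

i=0
while [ "$i" -lt "$BLOCKS" ]; do
    # Read block i into a temp file, then write those bytes back to the
    # same offset.  conv=notrunc keeps dd from truncating the target.
    dd if="$DISK" of=block.tmp bs="$BS" skip="$i" count=1 2>/dev/null
    dd if=block.tmp of="$DISK" bs="$BS" seek="$i" conv=notrunc 2>/dev/null
    i=$((i + 1))
done
rm -f block.tmp

AFTER=$(cksum < "$DISK")
if [ "$BEFORE" = "$AFTER" ]; then
    echo "rewrote $BLOCKS blocks; data intact"
else
    echo "rewrote $BLOCKS blocks; DATA CHANGED"
fi
```

Against a real device the loop would need SIZE taken from diskinfo or gpart rather than wc, and any block that fails to read would simply be skipped rather than retried, which is exactly the gap recoverdisk fills.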