From owner-freebsd-fs@FreeBSD.ORG Sun Jul 5 17:07:29 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 9CFB1106566C for ; Sun, 5 Jul 2009 17:07:29 +0000 (UTC) (envelope-from gldisater@gmail.com) Received: from mail-qy0-f204.google.com (mail-qy0-f204.google.com [209.85.221.204]) by mx1.freebsd.org (Postfix) with ESMTP id 4714C8FC0A for ; Sun, 5 Jul 2009 17:07:29 +0000 (UTC) (envelope-from gldisater@gmail.com) Received: by qyk42 with SMTP id 42so268431qyk.3 for ; Sun, 05 Jul 2009 10:07:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:received:received:message-id:date:from :user-agent:mime-version:to:cc:subject:references:in-reply-to :content-type:content-transfer-encoding; bh=9Ut5x2Dn2Xz7BTV3oTAl923wFiyGZFZ2AtFCwGSVJ8A=; b=KbOdCNOmQdkUFiX2L2vRnp6nJqerec3sP0RTmUl9HzVOL+hEPJ0/HPBedKEkL6OAzR 5vv6nuIi6FsfNKBZy3Pccfd7ZTVzEdT+hFJhMPd1A3HGiI9oUK0ZBR80U9bnBnCLBn6g 1I8xYHSEBdYM7Al6nLlMdzO8r7BtLKVTVDymU= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:content-transfer-encoding; b=IAHvz/Ie2BmilAG9md+hFc1CAAMDAOjlz47tjQ0/1ZQVtDnbED6qIZ4WSP7DMG77FU KElK0UHb+k02EHhTXG8ouCILQXHo6OvPgHoSX9H4zsqLj8g8PkA+91lMBHN/TC01391N Ejd4YTii/GxUPpVZ4xeutDWgFlmBcn5Hpe5Ng= Received: by 10.224.19.133 with SMTP id a5mr3813013qab.387.1246812261519; Sun, 05 Jul 2009 09:44:21 -0700 (PDT) Received: from ?192.168.1.3? 
(CPE0013100d8fd9-CM00195eca698c.cpe.net.cable.rogers.com [99.237.60.47]) by mx.google.com with ESMTPS id 26sm12194158qwa.37.2009.07.05.09.44.20 (version=TLSv1/SSLv3 cipher=RC4-MD5); Sun, 05 Jul 2009 09:44:20 -0700 (PDT) Message-ID: <4A50A02B.5060402@gmail.com> Date: Sun, 05 Jul 2009 12:44:27 +0000 From: Jeremy Faulkner User-Agent: Thunderbird 2.0.0.21 (X11/20090518) MIME-Version: 1.0 To: Dan Naumov References: In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org Subject: Re: ZFS and df weirdness X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 05 Jul 2009 17:07:29 -0000 Dan Naumov wrote: > Hello list. > > I have a single 2tb disk used on a 7.2-release/amd64 system with a > small part of it given to UFS and most of the disk given to a single > "simple" zfs pool with several filesystems without redundancy. 
I've
> noticed a really weird thing regarding what "df" reports regarding the
> "total space" of one of my filesystems:
>
> atom# zpool list
> NAME    SIZE    USED    AVAIL    CAP    HEALTH    ALTROOT
> tank    1.80T   294G    1.51T    15%    ONLINE    -
>
> atom# zfs list
> NAME              USED   AVAIL   REFER   MOUNTPOINT
> tank              294G   1.48T   18K     none
> tank/DATA         292G   1.48T   292G    /DATA
> tank/home         216K   1.48T   21K     /home
> tank/home/jago    132K   1.48T   132K    /home/jago
> tank/home/karni   62K    1.48T   62K     /home/karni
> tank/usr          1.33G  1.48T   18K     none
> tank/usr/local    455M   1.48T   455M    /usr/local
> tank/usr/obj      18K    1.48T   18K     /usr/obj
> tank/usr/ports    412M   1.48T   412M    /usr/ports
> tank/usr/src      495M   1.48T   495M    /usr/src
> tank/var          320K   1.48T   18K     none
> tank/var/log      302K   1.48T   302K    /var/log
>
> atom# df
> Filesystem       1K-blocks    Used       Avail       Capacity  Mounted on
> /dev/ad12s1a     16244334     1032310    13912478    7%        /
> devfs            1            1          0           100%      /dev
> linprocfs        4            4          0           100%      /usr/compat/linux/proc
> tank/DATA        1897835904   306397056  1591438848  16%       /DATA
> tank/home        1591438848   0          1591438848  0%        /home
> tank/home/jago   1591438976   128        1591438848  0%        /home/jago
> tank/home/karni  1591438848   0          1591438848  0%        /home/karni
> tank/usr/local   1591905024   466176     1591438848  0%        /usr/local
> tank/usr/obj     1591438848   0          1591438848  0%        /usr/obj
> tank/usr/ports   1591860864   422016     1591438848  0%        /usr/ports
> tank/usr/src     1591945600   506752     1591438848  0%        /usr/src
> tank/var/log     1591439104   256        1591438848  0%        /var/log
>
> atom# df -h
> Filesystem       Size    Used    Avail   Capacity  Mounted on
> /dev/ad12s1a     15G     1.0G    13G     7%        /
> devfs            1.0K    1.0K    0B      100%      /dev
> linprocfs        4.0K    4.0K    0B      100%      /usr/compat/linux/proc
> tank/DATA        1.8T    292G    1.5T    16%       /DATA
> tank/home        1.5T    0B      1.5T    0%        /home
> tank/home/jago   1.5T    128K    1.5T    0%        /home/jago
> tank/home/karni  1.5T    0B      1.5T    0%        /home/karni
> tank/usr/local   1.5T    455M    1.5T    0%        /usr/local
> tank/usr/obj     1.5T    0B      1.5T    0%        /usr/obj
> tank/usr/ports   1.5T    412M    1.5T    0%        /usr/ports
> tank/usr/src     1.5T    495M    1.5T    0%        /usr/src
> tank/var/log     1.5T    256K    1.5T    0%        /var/log
>
> Considering that every single
filesystem is part of the exact same
> pool, with no custom options whatsoever used during filesystem
> creation (except for mountpoints), why is the size of tank/DATA 1.8T
> while the others are 1.5T?
>
> - Sincerely,
> Dan Naumov

Because 292G is already written to it, with another 1.5T available for use. All your other filesystems are less than 0.5G, so they don't impact the rounding of the Size field.

--
Jeremy Faulkner

From owner-freebsd-fs@FreeBSD.ORG Mon Jul 6 11:06:57 2009 Return-Path: Delivered-To: freebsd-fs@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 5FBC41065672 for ; Mon, 6 Jul 2009 11:06:57 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 4BB618FC15 for ; Mon, 6 Jul 2009 11:06:57 +0000 (UTC) (envelope-from owner-bugmaster@FreeBSD.org) Received: from freefall.freebsd.org (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n66B6vra010750 for ; Mon, 6 Jul 2009 11:06:57 GMT (envelope-from owner-bugmaster@FreeBSD.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n66B6uRG010746 for freebsd-fs@FreeBSD.org; Mon, 6 Jul 2009 11:06:56 GMT (envelope-from owner-bugmaster@FreeBSD.org) Date: Mon, 6 Jul 2009 11:06:56 GMT Message-Id: <200907061106.n66B6uRG010746@freefall.freebsd.org> X-Authentication-Warning: freefall.freebsd.org: gnats set sender to owner-bugmaster@FreeBSD.org using -f From: FreeBSD bugmaster To: freebsd-fs@FreeBSD.org Cc: Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 06 Jul 2009 11:06:57 -0000 Note: to view an individual PR, use:
http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/136218  fs         [zfs] Exported ZFS pools can't be imported into (Open)
o kern/135594  fs         [zfs] Single dataset unresponsive with Samba
o kern/135546  fs         [zfs] zfs.ko module doesn't ignore zpool.cache filenam
o kern/135480  fs         [zfs] panic: lock &arg.lock already initialized
o kern/135469  fs         [ufs] [panic] kernel crash on md operation in ufs_dirb
o bin/135314   fs         [zfs] assertion failed for zdb(8) usage
o kern/135050  fs         [zfs] ZFS clears/hides disk errors on reboot
f kern/134496  fs         [zfs] [panic] ZFS pool export occasionally causes a ke
o kern/134491  fs         [zfs] Hot spares are rather cold...
o kern/133980  fs         [panic] [ffs] panic: ffs_valloc: dup alloc
o kern/133676  fs         [smbfs] [panic] umount -f'ing a vnode-based memory dis
o kern/133614  fs         [smbfs] [panic] panic: ffs_truncate: read-only filesys
o kern/133373  fs         [zfs] umass attachment causes ZFS checksum errors, dat
o kern/133174  fs         [msdosfs] [patch] msdosfs must support utf-encoded int
f kern/133150  fs         [zfs] Page fault with ZFS on 7.1-RELEASE/amd64 while w
o kern/133134  fs         [zfs] Missing ZFS zpool labels
f kern/133020  fs         [zfs] [panic] inappropriate panic caused by zfs. Pani
o kern/132960  fs         [ufs] [panic] panic:ffs_blkfree: freeing free frag
o kern/132597  fs         [tmpfs] [panic] tmpfs-related panic while interrupting
o kern/132551  fs         [zfs] ZFS locks up on extattr_list_link syscall
o kern/132397  fs         reboot causes filesystem corruption (failure to sync b
o kern/132337  fs         [zfs] [panic] kernel panic in zfs_fuid_create_cred
o kern/132331  fs         [ufs] [lor] LOR ufs and syncer
o kern/132237  fs         [msdosfs] msdosfs has problems to read MSDOS Floppy
o kern/132145  fs         [panic] File System Hard Crashes
f kern/132068  fs         [zfs] page fault when using ZFS over NFS on 7.1-RELEAS
o kern/131995  fs         [nfs] Failure to mount NFSv4 server
o kern/131360  fs         [nfs] poor scaling behavior of the NFS server under lo
o kern/131342  fs         [nfs] mounting/unmounting of disks causes NFS to fail
o bin/131341   fs         makefs: error "Bad file descriptor" on the mount poin
o kern/131086  fs         [ext2fs] [patch] mkfs.ext2 creates rotten partition
o kern/130979  fs         [smbfs] [panic] boot/kernel/smbfs.ko
o kern/130920  fs         [msdosfs] cp(1) takes 100% CPU time while copying file
o kern/130229  fs         [iconv] usermount fails on fs that need iconv
o kern/130210  fs         [nullfs] Error by check nullfs
o kern/129760  fs         [nfs] after 'umount -f' of a stale NFS share FreeBSD l
o kern/129488  fs         [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c:
o kern/129231  fs         [ufs] [patch] New UFS mount (norandom) option - mostly
o kern/129152  fs         [panic] non-userfriendly panic when trying to mount(8)
o kern/129148  fs         [zfs] [panic] panic on concurrent writing & rollback
o kern/129059  fs         [zfs] [patch] ZFS bootloader whitelistable via WITHOUT
f kern/128829  fs         smbd(8) causes periodic panic on 7-RELEASE
o kern/128633  fs         [zfs] [lor] lock order reversal in zfs
o kern/128514  fs         [zfs] [mpt] problems with ZFS and LSILogic SAS/SATA Ad
f kern/128173  fs         [ext2fs] ls gives "Input/output error" on mounted ext3
o kern/127659  fs         [tmpfs] tmpfs memory leak
o kern/127492  fs         [zfs] System hang on ZFS input-output
o kern/127420  fs         [gjournal] [panic] Journal overflow on gmirrored gjour
o kern/127213  fs         [tmpfs] sendfile on tmpfs data corruption
o kern/127029  fs         [panic] mount(8): trying to mount a write protected zi
o kern/126287  fs         [ufs] [panic] Kernel panics while mounting an UFS file
s kern/125738  fs         [zfs] [request] SHA256 acceleration in ZFS
o kern/125644  fs         [zfs] [panic] zfs unfixable fs errors caused panic whe
f kern/125536  fs         [ext2fs] ext 2 mounts cleanly but fails on commands li
o kern/125149  fs         [nfs] [panic] changing into .zfs dir from nfs client c
f kern/124621  fs         [ext3] [patch] Cannot mount ext2fs partition
f bin/124424   fs         [zfs] zfs(8): zfs list -r shows strange snapshots' siz
o kern/123939  fs         [msdosfs] corrupts new files
o kern/122888  fs         [zfs] zfs hang w/ prefetch on, zil off while running t
o kern/122380  fs         [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash
o kern/122173  fs         [zfs] [panic] Kernel Panic if attempting to replace a
o bin/122172   fs         [fs]: amd(8) automount daemon dies on 6.3-STABLE i386,
o kern/122047  fs         [ext2fs] [patch] incorrect handling of UF_IMMUTABLE /
o kern/122038  fs         [tmpfs] [panic] tmpfs: panic: tmpfs_alloc_vp: type 0xc
o bin/121898   fs         [nullfs] pwd(1)/getcwd(2) fails with Permission denied
o bin/121779   fs         [ufs] snapinfo(8) (and related tools?) only work for t
o kern/121770  fs         [zfs] ZFS on i386, large file or heavy I/O leads to ke
o bin/121366   fs         [zfs] [patch] Automatic disk scrubbing from periodic(8
o bin/121072   fs         [smbfs] mount_smbfs(8) cannot normally convert the cha
f kern/120991  fs         [panic] [fs] [snapshot] System crashes when manipulati
o kern/120483  fs         [ntfs] [patch] NTFS filesystem locking changes
o kern/120482  fs         [ntfs] [patch] Sync style changes between NetBSD and F
o bin/120288   fs         zfs(8): "zfs share -a" does not send SIGHUP to mountd
f kern/119735  fs         [zfs] geli + ZFS + samba starting on boot panics 7.0-B
o kern/118912  fs         [2tb] disk sizing/geometry problem with large array
o misc/118855  fs         [zfs] ZFS-related commands are nonfunctional in fixit
o kern/118713  fs         [minidump] [patch] Display media size required for a k
o kern/118320  fs         [zfs] [patch] NFS SETATTR sometimes fails to set file
o bin/118249   fs         mv(1): moving a directory changes its mtime
o kern/118107  fs         [ntfs] [panic] Kernel panic when accessing a file at N
o bin/117315   fs         [smbfs] mount_smbfs(8) and related options can't mount
o kern/117314  fs         [ntfs] Long-filename only NTFS fs'es cause kernel pani
o kern/117158  fs         [zfs] zpool scrub causes panic if geli vdevs detach on
o bin/116980   fs         [msdosfs] [patch] mount_msdosfs(8) resets some flags f
o kern/116913  fs         [ffs] [panic] ffs_blkfree: freeing free block
p kern/116608  fs         [msdosfs] [patch] msdosfs fails to check mount options
o kern/116583  fs         [ffs] [hang] System freezes for short time when using
o kern/116170  fs         [panic] Kernel panic when mounting /tmp
o kern/115645  fs         [snapshots] [panic] lockmgr: thread 0xc4c00d80, not ex
o bin/115361   fs         [zfs] mount(8) gets into a state where it won't set/un
o kern/114955  fs         [cd9660] [patch] [request] support for mask,dirmask,ui
o kern/114847  fs         [ntfs] [patch] [request] dirmask support for NTFS ala
o kern/114676  fs         [ufs] snapshot creation panics: snapacct_ufs2: bad blo
o bin/114468   fs         [patch] [request] add -d option to umount(8) to detach
o kern/113852  fs         [smbfs] smbfs does not properly implement DFS referral
o bin/113838   fs         [patch] [request] mount(8): add support for relative p
o kern/113180  fs         [zfs] Setting ZFS nfsshare property does not cause inh
o bin/113049   fs         [patch] [request] make quot(8) use getopt(3) and show
o kern/112658  fs         [smbfs] [patch] smbfs and caching problems (resolves b
o kern/111843  fs         [msdosfs] Long Names of files are incorrectly created
o kern/111782  fs         [ufs] dump(8) fails horribly for large filesystems
s bin/111146   fs         [2tb] fsck(8) fails on 6T filesystem
o kern/109024  fs         [msdosfs] mount_msdosfs: msdosfs_iconv: Operation not
o kern/109010  fs         [msdosfs] can't mv directory within fat32 file system
o bin/107829   fs         [2TB] fdisk(8): invalid boundary checking in fdisk / w
o kern/106030  fs         [ufs] [panic] panic in ufs from geom when a dead disk
o kern/105093  fs         [ext2fs] [patch] ext2fs on read-only media cannot be m
o kern/104406  fs         [ufs] Processes get stuck in "ufs" state under persist
o kern/104133  fs         [ext2fs] EXT2FS module corrupts EXT2/3 filesystems
o kern/103035  fs         [ntfs] Directories in NTFS mounted disc images appear
o kern/101324  fs         [smbfs] smbfs sometimes not case sensitive when it's s
o kern/99290   fs         [ntfs] mount_ntfs ignorant of cluster sizes
o kern/97377   fs         [ntfs] [patch] syntax cleanup for ntfs_ihash.c
o kern/95222   fs         [iso9660] File sections on ISO9660 level 3 CDs ignored
o kern/94849   fs         [ufs] rename on UFS filesystem is not atomic
o kern/94769   fs         [ufs] Multiple file deletions on multi-snapshotted fil
o kern/94733   fs         [smbfs] smbfs may cause double unlock
o kern/93942   fs         [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D
o kern/92272   fs         [ffs] [hang] Filling a filesystem while creating a sna
f kern/91568   fs         [ufs] [panic] writing to UFS/softupdates DVD media in
o kern/91134   fs         [smbfs] [patch] Preserve access and modification time
a kern/90815   fs         [smbfs] [patch] SMBFS with character conversions somet
o kern/89991   fs         [ufs] softupdates with mount -ur causes fs UNREFS
o kern/88657   fs         [smbfs] windows client hang when browsing a samba shar
o kern/88266   fs         [smbfs] smbfs does not implement UIO_NOCOPY and sendfi
o kern/87859   fs         [smbfs] System reboot while umount smbfs.
o kern/86587   fs         [msdosfs] rm -r /PATH fails with lots of small files
o kern/85326   fs         [smbfs] [panic] saving a file via samba to an overquot
o kern/84589   fs         [2TB] 5.4-STABLE unresponsive during background fsck 2
o kern/80088   fs         [smbfs] Incorrect file time setting on NTFS mounted vi
o kern/77826   fs         [ext2fs] ext2fs usb filesystem will not mount RW
o kern/73484   fs         [ntfs] Kernel panic when doing `ls` from the client si
o bin/73019    fs         [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino
o kern/71774   fs         [ntfs] NTFS cannot "see" files on a WinXP filesystem
o kern/68978   fs         [panic] [ufs] crashes with failing hard disk, loose po
o kern/65920   fs         [nwfs] Mounted Netware filesystem behaves strange
o kern/65901   fs         [smbfs] [patch] smbfs fails fsx write/truncate-down/tr
o kern/61503   fs         [smbfs] mount_smbfs does not work as non-root
o kern/55617   fs         [smbfs] Accessing an nsmb-mounted drive via a smb expo
o kern/51685   fs         [hang] Unbounded inode allocation causes kernel to loc
o kern/51583   fs         [nullfs] [patch] allow to work with devices and socket
o kern/36566   fs         [smbfs] System reboot with dead smb mount and umount
o kern/18874   fs         [2TB] 32bit NFS servers export wrong negative values t

143 problems total.
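An aside on the "ZFS and df weirdness" thread above: df reports each Size cell as Used + Avail, which is why tank/DATA shows 1.8T while its empty siblings show 1.5T even though all of them draw on the same pool. A quick sketch with the 1K-block figures from that thread (plain sh plus awk; the 1073741824 divisor and one-decimal rounding are my approximation of df -h's display, not taken from the thread):

```shell
# Size reported by df is Used + Avail; numbers are the 1K-block
# figures from the df output in the thread above.
used_data=306397056        # tank/DATA "Used" column
avail=1591438848           # "Avail" shared by every dataset in the pool
size_data=$((used_data + avail))
echo "tank/DATA Size in 1K-blocks: $size_data"

# df -h prints one decimal in TiB (1 TiB = 1073741824 1K-blocks):
awk -v kb="$size_data" 'BEGIN { printf "tank/DATA: %.1fT\n", kb / 1073741824 }'
awk -v kb="$avail"     'BEGIN { printf "tank/home: %.1fT\n", kb / 1073741824 }'
# tank/DATA rounds to 1.8T; the near-empty datasets round to 1.5T.
```

The sum comes out to exactly the 1897835904 blocks df printed for tank/DATA, so the differing "sizes" are just the per-dataset Used added to the shared free space.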
From owner-freebsd-fs@FreeBSD.ORG Mon Jul 6 18:21:06 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id E907C106564A for ; Mon, 6 Jul 2009 18:21:06 +0000 (UTC) (envelope-from denismacpherson@gmail.com) Received: from qw-out-2122.google.com (qw-out-2122.google.com [74.125.92.26]) by mx1.freebsd.org (Postfix) with ESMTP id 9BAEC8FC1D for ; Mon, 6 Jul 2009 18:21:06 +0000 (UTC) (envelope-from denismacpherson@gmail.com) Received: by qw-out-2122.google.com with SMTP id 5so1584690qwd.7 for ; Mon, 06 Jul 2009 11:21:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:reply-to:date:message-id :subject:from:to:content-type:content-transfer-encoding; bh=TK8sJw0B6oB3kE3cMq5MWfYlowWEoRGCf3Yg4+sofM0=; b=UbR84AKaw3f4RTX7C91RaWlnivI6cCnH2JKuvLTiFuDWfPmrQJ0wPa6EcLZqPlzcGC /hah+LHJ7UZMMf1r82PblrzzkYBmDq3lhjlhH8k0e1TeQBI78XGdaEX5cFtosVyfY43l lcje25pUH/I5CI2tc8/yk2UUExoDFPSql5cg4= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:reply-to:date:message-id:subject:from:to:content-type :content-transfer-encoding; b=O6fUaUvFcKlvlRw4Ye65mCHTN04W7N+MvGo54DeU5j+3QJxFZBPMgtraIuUrcVZkj/ MxoVTv7Og+9YHlaZMeJ0oNuuPHj4NUWoUCo2/IZ7aaPVIWRBGdlvc+GlMzcQ7LMS/6HZ dlkpxwfi40JTlQfGcuj2pPEvZ6W6VU5w0KMLo= MIME-Version: 1.0 Received: by 10.229.86.145 with SMTP id s17mr2438841qcl.10.1246902997522; Mon, 06 Jul 2009 10:56:37 -0700 (PDT) Date: Mon, 6 Jul 2009 10:56:37 -0700 Message-ID: <7c82f7860907061056gb7b38e4s52b15ab0e664ffc6@mail.gmail.com> From: Denis MacPherson To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Subject: ZFS invalid vdev configuration X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: denis.macpherson@gmail.com List-Id: Filesystems List-Unsubscribe: , List-Archive: 
List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 06 Jul 2009 18:21:07 -0000

Having an issue with a raidz1 setup. I was testing some things out on a live CD and made a change in the BIOS to switch the SATA controller from AHCI to IDE; I changed this back before booting into FreeBSD again, so FreeBSD never saw any of these changes. When I booted back up, I was getting an "insufficient replicas / corrupted data" message. I tried to export the pool, as I found some info on this helping some people, but now I am unable to re-import the pool. Below I have attached the outputs of various commands. Any help would be much appreciated.

--------------------------------------------------------------------------------------------------------------------------------------

command: zpool import

  pool: datapool
    id: 5998882629718828483
 state: FAULTED
action: The pool cannot be imported due to damaged devices or data.
config:

        datapool    UNAVAIL  insufficient replicas
          raidz1    UNAVAIL  corrupted data
            ad14    ONLINE
            ad12    ONLINE
            ad10    ONLINE
            ad8     ONLINE
            ad16    ONLINE
            ad18    ONLINE

command: zpool import datapool

cannot import 'datapool': invalid vdev configuration

command: dmesg

ZFS filesystem version 6
ZFS storage pool version 6
acd0: DVDR at ata0-slave UDMA33
ad4: 152627MB at ata2-master SATA150
ad6: 953868MB at ata3-master SATA300
ad8: 953869MB at ata4-master SATA300
GEOM_LABEL: Label for provider ad4p1 is msdosfs/EFI.
GEOM_LABEL: Label for provider ad4s1a is ufsid/493ee78d1bd00753.
ad10: 953869MB at ata5-master SATA300
GEOM_LABEL: Label for provider ad6s1 is ext2fs/1.39-Aug092008.
ad12: 953869MB at ata6-master SATA300
ad14: 953869MB at ata7-master SATA300
ad16: 953868MB at ata8-master SATA300
ad18: 953869MB at ata9-master SATA300
Trying to mount root from zfs:tank/root
GEOM_LABEL: Label msdosfs/EFI removed.
GEOM_LABEL: Label ufsid/493ee78d1bd00753 removed.
GEOM_LABEL: Label for provider ad4s1a is ufsid/493ee78d1bd00753.
GEOM_LABEL: Label ufsid/493ee78d1bd00753 removed.
GEOM_LABEL: Label ext2fs/1.39-Aug092008 removed. command: zdb -l /dev/ad8 -------------------------------------------- LABEL 0 -------------------------------------------- version=6 name='datapool' state=0 txg=2498378 pool_guid=5998882629718828483 hostid=2846502798 hostname='' top_guid=2074816204479013297 guid=8852512481608149738 vdev_tree type='raidz' id=0 guid=2074816204479013297 nparity=1 metaslab_array=14 metaslab_shift=35 ashift=9 asize=6001199677440 children[0] type='disk' id=0 guid=11429030338875577754 path='/dev/ad14' devid='ad:WD-WCAU46148536' whole_disk=0 DTL=23 children[1] type='disk' id=1 guid=15556804307381672763 path='/dev/ad12' devid='ad:WD-WCAU44223223' whole_disk=0 DTL=22 children[2] type='disk' id=2 guid=11466465050580886753 path='/dev/ad10' devid='ad:WD-WCAU44223259' whole_disk=0 DTL=21 children[3] type='disk' id=3 guid=8852512481608149738 path='/dev/ad8' devid='ad:WD-WCAU44026431' whole_disk=0 DTL=20 children[4] type='disk' id=4 guid=12199743530994947046 path='/dev/ad16' devid='ad:WD-WCAU46152226' whole_disk=0 DTL=19 children[5] type='disk' id=5 guid=9333427139064108901 path='/dev/ad18' devid='ad:WD-WCAU46165741' whole_disk=0 DTL=18 -------------------------------------------- LABEL 1 -------------------------------------------- version=6 name='datapool' state=0 txg=2498378 pool_guid=5998882629718828483 hostid=2846502798 hostname='' top_guid=2074816204479013297 guid=8852512481608149738 vdev_tree type='raidz' id=0 guid=2074816204479013297 nparity=1 metaslab_array=14 metaslab_shift=35 ashift=9 asize=6001199677440 children[0] type='disk' id=0 guid=11429030338875577754 path='/dev/ad14' devid='ad:WD-WCAU46148536' whole_disk=0 DTL=23 children[1] type='disk' id=1 guid=15556804307381672763 path='/dev/ad12' devid='ad:WD-WCAU44223223' whole_disk=0 DTL=22 children[2] type='disk' id=2 guid=11466465050580886753 path='/dev/ad10' devid='ad:WD-WCAU44223259' whole_disk=0 DTL=21 children[3] type='disk' id=3 guid=8852512481608149738 path='/dev/ad8' 
devid='ad:WD-WCAU44026431' whole_disk=0 DTL=20 children[4] type='disk' id=4 guid=12199743530994947046 path='/dev/ad16' devid='ad:WD-WCAU46152226' whole_disk=0 DTL=19 children[5] type='disk' id=5 guid=9333427139064108901 path='/dev/ad18' devid='ad:WD-WCAU46165741' whole_disk=0 DTL=18 -------------------------------------------- LABEL 2 -------------------------------------------- version=6 name='datapool' state=0 txg=2498378 pool_guid=5998882629718828483 hostid=2846502798 hostname='' top_guid=2074816204479013297 guid=8852512481608149738 vdev_tree type='raidz' id=0 guid=2074816204479013297 nparity=1 metaslab_array=14 metaslab_shift=35 ashift=9 asize=6001199677440 children[0] type='disk' id=0 guid=11429030338875577754 path='/dev/ad14' devid='ad:WD-WCAU46148536' whole_disk=0 DTL=23 children[1] type='disk' id=1 guid=15556804307381672763 path='/dev/ad12' devid='ad:WD-WCAU44223223' whole_disk=0 DTL=22 children[2] type='disk' id=2 guid=11466465050580886753 path='/dev/ad10' devid='ad:WD-WCAU44223259' whole_disk=0 DTL=21 children[3] type='disk' id=3 guid=8852512481608149738 path='/dev/ad8' devid='ad:WD-WCAU44026431' whole_disk=0 DTL=20 children[4] type='disk' id=4 guid=12199743530994947046 path='/dev/ad16' devid='ad:WD-WCAU46152226' whole_disk=0 DTL=19 children[5] type='disk' id=5 guid=9333427139064108901 path='/dev/ad18' devid='ad:WD-WCAU46165741' whole_disk=0 DTL=18 -------------------------------------------- LABEL 3 -------------------------------------------- version=6 name='datapool' state=0 txg=2498378 pool_guid=5998882629718828483 hostid=2846502798 hostname='' top_guid=2074816204479013297 guid=8852512481608149738 vdev_tree type='raidz' id=0 guid=2074816204479013297 nparity=1 metaslab_array=14 metaslab_shift=35 ashift=9 asize=6001199677440 children[0] type='disk' id=0 guid=11429030338875577754 path='/dev/ad14' devid='ad:WD-WCAU46148536' whole_disk=0 DTL=23 children[1] type='disk' id=1 guid=15556804307381672763 path='/dev/ad12' devid='ad:WD-WCAU44223223' whole_disk=0 
DTL=22 children[2] type='disk' id=2 guid=11466465050580886753 path='/dev/ad10' devid='ad:WD-WCAU44223259' whole_disk=0 DTL=21 children[3] type='disk' id=3 guid=8852512481608149738 path='/dev/ad8' devid='ad:WD-WCAU44026431' whole_disk=0 DTL=20 children[4] type='disk' id=4 guid=12199743530994947046 path='/dev/ad16' devid='ad:WD-WCAU46152226' whole_disk=0 DTL=19 children[5] type='disk' id=5 guid=9333427139064108901 path='/dev/ad18' devid='ad:WD-WCAU46165741' whole_disk=0 DTL=18 From owner-freebsd-fs@FreeBSD.ORG Tue Jul 7 00:45:40 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B69E4106568A for ; Tue, 7 Jul 2009 00:45:40 +0000 (UTC) (envelope-from dan.naumov@gmail.com) Received: from mail-yx0-f181.google.com (mail-yx0-f181.google.com [209.85.210.181]) by mx1.freebsd.org (Postfix) with ESMTP id 7730F8FC17 for ; Tue, 7 Jul 2009 00:45:40 +0000 (UTC) (envelope-from dan.naumov@gmail.com) Received: by yxe11 with SMTP id 11so6381423yxe.3 for ; Mon, 06 Jul 2009 17:45:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:date:message-id:subject :from:to:content-type:content-transfer-encoding; bh=AaZoITLTGOEHa40hsB5cw+8lovPRcvNH4P15iyX2eHQ=; b=XgJ1JjstavEg+xJmevKLWB9wayHgl/vC3VtoHq1hNzjC7YXITqX2xf7MhF61z29eKl kuLcqFzipIAMenHGND7wkXPxLK+rnrjSBfNr8k7SnZd0PNcX4Euqp7taz60LbJN/OpBv L1nHxaHcME2D2upP2JUA/GioLmdkLGaN9I2ks= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:date:message-id:subject:from:to:content-type :content-transfer-encoding; b=nMPTBhrsHw5dd1s8IVtS/QCb6+dQUCCBNLe58K0j067RqXLasuSXKMh45eMY3C4SmJ 7E9Jddd/OG7ZGyJmRY6L0Tx/dMbfRnqxspEC2p3WO8+ElwhdoiciuBgrn62UqvC83wxx oC6VbMaBgxNyNCMpcx5tUmBnBz8y4lxB7ZjyU= MIME-Version: 1.0 Received: by 10.100.108.8 with SMTP id g8mr9596857anc.66.1246927539731; Mon, 06 Jul 2009 17:45:39 -0700 (PDT) Date: Tue, 7 
Jul 2009 03:45:39 +0300 Message-ID: From: Dan Naumov To: freebsd-fs@freebsd.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Subject: ZFS: swap on a ZVOL X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 07 Jul 2009 00:45:41 -0000 Hello list. As far as I know, using swap on top of a "non-trivial" filesystem like ZFS is considered "unsupported", but it does in fact work. You can create a ZVOL of arbitrary size (say, 4G) and then do the following: zfs set org.freebsd:swap=on pool/swapvolname to have /etc/rc.d/zfs enable swap on said ZVOL on every boot. You can also do this in an ugly way: put swapon /dev/zvol// into your /etc/rc.local (without having to pass the "org.freebsd:swap=on" option to the ZVOL). Now the question remains, what kind of issues are expected to arise when using swap on a ZVOL and is there any work going to in order to resolve them? One of the issues mentioned is that ZVOL swap cannot handle kernel dumps and another, more serious potential issue is a race condition where "more swap is needed to swap". Assuming I have a machine with 2gb ram, if I use a 4gb ZVOL swap, am I likely to run into any serious issues assuming that under normal operation, the system uses from none to very little swap? 
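For reference, the recipe described above can be written out as a short setup fragment. This is a sketch only: "tank" and "tank/swap" are placeholder names I chose, an existing pool is assumed, and it is not meant to be run as-is.

```shell
# Create a 4G ZVOL and mark it so /etc/rc.d/zfs swapon's it at boot.
# "tank" and "tank/swap" are placeholder names, not from the message.
zfs create -V 4G tank/swap
zfs set org.freebsd:swap=on tank/swap

# The "ugly" alternative from the message: skip the property and add
# the swapon call to /etc/rc.local instead, e.g.:
#   swapon /dev/zvol/tank/swap
```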
- Sincerely, Dan Naumov From owner-freebsd-fs@FreeBSD.ORG Tue Jul 7 07:21:55 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 53C381065672 for ; Tue, 7 Jul 2009 07:21:55 +0000 (UTC) (envelope-from numisemis@yahoo.com) Received: from web37307.mail.mud.yahoo.com (web37307.mail.mud.yahoo.com [209.191.90.250]) by mx1.freebsd.org (Postfix) with SMTP id 135458FC16 for ; Tue, 7 Jul 2009 07:21:54 +0000 (UTC) (envelope-from numisemis@yahoo.com) Received: (qmail 91791 invoked by uid 60001); 7 Jul 2009 07:21:54 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yahoo.com; s=s1024; t=1246951314; bh=I0MGwuVPDeCUYSdN0nzGFoe6NQcBFE0Hm+l05M4tdQY=; h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Subject:To:Cc:MIME-Version:Content-Type:Content-Transfer-Encoding; b=ZyxCbyYaWVw9H8N+/Q2oZiU65aj6wmJFHD4iGfY4B5ogm8w7ZgOQr92IfwUYv7CnhzB7GKZV4s4hIKTyOp7wGOqHU7hksL15Mh6n9ICQ1KJH0d7kTQlYJiuHnP+ccQ1PFN/0wVYfsksNUhHwf+XU6ILhE/wvHFXQPkS5Zw6rW9I= DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=Message-ID:X-YMail-OSG:Received:X-Mailer:Date:From:Subject:To:Cc:MIME-Version:Content-Type:Content-Transfer-Encoding; b=ynRq+lP4g4bfac3vR68hUpsY2MysK/v2i+Jk3SfCpE+Kf017loh9hQ0bME0IZlzmm/+RxaCFpkt26RSjtt+A0qU9Ta3VzfzaoySzNYkoPIgOwOb33EoVIlTmAhTcGLT4S2BJO4bmR/Juo9ulkkQ3lwM59pginZ94IefgRSBHTAY=; Message-ID: <354003.91539.qm@web37307.mail.mud.yahoo.com> X-YMail-OSG: mbmqcT4VM1m0tbN8fUDWwBJC.gAga9nb4W6awjPdv7EUKzE2EwX.OWXcNwrNHMT659Du4gtiIFbl3ToHVuYxOz3gpcElJ2wBYBWknitIEfiWHb3rFnmli8bupIraY03HF7GXXdLlVokqlS__8GT2hKWYFgOsFVm2BT6oPpf.CVD89P1Cerk6kLEluqHrhjfa58hFEXuVKIL7l6mK667YbxLKKAH5qEX2U41kdDoYVwE7Z_9s7ZMK7OO3AAyyzvmvUT7ttEnubyB36UQNzoA11M3u5kV.yF6CX.VgPCqHLnRXbBcls8jytJOR Received: from [87.248.121.245] by web37307.mail.mud.yahoo.com via HTTP; Tue, 07 Jul 2009 00:21:54 PDT X-Mailer: YahooMailWebService/0.7.289.15 Date: Tue, 7 Jul 2009 00:21:54 
-0700 (PDT) From: Šimun Mikecin To: Dan Naumov MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Cc: "freebsd-fs@freebsd.org" Subject: Re: ZFS: swap on a ZVOL X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 07 Jul 2009 07:21:55 -0000

Dan Naumov wrote:
As far as I know, using swap on top of a "non-trivial" filesystem like
ZFS is considered "unsupported", but it does in fact work. You can
create a ZVOL of arbitrary size (say, 4G) and then do the following:
zfs set org.freebsd:swap=on pool/swapvolname
to have /etc/rc.d/zfs enable swap on said ZVOL on every boot. You can
also do this in an ugly way: put swapon
/dev/zvol// into your /etc/rc.local (without
having to pass the "org.freebsd:swap=on" option to the ZVOL).

Now the question remains, what kind of issues are expected to arise
when using swap on a ZVOL and is there any work going to in order to
resolve them? One of the issues mentioned is that ZVOL swap cannot
handle kernel dumps and another, more serious potential issue is a
race condition where "more swap is needed to swap". Assuming I have a
machine with 2gb ram, if I use a 4gb ZVOL swap, am I likely to run
into any serious issues assuming that under normal operation, the
system uses from none to very little swap?

AFAIK, it was said that the race condition you mentioned also exists in OpenSolaris (that was back in the ZFS v6 days). But, AFAIK, new versions of OpenSolaris do use swap on a ZFS volume by default (correct me if I'm wrong). Somebody more knowledgeable should answer this, but this made me think that maybe, just maybe, that race condition was solved in some ZFS version >v6.

From owner-freebsd-fs@FreeBSD.ORG Tue Jul 7 08:00:04 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id ED9311065673 for ; Tue, 7 Jul 2009 08:00:04 +0000 (UTC) (envelope-from Frank.Batschulat@Sun.COM) Received: from gmp-eb-inf-2.sun.com (gmp-eb-inf-2.sun.com [192.18.6.24]) by mx1.freebsd.org (Postfix) with ESMTP id 51AFD8FC08 for ; Tue, 7 Jul 2009 08:00:04 +0000 (UTC) (envelope-from Frank.Batschulat@Sun.COM) Received: from fe-emea-10.sun.com (gmp-eb-lb-1-fe3.eu.sun.com [192.18.6.10]) by gmp-eb-inf-2.sun.com (8.13.7+Sun/8.12.9) with ESMTP id n677xp8O000219 for ; Tue, 7 Jul 2009 08:00:03 GMT MIME-version: 1.0 Content-transfer-encoding: 8BIT Content-type: text/plain; charset=utf-8 Received: from conversion-daemon.fe-emea-10.sun.com by fe-emea-10.sun.com (Sun Java(tm) System Messaging Server 7u2-7.02 64bit (built Apr 16 2009)) id <0KME00700JDV3M00@fe-emea-10.sun.com> for freebsd-fs@freebsd.org; Tue, 07 Jul 2009 08:59:51 +0100 (BST) Received: from opteron ([unknown] [84.188.230.62]) by fe-emea-10.sun.com (Sun Java(tm) System Messaging Server 7u2-7.02 64bit (built Apr 16 2009)) with ESMTPSA id <0KME00DTZJJQRPB0@fe-emea-10.sun.com>; Tue, 07 Jul 2009 08:59:51 +0100 (BST) Date: Tue, 07 Jul 2009 09:54:50 +0200 From: "Frank Batschulat (Home)" In-reply-to:
<354003.91539.qm@web37307.mail.mud.yahoo.com> Sender: Frank.Batschulat@Sun.COM To: =?utf-8?B?xaBpbXVuIE1pa2VjaW4=?= , Dan Naumov Message-id: Organization: SUN Microsystems References: <354003.91539.qm@web37307.mail.mud.yahoo.com> User-Agent: Opera Mail/9.64 (SunOS) Cc: "freebsd-fs@freebsd.org" Subject: Re: ZFS: swap on a ZVOL X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 07 Jul 2009 08:00:05 -0000 On Tue, 07 Jul 2009 09:21:54 +0200, Šimun Mikecin wrote: > > Dan Naumov wrote: > As far as I know, using swap on top of a "non-trivial" filesystem like > ZFS is considered "unsupported", but it does in fact work. You can > create a ZVOL of arbitrary size (say, 4G) and then do the following: > zfs set org.freebsd:swap=on pool/swapvolname > to have /etc/rc.d/zfs enable swap on said ZVOL on every boot. You can > also do this in an ugly way: put swapon > /dev/zvol// into your /etc/rc.local (without > having to pass the "org.freebsd:swap=on" option to the ZVOL). > > Now the question remains, what kind of issues are expected to arise > when using swap on a ZVOL and is there any work going to in order to > resolve them? One of the issues mentioned is that ZVOL swap cannot > handle kernel dumps and another, more serious potential issue is a > race condition where "more swap is needed to swap". Assuming I have a > machine with 2gb ram, if I use a 4gb ZVOL swap, am I likely to run > into any serious issues assuming that under normal operation, the > system uses from none to very little swap? > > AFAIK, it was said that race condition you mentioned also exists in OpenSolaris (that was back in the ZFS v6 days). > But, AFAIK new versions of OpenSolaris do use swap on ZFS volume as by default (correct me if I'm wrong). 
> Somebody more knowledgeable should answer this, but this made me thinking that maybe, just maybe that race condition was solved in some ZFS version >v6. Right, we do use a zvol as swap device, and we also do use a dedicated zvol as the dump device for the kernel crash dump. http://www.opensolaris.org/os/community/zfs/boot/zfsbootFAQ/#zfsswap http://www.solarisinternals.com/wiki/index.php/ZFS_Troubleshooting_Guide#ZFS_Swap_and_Dump_Devices http://docs.sun.com/app/docs/doc/819-5461/zfsboot-1?a=view http://www.opensolaris.org/os/community/zfs/boot/zfsboottalk.0910.pdf http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=5008936 was part of the ZFS boot project: http://www.opensolaris.org/os/community/zfs/boot/ As far as I remember, neither the dump nor the swap part itself required changes to the on-disk format; zpool version 6 just covered storing some new pool properties like 'bootfs' -> http://www.opensolaris.org/os/community/zfs/version/6/ hth frankB From owner-freebsd-fs@FreeBSD.ORG Wed Jul 8 19:05:01 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id D9B261065672; Wed, 8 Jul 2009 19:05:01 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id B07488FC13; Wed, 8 Jul 2009 19:05:01 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (linimon@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n68J51bc098118; Wed, 8 Jul 2009 19:05:01 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n68J5167098114; Wed, 8 Jul 2009 19:05:01 GMT (envelope-from linimon) Date: Wed, 8 Jul 2009 19:05:01 GMT Message-Id: <200907081905.n68J5167098114@freefall.freebsd.org> To: linimon@FreeBSD.org,
freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/136470: [nfs] Cannot mount / in read-only, over NFS X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 08 Jul 2009 19:05:02 -0000 Old Synopsis: Cannot mount / in read-only, over NFS New Synopsis: [nfs] Cannot mount / in read-only, over NFS Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Wed Jul 8 19:04:49 UTC 2009 Responsible-Changed-Why: Over to maintainer(s). http://www.freebsd.org/cgi/query-pr.cgi?pr=136470 From owner-freebsd-fs@FreeBSD.ORG Wed Jul 8 21:48:48 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 81B1F106564A for ; Wed, 8 Jul 2009 21:48:48 +0000 (UTC) (envelope-from james-freebsd-fs2@jrv.org) Received: from mail.jrv.org (adsl-70-243-84-13.dsl.austtx.swbell.net [70.243.84.13]) by mx1.freebsd.org (Postfix) with ESMTP id 48C9C8FC0A for ; Wed, 8 Jul 2009 21:48:47 +0000 (UTC) (envelope-from james-freebsd-fs2@jrv.org) Received: from kremvax.housenet.jrv (kremvax.housenet.jrv [192.168.3.124]) by mail.jrv.org (8.14.3/8.14.3) with ESMTP id n68LmkQx034735 for ; Wed, 8 Jul 2009 16:48:47 -0500 (CDT) (envelope-from james-freebsd-fs2@jrv.org) Authentication-Results: mail.jrv.org; domainkeys=pass (testing) header.from=james-freebsd-fs2@jrv.org DomainKey-Signature: a=rsa-sha1; s=enigma; d=jrv.org; c=nofws; q=dns; h=message-id:date:from:user-agent:mime-version:to:subject: content-type:content-transfer-encoding; b=pP3snWh+00uHUbuc6aWcWTERwROIGiSMhhJ4ezgSCNBVGtQN4UpubebTP5tngVIAt 3heGPhYkHgE3U+Wwl3rq8LIASdIMWfQDWf2bX43+azcVWu3MOxOZ3t5S/ci18KSyB9N tilfAH/oBGfb2hQST6Fa4OTVkIdFHiEmdkSXEbY= Message-ID: <4A55142B.8020800@jrv.org> Date: Wed, 08 Jul 
2009 16:48:27 -0500 From: "James R. Van Artsdalen" User-Agent: Thunderbird 2.0.0.22 (Macintosh/20090605) MIME-Version: 1.0 To: freebsd-fs Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Subject: ZFS: File name too long X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 08 Jul 2009 21:48:48 -0000 svn rev 192136 (-CURRENT on May 14, 2009) /Volumes/incoming/Personal# ls -l '.zfs/snapshot/syssnap-1247029203.2009-07-08.00:00:03.189-27-3' ls: .zfs/snapshot/syssnap-1247029203.2009-07-08.00:00:03.189-27-3: File name too long /Volumes/incoming/Personal# /Volumes/incoming/Personal# ls -l '/root/.zfs/snapshot/syssnap-1247029203.2009-07-08.00:00:03.189-27-3'/.login -rw-r--r-- 1 root wheel 290 Feb 27 07:42 /root/.zfs/snapshot/syssnap-1247029203.2009-07-08.00:00:03.189-27-3/.login /Volumes/incoming/Personal# From owner-freebsd-fs@FreeBSD.ORG Thu Jul 9 09:02:25 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 7FA5A106566C; Thu, 9 Jul 2009 09:02:25 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id EF22C8FC13; Thu, 9 Jul 2009 09:02:24 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:48033 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.69) (envelope-from ) id 1MOpVr-0001TP-5L; Thu, 09 Jul 2009 11:01:57 +0200 Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id D98393F629; Thu, 9 Jul 2009 11:01:52 +0200 (CEST) Message-Id: 
<766FFF07-181A-4180-B020-AA3EE46CF6F8@exscape.org> From: Thomas Backman To: FreeBSD current In-Reply-To: <72163521-40BF-4764-8B74-5446A88DFBF8@exscape.org> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v935.3) Date: Thu, 9 Jul 2009 11:01:51 +0200 References: <72163521-40BF-4764-8B74-5446A88DFBF8@exscape.org> X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1MOpVr-0001TP-5L. X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MOpVr-0001TP-5L e2f9b7d864960823e715427db0b86fde Cc: freebsd-fs@freebsd.org Subject: Re: "New" ZFS crash on FS (pool?) unmount/export X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 09 Jul 2009 09:02:26 -0000 On Jun 20, 2009, at 09:11, Thomas Backman wrote: > I just ran into this tonight. Not sure exactly what triggered it - > the box stopped responding to pings at 02:07AM and it has a cron > backup job using zfs send/recv at 02:00, so I'm guessing it's > related, even though the backup probably should have finished before > then... Hmm. Anyway. > > r194478. 
> > kernel trap 12 with interrupts disabled > > Fatal trap 12: page fault while in kernel mode > cpuid = 0; apic id = 00 > fault virtual address = 0x288 > fault code = supervisor read data, page not present > instruction pointer = 0x20:0xffffffff805a4989 > stack pointer = 0x28:0xffffff803e8b57e0 > frame pointer = 0x28:0xffffff803e8b5840 > code segment = base 0x0, limit 0xfffff, type 0x1b > = DPL 0, pres 1, long 1, def32 0, gran 1 > processor eflags = resume, IOPL = 0 > current process = 57514 (zpool) > panic: from debugger > cpuid = 0 > Uptime: 10h22m13s > Physical memory: 2027 MB > > (kgdb) bt > #0 doadump () at pcpu.h:223 > #1 0xffffffff8059c409 in boot (howto=260) at /usr/src/sys/kern/ > kern_shutdown.c:419 > #2 0xffffffff8059c85c in panic (fmt=Variable "fmt" is not available. > ) at /usr/src/sys/kern/kern_shutdown.c:575 > #3 0xffffffff801f1377 in db_panic (addr=Variable "addr" is not > available. > ) at /usr/src/sys/ddb/db_command.c:478 > #4 0xffffffff801f1781 in db_command (last_cmdp=0xffffffff80c38620, > cmd_table=Variable "cmd_table" is not available. > ) at /usr/src/sys/ddb/db_command.c:445 > #5 0xffffffff801f19d0 in db_command_loop () at /usr/src/sys/ddb/ > db_command.c:498 > #6 0xffffffff801f3969 in db_trap (type=Variable "type" is not > available. > ) at /usr/src/sys/ddb/db_main.c:229 > #7 0xffffffff805ce465 in kdb_trap (type=12, code=0, > tf=0xffffff803e8b5730) at /usr/src/sys/kern/subr_kdb.c:534 > #8 0xffffffff8088715d in trap_fatal (frame=0xffffff803e8b5730, > eva=Variable "eva" is not available. > ) at /usr/src/sys/amd64/amd64/trap.c:847 > #9 0xffffffff80887fb2 in trap (frame=0xffffff803e8b5730) at /usr/ > src/sys/amd64/amd64/trap.c:345 > #10 0xffffffff8086e007 in calltrap () at /usr/src/sys/amd64/amd64/ > exception.S:223 > #11 0xffffffff805a4989 in _sx_xlock_hard (sx=0xffffff0043557d50, > tid=18446742975830720512, opts=Variable "opts" is not available. 
> ) > at /usr/src/sys/kern/kern_sx.c:575 > #12 0xffffffff805a52fe in _sx_xlock (sx=Variable "sx" is not > available. > ) at sx.h:155 > #13 0xffffffff80fe2995 in zfs_freebsd_reclaim () from /boot/kernel/ > zfs.ko > #14 0xffffffff808cefca in VOP_RECLAIM_APV (vop=0xffffff0043557d38, > a=0xffffff0043557d50) at vnode_if.c:1926 > #15 0xffffffff80626f6e in vgonel (vp=0xffffff00437a7938) at > vnode_if.h:830 > #16 0xffffffff8062b528 in vflush (mp=0xffffff0060f2a000, rootrefs=0, > flags=0, td=0xffffff0061528000) > at /usr/src/sys/kern/vfs_subr.c:2450 > #17 0xffffffff80fdd3a8 in zfs_umount () from /boot/kernel/zfs.ko > #18 0xffffffff8062420a in dounmount (mp=0xffffff0060f2a000, > flags=1626513408, td=Variable "td" is not available. > ) > at /usr/src/sys/kern/vfs_mount.c:1287 > #19 0xffffffff80624975 in unmount (td=0xffffff0061528000, > uap=0xffffff803e8b5c00) > at /usr/src/sys/kern/vfs_mount.c:1172 > #20 0xffffffff8088783f in syscall (frame=0xffffff803e8b5c90) at /usr/ > src/sys/amd64/amd64/trap.c:984 > #21 0xffffffff8086e290 in Xfast_syscall () at /usr/src/sys/amd64/ > amd64/exception.S:364 > #22 0x000000080104e49c in ?? () > Previous frame inner to this frame (corrupt stack?) I might have hit the same thing again... only it didn't *crash* this time, just freeze! I got a "GEOM_GATE: Device ggateX destroyed" on my console, and it stopped responding to pings, keyboard input, etc. (NOTE: The GEOM_GATE message MAY have been an old one. I *think* it was from the night before but can't be sure...) It obviously happened while running my ugly-hack backup script this time too, since it stopped responding to pings ~02:02AM with the script running at 02:00. I'm not sure where it crashed, since snapshots were NOT taken on the "local box". It usually crashes during export, long after taking the snapshots. BTW, current system is BETA1 r195422M (dtrace timestamp patch + libzfs_sendrecv.c patch ( http://lists.freebsd.org/pipermail/freebsd-current/2009-May/006814.html ). 
Here's the script in its relevant entirety (I removed the "initial backup" part since it never runs using cron anyway). I realize it's an ugly hack (and that my bash-fu could be stronger), but what the heck. Essentially, it creates a GEOM provider of a file, containing a zpool, imports the pool and creates a clone to it, and then exports the pool. The export appears to be what causes all the trouble - usually, not this time around. Every time it crashes it seems to be during or very soon after the export - only this time it didn't even take the snapshots.

#!/bin/bash

PATH="$PATH:/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin"

function die() {
    echo "$@" >&2
    zpool export slave > /dev/null 2>&1
    ggatel destroy -u 666 > /dev/null 2>&1
    exit 1
}

function mount_unmount {
    if [ -z "$1" ]; then
        die 'Invalid argument given to mount_unmount'
    elif [[ "$1" == "mount" ]]; then
        zpool list | grep -q slave
        if [ "$?" = "0" ]; then
            echo Already mounted
            return 0
        fi
        echo Creating ggate device
        ggatel create -u 666 /mnt/backup/chaos/slavefile || die 'Unable to create GEOM provider from file'
        echo 'Sleeping for 5 seconds...'
        sleep 5
        echo Importing pool
        zpool import -R /slave slave || die 'Unable to import slave pool'
    elif [[ "$1" == "unmount" ]]; then
        echo Exporting pool
        zpool export slave || die 'Unable to export slave pool'
        ggatel destroy -u 666 || die 'Unable to destroy GEOM provider'
    fi
}

if [ ! -z "$1" ]; then
    case $1 in
        mount) mount_unmount mount; exit 0;;
        unmount) mount_unmount unmount; exit 0;;
        initial) initial; exit 0;;
        backup) ;;
        *) help;;
    esac
else
    help
fi

if [ ! -f "/mnt/backup/chaos/slavefile" ]; then
    echo 'Backup error! slavefile does not exist!' | mail -s "Backup error" root
    echo 'Slavefile does not exist!'
    exit 1
fi

mount_unmount mount

CURR=$(date +"backup-%F-%H%M")

echo Taking snapshots
zfs snapshot -r tank@$CURR || die "Unable to create $CURR snapshot"

echo Starting backup...
LAST=$(cat /root/.last-backup)
zfs send -R -I $LAST tank@$CURR | zfs recv -Fvd slave

echo $CURR > /root/.last-backup

mount_unmount unmount

echo Running rsync
rsync -av --delete /bootdir/boot exscape::backup-freebsd/chaos
rsync -av --delete /root exscape::backup-freebsd/chaos
rsync -av --delete ~serenity exscape::backup-freebsd/chaos

echo 'All done!'

-------------------

So, in *normal* cases, everything runs just fine. This is the case perhaps 90% of the time. In normal *crashes*, it hangs on export with the above backtrace. This time, all I know is that it crashes soon after starting the backup... during import, perhaps?

Regards,
Thomas

From owner-freebsd-fs@FreeBSD.ORG Thu Jul 9 12:36:28 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 0014F1065694; Thu, 9 Jul 2009 12:36:27 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id 472748FC20; Thu, 9 Jul 2009 12:36:27 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:38484 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.69) (envelope-from ) id 1MOsrN-0002ZH-4K; Thu, 09 Jul 2009 14:36:24 +0200 Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id 1F1D03F6A8; Thu, 9 Jul 2009 14:36:22 +0200 (CEST) Message-Id: <9FAC783B-5709-460B-B6DA-364DCD0DE8DA@exscape.org> From: Thomas Backman To: FreeBSD current , freebsd-fs@freebsd.org In-Reply-To: <766FFF07-181A-4180-B020-AA3EE46CF6F8@exscape.org> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v935.3) Date: Thu, 9 Jul 2009
14:36:19 +0200 References: <72163521-40BF-4764-8B74-5446A88DFBF8@exscape.org> <766FFF07-181A-4180-B020-AA3EE46CF6F8@exscape.org> X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1MOsrN-0002ZH-4K. X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MOsrN-0002ZH-4K a5e81229e71f083d827cc1ed16bcd84c Cc: Subject: Re: "New" ZFS crash on FS (pool?) unmount/export X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 09 Jul 2009 12:36:28 -0000 On Jul 9, 2009, at 11:01, Thomas Backman wrote: > On Jun 20, 2009, at 09:11, Thomas Backman wrote: > >> I just ran into this tonight. Not sure exactly what triggered it - >> the box stopped responding to pings at 02:07AM and it has a cron >> backup job using zfs send/recv at 02:00, so I'm guessing it's >> related, even though the backup probably should have finished >> before then... Hmm. Anyway. >> >> r194478. >> >> kernel trap 12 with interrupts disabled >> >> Fatal trap 12: page fault while in kernel mode >> cpuid = 0; apic id = 00 >> fault virtual address = 0x288 >> fault code = supervisor read data, page not present >> instruction pointer = 0x20:0xffffffff805a4989 >> stack pointer = 0x28:0xffffff803e8b57e0 >> frame pointer = 0x28:0xffffff803e8b5840 >> code segment = base 0x0, limit 0xfffff, type 0x1b >> = DPL 0, pres 1, long 1, def32 0, gran 1 >> processor eflags = resume, IOPL = 0 >> current process = 57514 (zpool) >> panic: from debugger >> cpuid = 0 >> Uptime: 10h22m13s >> Physical memory: 2027 MB >> >> (kgdb) bt >> #0 doadump () at pcpu.h:223 >> #1 0xffffffff8059c409 in boot (howto=260) at /usr/src/sys/kern/ >> kern_shutdown.c:419 >> #2 0xffffffff8059c85c in panic (fmt=Variable "fmt" is not available. >> ) at /usr/src/sys/kern/kern_shutdown.c:575 >> #3 0xffffffff801f1377 in db_panic (addr=Variable "addr" is not >> available. 
>> ) at /usr/src/sys/ddb/db_command.c:478 >> #4 0xffffffff801f1781 in db_command (last_cmdp=0xffffffff80c38620, >> cmd_table=Variable "cmd_table" is not available. >> ) at /usr/src/sys/ddb/db_command.c:445 >> #5 0xffffffff801f19d0 in db_command_loop () at /usr/src/sys/ddb/ >> db_command.c:498 >> #6 0xffffffff801f3969 in db_trap (type=Variable "type" is not >> available. >> ) at /usr/src/sys/ddb/db_main.c:229 >> #7 0xffffffff805ce465 in kdb_trap (type=12, code=0, >> tf=0xffffff803e8b5730) at /usr/src/sys/kern/subr_kdb.c:534 >> #8 0xffffffff8088715d in trap_fatal (frame=0xffffff803e8b5730, >> eva=Variable "eva" is not available. >> ) at /usr/src/sys/amd64/amd64/trap.c:847 >> #9 0xffffffff80887fb2 in trap (frame=0xffffff803e8b5730) at /usr/ >> src/sys/amd64/amd64/trap.c:345 >> #10 0xffffffff8086e007 in calltrap () at /usr/src/sys/amd64/amd64/ >> exception.S:223 >> #11 0xffffffff805a4989 in _sx_xlock_hard (sx=0xffffff0043557d50, >> tid=18446742975830720512, opts=Variable "opts" is not available. >> ) >> at /usr/src/sys/kern/kern_sx.c:575 >> #12 0xffffffff805a52fe in _sx_xlock (sx=Variable "sx" is not >> available. >> ) at sx.h:155 >> #13 0xffffffff80fe2995 in zfs_freebsd_reclaim () from /boot/kernel/ >> zfs.ko >> #14 0xffffffff808cefca in VOP_RECLAIM_APV (vop=0xffffff0043557d38, >> a=0xffffff0043557d50) at vnode_if.c:1926 >> #15 0xffffffff80626f6e in vgonel (vp=0xffffff00437a7938) at >> vnode_if.h:830 >> #16 0xffffffff8062b528 in vflush (mp=0xffffff0060f2a000, >> rootrefs=0, flags=0, td=0xffffff0061528000) >> at /usr/src/sys/kern/vfs_subr.c:2450 >> #17 0xffffffff80fdd3a8 in zfs_umount () from /boot/kernel/zfs.ko >> #18 0xffffffff8062420a in dounmount (mp=0xffffff0060f2a000, >> flags=1626513408, td=Variable "td" is not available. 
>> ) >> at /usr/src/sys/kern/vfs_mount.c:1287 >> #19 0xffffffff80624975 in unmount (td=0xffffff0061528000, >> uap=0xffffff803e8b5c00) >> at /usr/src/sys/kern/vfs_mount.c:1172 >> #20 0xffffffff8088783f in syscall (frame=0xffffff803e8b5c90) at / >> usr/src/sys/amd64/amd64/trap.c:984 >> #21 0xffffffff8086e290 in Xfast_syscall () at /usr/src/sys/amd64/ >> amd64/exception.S:364 >> #22 0x000000080104e49c in ?? () >> Previous frame inner to this frame (corrupt stack?) > > > Here's the script in its relevant entirety [...] > > #!/bin/bash > > PATH="$PATH:/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/ > sbin" > > function die() { > echo "$@" 2>&1 > zpool export slave 2>&1 > /dev/null > ggatel destroy -u 666 2>&1 > /dev/null > exit 1 > } > > function mount_unmount { > if [ -z "$1" ]; then > die 'Invalid argument given to mount_unmount' > elif [[ "$1" == "mount" ]]; then > zpool list | grep -q slave > if [ "$?" = "0" ]; then > echo Already mounted > return 0 > fi > echo Creating ggate device > ggatel create -u 666 /mnt/backup/chaos/slavefile || die 'Unable to > create GEOM provider from file' > echo 'Sleeping for 5 seconds...' > sleep 5 > echo Importing pool > zpool import -R /slave slave || die 'Unable to import slave pool' > elif [[ "$1" == "unmount" ]]; then > echo Exporting pool > zpool export slave || die 'Unable to export slave pool' > ggatel destroy -u 666 || die 'Unable to destroy GEOM provider' > fi > } > > f [ ! -z "$1" ]; then > case $1 in > mount) mount_unmount mount; exit 0;; > unmount) mount_unmount unmount; exit 0;; > initial) initial; exit 0;; > backup) ;; > *) help;; > esac > else > help > fi > > if [ ! -f "/mnt/backup/chaos/slavefile" ]; then > echo 'Backup error! slavefile does not exist!' | mail -s "Backup > error" root > echo 'Slavefile does not exist!' 
> exit 1
> fi
>
> mount_unmount mount
>
> CURR=$(date +"backup-%F-%H%M")
>
> echo Taking snapshots
> zfs snapshot -r tank@$CURR || die 'Unable to create $CURR snapshot'
>
> echo Starting backup...
> LAST=$(cat /root/.last-backup)
> zfs send -R -I $LAST tank@$CURR | zfs recv -Fvd slave
>
> echo $CURR > /root/.last-backup
>
> mount_unmount unmount
>
> echo Running rsync
> rsync -av --delete /bootdir/boot exscape::backup-freebsd/chaos
> rsync -av --delete /root exscape::backup-freebsd/chaos
> rsync -av --delete ~serenity exscape::backup-freebsd/chaos
>
> echo 'All done!'
>
> -------------------

Sorry for the monologue, but I ran into this again, this time with a panic again. Similar but not identical to the old one.

OK, so I figured I would update my "untouched" src clone (used to save bandwidth from the FBSD SVN server when I feel I need to start with a *clean* source tree), now that there have been quite a few changes since that revision. I pretty much did this (if other shells are different, !$ in bash is the last argument to the previous command):

1) Clean up /usr/src from "my" stuff
2) svn update
3) svn diff and svn status, to make sure it was clean
4) zfs promote tank/usr/src ## usr/src was a clone of the untouched, read-only fs "tank/usr/src_r194478-UNTOUCHED"
5) zfs destroy -r tank/usr/src_r194478-UNTOUCHED ## The old one, obviously
6) zfs snapshot tank/usr/src@r195488_UNTOUCHED
7) zfs clone !$ tank/usr/src_r195488-UNTOUCHED
8) zfs promote !$
9) zfs set readonly=on !$
10) And, in case it may matter, I slightly modified the contents of /usr/src afterwards (applied two patches that aren't merged into HEAD (yet?)).

... I THINK that's it. Since my bash_history got killed in the panic, I may be slightly wrong.
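For clarity, here is a sketch of what the !$ history expansions in steps 4-9 above resolve to when written out in full; the dataset and snapshot names are the ones that appear in the steps themselves:

```shell
# Steps 4-9 with !$ expanded by hand (each !$ is the last argument
# of the previous command).
zfs promote tank/usr/src                                                  # step 4
zfs destroy -r tank/usr/src_r194478-UNTOUCHED                             # step 5
zfs snapshot tank/usr/src@r195488_UNTOUCHED                               # step 6
zfs clone tank/usr/src@r195488_UNTOUCHED tank/usr/src_r195488-UNTOUCHED   # step 7
zfs promote tank/usr/src_r195488-UNTOUCHED                                # step 8
zfs set readonly=on tank/usr/src_r195488-UNTOUCHED                        # step 9
```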
In any case, usr/src is now a clone of the readonly UNTOUCHED fs:

[root@chaos ~]# zfs get origin tank/usr/src
NAME          PROPERTY  VALUE                                             SOURCE
tank/usr/src  origin    tank/usr/src_r195488_UNTOUCHED@r195488_UNTOUCHED  -

I then ran the backup script just posted in this thread:

[root@chaos ~]# bash backup.sh backup
Creating ggate device
Sleeping for 5 seconds...
Importing pool
Taking snapshots
Starting backup...
attempting destroy slave/usr/src_r194478-UNTOUCHED@backup-20090709-1250
success
attempting destroy slave/usr/src_r194478-UNTOUCHED@r194478-UNTOUCHED
failed - trying rename to slave/usr/src_r194478-UNTOUCHED@recv-38883-1
failed (2) - will try again on next pass
attempting destroy slave/usr/src_r194478-UNTOUCHED@backup-20090709-1235
success
attempting destroy slave/usr/src_r194478-UNTOUCHED
failed - trying rename to slave/recv-38883-2
failed (2) - will try again on next pass
promoting slave/usr/src
another pass:
attempting destroy slave/usr/src_r194478-UNTOUCHED
success
attempting destroy slave/usr/src@r194478-UNTOUCHED
success
attempting rename slave/usr/src to slave/usr/src_r195488_UNTOUCHED
success
receiving incremental stream of tank@backup-20090709-1328 into slave@backup-20090709-1328
received 312B stream in 1 seconds (312B/sec)
receiving incremental stream of tank/tmp@backup-20090709-1328 into slave/tmp@backup-20090709-1328
received 119KB stream in 1 seconds (119KB/sec)
receiving incremental stream of tank/var@backup-20090709-1328 into slave/var@backup-20090709-1328
received 211KB stream in 1 seconds (211KB/sec)
receiving incremental stream of tank/var/log@backup-20090709-1328 into slave/var/log@backup-20090709-1328
received 468KB stream in 1 seconds (468KB/sec)
receiving incremental stream of tank/var/crash@backup-20090709-1328 into slave/var/crash@backup-20090709-1328
received 312B stream in 1 seconds (312B/sec)
receiving incremental stream of tank/root@backup-20090709-1328 into slave/root@backup-20090709-1328
received 156KB stream in 1 seconds (156KB/sec)
receiving incremental stream of tank/usr@backup-20090709-1328 into slave/usr@backup-20090709-1328
received 302KB stream in 1 seconds (302KB/sec)
receiving incremental stream of tank/usr/obj@backup-20090709-1328 into slave/usr/obj@backup-20090709-1328
received 8.52MB stream in 8 seconds (1.07MB/sec)
receiving incremental stream of tank/usr/src_r195488_UNTOUCHED@r195488_UNTOUCHED into slave/usr/src_r195488_UNTOUCHED@r195488_UNTOUCHED
received 112MB stream in 43 seconds (2.60MB/sec)
receiving incremental stream of tank/usr/src_r195488_UNTOUCHED@backup-20090709-1328 into slave/usr/src_r195488_UNTOUCHED@backup-20090709-1328
received 312B stream in 1 seconds (312B/sec)
receiving incremental stream of tank/usr/ports@backup-20090709-1328 into slave/usr/ports@backup-20090709-1328
received 312B stream in 1 seconds (312B/sec)
receiving incremental stream of tank/usr/ports/distfiles@backup-20090709-1328 into slave/usr/ports/distfiles@backup-20090709-1328
received 312B stream in 1 seconds (312B/sec)
found clone origin slave/usr/src_r195488_UNTOUCHED@r195488_UNTOUCHED
receiving incremental stream of tank/usr/src@backup-20090709-1328 into slave/usr/src@backup-20090709-1328
received 216KB stream in 1 seconds (216KB/sec)
local fs slave/usr/src does not have fromsnap (backup-20090709-1250 in stream); must have been deleted locally; ignoring
Exporting pool
Read from remote host 192.168.1.10: Operation timed out

... and the DDB output (first part copy/paste from kgdb, second part handwritten since the kgdb BT was totally broken, as I expected.)
Fatal trap 12: page fault while in kernel mode cpuid = 0; apic id = 00 fault virtual address = 0x70 fault code = supervisor write data, page not present instruction pointer = 0x20:0xffffffff8036e855 stack pointer = 0x28:0xffffff803ea637d0 frame pointer = 0x28:0xffffff803ea637e0 code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 38905 (zpool) _sx_slock() dmu_buf_update_user()+0x47 zfs_znode_dmu_fini() zfs_freebsd_reclaim() VOP_RECLAIM_APV() vgonel() vflush() zfs_umount() dounmount() unmount() syscall() Xfast_syscall() BTW, what's with the "local fs slave/usr/src does not have fromsnap (backup-20090709-1250 in stream); must have been deleted locally; ignoring" part? This is what I get when I try an incremental backup now: [root@chaos ~]# bash backup.sh backup Creating ggate device Sleeping for 5 seconds... Importing pool Taking snapshots Starting backup... local fs slave/usr/src does not have fromsnap (backup-20090709-1250 in stream); must have been deleted locally; ignoring receiving incremental stream of tank@backup-20090709-1328 into slave@backup-20090709-1328 snap slave@backup-20090709-1328 already exists; ignoring local fs slave/usr/src does not have fromsnap (backup-20090709-1250 in stream); must have been deleted locally; ignoring warning: cannot send 'tank/tmp@backup-20090709-1328': Broken pipe warning: cannot send 'tank/tmp@backup-20090709-1406': Broken pipe Exporting pool Running rsync ... rsync runs, no panic, but no ZFS backup, either. Guess it's time for another "initial" backup, i.e. start all over with 0 snapshots... The initial backup worked just fine, it found the clone/origin etc. and made it work. 
Stripped from comments and echo statements, the function is simply:

function initial {
    for BACK in $(zfs list -t snapshot -H -r tank | awk '{print $1}'); do zfs destroy $BACK; done
    zpool destroy slave 2>/dev/null; sleep 3; ggatel destroy -u 666 2>/dev/null; sleep 3  # Clean up if needed
    ggatel create -u 666 /mnt/backup/chaos/slavefile; sleep 3
    zpool create -f -R /slave slave ggate666 && NOW=$(date +"backup-%Y%m%d-%H%M") || die 'Unable to create pool'
    zfs snapshot -r tank@$NOW || die 'Unable to snapshot'
    zfs send -R tank@$NOW | zfs recv -vFd slave
    mount_unmount unmount
    echo $NOW > /root/.last-backup
}

After that, incrementals are fine again.

Regards,
Thomas

From owner-freebsd-fs@FreeBSD.ORG Thu Jul 9 14:50:04 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 4CF971065672 for ; Thu, 9 Jul 2009 14:50:04 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 3B4228FC12 for ; Thu, 9 Jul 2009 14:50:04 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n69Eo3hY009384 for ; Thu, 9 Jul 2009 14:50:03 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n69Eo3bp009383; Thu, 9 Jul 2009 14:50:03 GMT (envelope-from gnats) Date: Thu, 9 Jul 2009 14:50:03 GMT Message-Id: <200907091450.n69Eo3bp009383@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Spartak Radchenko Cc: Subject: Re: kern/127420: [gjournal] [panic] Journal overflow on gmirrored gjournal X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Spartak Radchenko List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
X-List-Received-Date: Thu, 09 Jul 2009 14:50:04 -0000 The following reply was made to PR kern/127420; it has been noted by GNATS. From: Spartak Radchenko To: bug-followup@FreeBSD.org, ruben@verweg.com Cc: Subject: Re: kern/127420: [gjournal] [panic] Journal overflow on gmirrored gjournal Date: Thu, 09 Jul 2009 18:22:55 +0400 I have the same problem. FreeBSD 7.2-RELEASE amd64, gjournal on gmirrored volume (local drive + geom_gate mirrored). I am trying to make something like a HA cluster using freevrrpd, ggate, gmirror and gjournal. It generally works, but every time a server with ggated running goes down (I use hardware reset for testing) first ggate0 device is removed from gmirrored volume on master as it should, next master panics with "gjournal overflow" message. From owner-freebsd-fs@FreeBSD.ORG Fri Jul 10 03:54:07 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 222F0106564A; Fri, 10 Jul 2009 03:54:07 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with
ESMTP id EC0698FC0A; Fri, 10 Jul 2009 03:54:06 +0000 (UTC) (envelope-from linimon@FreeBSD.org) Received: from freefall.freebsd.org (linimon@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n6A3s6eZ029579; Fri, 10 Jul 2009 03:54:06 GMT (envelope-from linimon@freefall.freebsd.org) Received: (from linimon@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n6A3s69v029575; Fri, 10 Jul 2009 03:54:06 GMT (envelope-from linimon) Date: Fri, 10 Jul 2009 03:54:06 GMT Message-Id: <200907100354.n6A3s69v029575@freefall.freebsd.org> To: linimon@FreeBSD.org, freebsd-bugs@FreeBSD.org, freebsd-fs@FreeBSD.org From: linimon@FreeBSD.org Cc: Subject: Re: kern/136601: [zfs] tar: Couldn't list extended attributes: Read-only file system X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 10 Jul 2009 03:54:07 -0000 Old Synopsis: tar: Couldn't list extended attributes: Read-only file system New Synopsis: [zfs] tar: Couldn't list extended attributes: Read-only file system Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Fri Jul 10 03:53:05 UTC 2009 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=136601 From owner-freebsd-fs@FreeBSD.ORG Fri Jul 10 19:01:48 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id C9FBA106566B; Fri, 10 Jul 2009 19:01:48 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id 464B18FC16; Fri, 10 Jul 2009 19:01:48 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:45621 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.69) (envelope-from ) id 1MPLLl-0007fh-4v; Fri, 10 Jul 2009 21:01:39 +0200 Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id 8E192F9973; Fri, 10 Jul 2009 21:01:34 +0200 (CEST) Message-Id: From: Thomas Backman To: FreeBSD current In-Reply-To: <72163521-40BF-4764-8B74-5446A88DFBF8@exscape.org> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v935.3) Date: Fri, 10 Jul 2009 21:01:33 +0200 References: <72163521-40BF-4764-8B74-5446A88DFBF8@exscape.org> X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1MPLLl-0007fh-4v. X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MPLLl-0007fh-4v b688837e5904323df4ea64d43091b07b Cc: freebsd-fs@freebsd.org Subject: Reproducible ZFS panic, w/ script (Was: "New" ZFS crash on FS (pool?) 
unmount/export) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 10 Jul 2009 19:01:49 -0000 On Jun 20, 2009, at 09:11, Thomas Backman wrote: > I just ran into this tonight. Not sure exactly what triggered it - > the box stopped responding to pings at 02:07AM and it has a cron > backup job using zfs send/recv at 02:00, so I'm guessing it's > related, even though the backup probably should have finished before > then... Hmm. Anyway. > > r194478. > > kernel trap 12 with interrupts disabled > > Fatal trap 12: page fault while in kernel mode > cpuid = 0; apic id = 00 > fault virtual address = 0x288 > fault code = supervisor read data, page not present > instruction pointer = 0x20:0xffffffff805a4989 > stack pointer = 0x28:0xffffff803e8b57e0 > frame pointer = 0x28:0xffffff803e8b5840 > code segment = base 0x0, limit 0xfffff, type 0x1b > = DPL 0, pres 1, long 1, def32 0, gran 1 > processor eflags = resume, IOPL = 0 > current process = 57514 (zpool) > panic: from debugger > cpuid = 0 > Uptime: 10h22m13s > Physical memory: 2027 MB > > (kgdb) bt > ... > #9 0xffffffff80887fb2 in trap (frame=0xffffff803e8b5730) at /usr/ > src/sys/amd64/amd64/trap.c:345 > #10 0xffffffff8086e007 in calltrap () at /usr/src/sys/amd64/amd64/ > exception.S:223 > #11 0xffffffff805a4989 in _sx_xlock_hard (sx=0xffffff0043557d50, > tid=18446742975830720512, opts=Variable "opts" is not available. > ) > at /usr/src/sys/kern/kern_sx.c:575 > #12 0xffffffff805a52fe in _sx_xlock (sx=Variable "sx" is not > available. 
> ) at sx.h:155 > #13 0xffffffff80fe2995 in zfs_freebsd_reclaim () from /boot/kernel/ > zfs.ko > #14 0xffffffff808cefca in VOP_RECLAIM_APV (vop=0xffffff0043557d38, > a=0xffffff0043557d50) at vnode_if.c:1926 > #15 0xffffffff80626f6e in vgonel (vp=0xffffff00437a7938) at > vnode_if.h:830 > #16 0xffffffff8062b528 in vflush (mp=0xffffff0060f2a000, rootrefs=0, > flags=0, td=0xffffff0061528000) > at /usr/src/sys/kern/vfs_subr.c:2450 > #17 0xffffffff80fdd3a8 in zfs_umount () from /boot/kernel/zfs.ko > #18 0xffffffff8062420a in dounmount (mp=0xffffff0060f2a000, > flags=1626513408, td=Variable "td" is not available. > ) > at /usr/src/sys/kern/vfs_mount.c:1287 > #19 0xffffffff80624975 in unmount (td=0xffffff0061528000, > uap=0xffffff803e8b5c00) > at /usr/src/sys/kern/vfs_mount.c:1172 > #20 0xffffffff8088783f in syscall (frame=0xffffff803e8b5c90) at /usr/ > src/sys/amd64/amd64/trap.c:984 > #21 0xffffffff8086e290 in Xfast_syscall () at /usr/src/sys/amd64/ > amd64/exception.S:364 > #22 0x000000080104e49c in ?? () > Previous frame inner to this frame (corrupt stack?) > > BTW, I got a (one) "force unmount is experimental" on the console. > On regular shutdown I usually get one per filesystem, it seems (at > least 10) and this pool should contain exactly as many filesystems > as the root pool since it's a copy of it. On running the backup > script manually post-crash, though, I didn't get any. OK, I've finally written a script that reproduces this panic for me every time (6-7 tries in a row should be good enough, plus one on another box). It would be great to have a few testers - and if you do test it, PLEASE report your results here - positive or negative! The main aim is, of course, to provide ZFS devs with their own core dumps, DDB consoles and whatnot to possibly resolve this issue. It requires: * bash (or something compatible, for the script itself) * a box you're willing to crash. 
;) * ~200MB free on your /root/ partition (or just edit the paths at the top of the ugly hack of a script.) If you use ggatel with silly high numbers (1482 and 1675 - chosen since they're unlikely to be used), again, edit at the top. * The libzfs_sendrecv.c patch - see http://lists.freebsd.org/pipermail/freebsd-current/2009-May/006814.html or fetch the patch from my server (if it's down, it's for less than 2 minutes - my modem restarts every 181 minutes atm...): http://exscape.org/temp/libzfs_sendrecv.patch Here's a oneliner to fetch, patch, compile and install: cd /usr/src && fetch http://exscape.org/temp/libzfs_sendrecv.patch && patch -p0 < libzfs_sendrecv.patch && cd /usr/src/cddl/lib/libzfs && make && make install No reboot required (nor do you need a reboot to revert, see the end of the mail). PLEASE NOTE that without this patch, you'll just get a segfault and then an infinite loop of backups (since send/recv doesn't work in HEAD since at least February, I'm guessing since ZFS v13). Too bad that the patch is needed, since I suspect quite a few testers will bail out there... If not (great!), here's the script to wreak havoc: http://exscape.org/temp/zfs_clone_panic.sh After fetching the above, just run the script like "bash zfs_clone_panic.sh crash" and you should have a panic in about 20 seconds. The other possible arguments are mostly there because I'm too lazy to clean it up now that it "works". Back to the panic: The problem appears to be related to clones somehow - the first two times I ran into this panic (in real use) was when messing with clone/promote... so that's what this script does.
Here's the backtrace it produces, and some info (show lockedvnods, if that helps at all): kernel trap 12 with interrupts disabled Fatal trap 12: page fault while in kernel mode cpuid = 0; apic id = 00 fault virtual address = 0x288 fault code = supervisor read data, page not present instruction pointer = 0x20:0xffffffff8036df39 stack pointer = 0x28:0xffffff803ea6d7e0 frame pointer = 0x28:0xffffff803ea6d840 code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = resume, IOPL = 0 current process = 16367 (zpool) 0xffffff000510fce8: tag zfs, type VDIR usecount 1, writecount 0, refcount 1 mountedhere 0xffffff0002cd8bc0 flags () lock type zfs: EXCL by thread 0xffffff0024c5d000 (pid 16367) 0xffffff00050dc3b0: tag zfs, type VDIR usecount 0, writecount 0, refcount 1 mountedhere 0 flags (VI_DOOMED) VI_LOCKed lock type zfs: EXCL by thread 0xffffff0024c5d000 (pid 16367) panic: from debugger ... ... boot, panic, trap etc ... #10 0xffffffff805d36a7 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:223 #11 0xffffffff8036df39 in _sx_xlock_hard (sx=0xffffff005d11a8e8, tid=18446742974814867456, opts=Variable "opts" is not available. ) at /usr/src/sys/kern/kern_sx.c:575 #12 0xffffffff8036e8ae in _sx_xlock (sx=Variable "sx" is not available. ) at sx.h:155 #13 0xffffffff80b889e5 in zfs_freebsd_reclaim () from /boot/kernel/ zfs.ko #14 0xffffffff8062447a in VOP_RECLAIM_APV (vop=0xffffff005d11a8d0, a=0xffffff005d11a8e8) at vnode_if.c:1926 #15 0xffffffff803f3b0e in vgonel (vp=0xffffff00050dc3b0) at vnode_if.h: 830 #16 0xffffffff803f80c8 in vflush (mp=0xffffff0002cd8bc0, rootrefs=0, flags=0, td=0xffffff0024c5d000) at /usr/src/sys/kern/vfs_subr.c:2449 #17 0xffffffff80b833d8 in zfs_umount () from /boot/kernel/zfs.ko #18 0xffffffff803f0d3a in dounmount (mp=0xffffff0002cd8bc0, flags=47025088, td=Variable "td" is not available. 
) at /usr/src/sys/kern/vfs_mount.c:1289 #19 0xffffffff803f1568 in unmount (td=0xffffff0024c5d000, uap=0xffffff803ea6dc00) at /usr/src/sys/kern/vfs_mount.c:1174 #20 0xffffffff805ed4cf in syscall (frame=0xffffff803ea6dc90) at /usr/src/sys/amd64/amd64/trap.c:984 #21 0xffffffff805d3930 in Xfast_syscall () at /usr/src/sys/amd64/ amd64/exception.S:364 #22 0x000000080104e9ac in ?? () Previous frame inner to this frame (corrupt stack?) Regards, Thomas PS. A oneliner to get back to the non-patched state, if you're using SVN (if not, sorry): cd /usr/src && svn revert cddl/contrib/opensolaris/lib/libzfs/common/ libzfs_sendrecv.c && cd /usr/src/cddl/lib/libzfs && make && make install DS. Hope this helps track this down, as I spent quite a while on finding the root cause (the clones, in one way or another), writing the script, this mail, etc. From owner-freebsd-fs@FreeBSD.ORG Fri Jul 10 19:25:41 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B261A106566B; Fri, 10 Jul 2009 19:25:41 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id 6BBC08FC22; Fri, 10 Jul 2009 19:25:41 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:43500 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.69) (envelope-from ) id 1MPLix-0002we-5H; Fri, 10 Jul 2009 21:25:37 +0200 Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id 92DF311E9F0; Fri, 10 Jul 2009 21:25:35 +0200 (CEST) Message-Id: <45291598-D091-4E90-B968-22E59BEB3846@exscape.org> From: Thomas Backman To: FreeBSD current In-Reply-To: Content-Type: text/plain; charset=US-ASCII; 
format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v935.3) Date: Fri, 10 Jul 2009 21:25:34 +0200 References: <72163521-40BF-4764-8B74-5446A88DFBF8@exscape.org> X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1MPLix-0002we-5H. X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MPLix-0002we-5H d61ea09685ba684f741fc63673280140 Cc: freebsd-fs@freebsd.org Subject: Re: Reproducible ZFS panic, w/ script (Was: "New" ZFS crash on FS (pool?) unmount/export) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 10 Jul 2009 19:25:42 -0000 On Jul 10, 2009, at 21:01, Thomas Backman wrote: > OK, I've finally written a script that reproduces this panic for me > every time (6-7 tries in a row should be good enough, plus one on > another box). It would be great to have a few testers - and if you > do test it, PLEASE report your results here - positive or negative! > The main aim is, of course, to provide ZFS devs with their own core > dumps, DDB consoles and whatnot to possibly resolve this issue. > [...] > Back to the panic: > The problem appears to be related to clones somehow - the first two > times I ran in to this panic (in real use) was when messing with > clone/promote... so that's what this script does. Damnit. Very sorry for the noise, but I just noticed that it IS NOT related to the clones. It crashes with the clone/promote lines (#77-78) commented out, too... Now I'm stumped as to where the issue is, I hope the previous mail can help track it down. A very similar setup runs every night, and it "only" crashes about one time in 10 or so. This crashes *every* time and I don't really see the difference between the commands the scripts run. Oh well... 
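For readers unfamiliar with the clone/promote step that was just ruled out, a minimal sketch of what those two commented-out script lines do follows. The dataset and snapshot names are hypothetical, and the commands are only echoed rather than executed, since running them requires a real pool:

```shell
#!/bin/sh
# Sketch of a ZFS clone/promote sequence like the one commented out of
# the reproduction script. Dataset names are hypothetical; commands are
# echoed, not executed, since they need a real pool.
run() { echo "$@"; }

run "zfs snapshot tank/data@base"        # point-in-time snapshot
run "zfs clone tank/data@base tank/copy" # writable clone backed by @base
# promote reverses the parent/child relationship: tank/copy takes
# ownership of @base and tank/data becomes the dependent clone.
run "zfs promote tank/copy"
```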
Regards, Thomas From owner-freebsd-fs@FreeBSD.ORG Fri Jul 10 19:27:45 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 606871065673; Fri, 10 Jul 2009 19:27:45 +0000 (UTC) (envelope-from mat.macy@gmail.com) Received: from an-out-0708.google.com (an-out-0708.google.com [209.85.132.247]) by mx1.freebsd.org (Postfix) with ESMTP id 00A2F8FC24; Fri, 10 Jul 2009 19:27:44 +0000 (UTC) (envelope-from mat.macy@gmail.com) Received: by an-out-0708.google.com with SMTP id d14so573354and.13 for ; Fri, 10 Jul 2009 12:27:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:sender:received:in-reply-to :references:date:x-google-sender-auth:message-id:subject:from:to:cc :content-type:content-transfer-encoding; bh=uP3vPsKsWUB9ojOKhaG4bPYmyX9PDs6vzTSyugPrcAc=; b=i7eo2LL4SyIzM0OwEJ8kHo59EKz1jitBapP6KLgQExA+SmvlznNUHtfAvrs77hdjYW 5LE7kEmMghGDVWt7AqaPkHT4q6o8GbMrrcPUzzDKxlQckDdv6yo6hEsRNJSJiMSYYlA0 7qRu000lzTtT9E3wGy8iVvF+DZPnFipuz6sgw= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=jYXpMJv7rmiv4ajZqOSfqXDyrSbK5VcJJXQXYtq+0UuIg5Is1Ptp0Hb3m3ZyCTbGgX U+z7TW7Dwj2JVkAiHKCo3qV/IUGvjRUlAuEqQQIUUA7JBwk5SVQaMyKan4ncTvuBcR9W A5lEoERfCx9LKl6szwjMMNEcWayGb56idZwFw= MIME-Version: 1.0 Sender: mat.macy@gmail.com Received: by 10.100.254.12 with SMTP id b12mr3210057ani.43.1247254064410; Fri, 10 Jul 2009 12:27:44 -0700 (PDT) In-Reply-To: <45291598-D091-4E90-B968-22E59BEB3846@exscape.org> References: <72163521-40BF-4764-8B74-5446A88DFBF8@exscape.org> <45291598-D091-4E90-B968-22E59BEB3846@exscape.org> Date: Fri, 10 Jul 2009 12:27:44 -0700 X-Google-Sender-Auth: 85fa16ab9ade5e38 Message-ID: <3c1674c90907101227ueab78eem6f8c5c7fdf0337cc@mail.gmail.com> From: 
Kip Macy To: Thomas Backman Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: freebsd-fs@freebsd.org, FreeBSD current Subject: Re: Reproducible ZFS panic, w/ script (Was: "New" ZFS crash on FS (pool?) unmount/export) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 10 Jul 2009 19:27:46 -0000 "zfs export" does a forced unmount. We may not be properly handling dangling references. -Kip On Fri, Jul 10, 2009 at 12:25 PM, Thomas Backman wrote: > On Jul 10, 2009, at 21:01, Thomas Backman wrote: >> >> OK, I've finally written a script that reproduces this panic for me every >> time (6-7 tries in a row should be good enough, plus one on another box). It >> would be great to have a few testers - and if you do test it, PLEASE report >> your results here - positive or negative! >> The main aim is, of course, to provide ZFS devs with their own core dumps, >> DDB consoles and whatnot to possibly resolve this issue. >> [...] >> Back to the panic: >> The problem appears to be related to clones somehow - the first two times >> I ran in to this panic (in real use) was when messing with clone/promote... >> so that's what this script does. > > Damnit. Very sorry for the noise, but I just noticed that it IS NOT related > to the clones. It crashes with the clone/promote lines (#77-78) commented > out, too... > Now I'm stumped as to where the issue is, I hope the previous mail can help > track it down. > A very similar setup runs every night, and it "only" crashes about one time > in 10 or so. This crashes *every* time and I don't really see the difference > between the commands the scripts run. Oh well... 
> > Regards, > Thomas > _______________________________________________ > freebsd-current@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-current > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org" > -- When bad men combine, the good must associate; else they will fall one by one, an unpitied sacrifice in a contemptible struggle. Edmund Burke From owner-freebsd-fs@FreeBSD.ORG Fri Jul 10 20:01:30 2009 Return-Path: Delivered-To: freebsd-fs@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 477AA106568A; Fri, 10 Jul 2009 20:01:30 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id F16368FC19; Fri, 10 Jul 2009 20:01:29 +0000 (UTC) (envelope-from serenity@exscape.org) Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:55549 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.69) (envelope-from ) id 1MPMHJ-0005mt-57; Fri, 10 Jul 2009 22:01:08 +0200 Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by mx.exscape.org (Postfix) with ESMTPSA id 3FED21386E2; Fri, 10 Jul 2009 22:01:05 +0200 (CEST) Message-Id: From: Thomas Backman To: Kip Macy In-Reply-To: <3c1674c90907101227ueab78eem6f8c5c7fdf0337cc@mail.gmail.com> Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes Content-Transfer-Encoding: 7bit Mime-Version: 1.0 (Apple Message framework v935.3) Date: Fri, 10 Jul 2009 22:01:04 +0200 References: <72163521-40BF-4764-8B74-5446A88DFBF8@exscape.org> <45291598-D091-4E90-B968-22E59BEB3846@exscape.org> <3c1674c90907101227ueab78eem6f8c5c7fdf0337cc@mail.gmail.com> X-Mailer: Apple Mail (2.935.3) X-Originating-IP: 83.253.252.234 X-Scan-Result: No virus found in message 1MPMHJ-0005mt-57. 
X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MPMHJ-0005mt-57 c319037c8c265a456fa9b1f910fe5d59 Cc: freebsd-fs@freebsd.org, FreeBSD current Subject: Re: Reproducible ZFS panic, w/ script (Was: "New" ZFS crash on FS (pool?) unmount/export) X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 10 Jul 2009 20:01:30 -0000 On Jul 10, 2009, at 21:27, Kip Macy wrote: > "zfs export" does a forced unmount. We may not be properly handling > dangling references. > > -Kip > > On Fri, Jul 10, 2009 at 12:25 PM, Thomas > Backman wrote: >> On Jul 10, 2009, at 21:01, Thomas Backman wrote: >> ... Just one more thing to add for me today: the crash always happens when exporting the slave. Constant send/recv loops multiple times a second, no sweat. Import/export of both pools multiple times a second, without any send/recv in between them, no sweat. Combined, however, it panics on "zpool export crashtestslave". (I verified this twice, once by changing stress() to simply run loads of incremental backups for a few minutes, break, and export the pools manually. Both times, the master pool was no problem, and it immediately panics on exporting the slave.)
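The isolation described above (send/recv alone is fine, import/export alone is fine, only the combination panics on exporting the slave) can be sketched as a minimal harness. The slave pool name "crashtestslave" comes from the mail; the master name and iteration count are guesses, and commands are echoed rather than executed, since the real sequence panics the kernel:

```shell
#!/bin/sh
# Sketch of the combined sequence that reportedly panics on exporting
# the receiving pool. "crashtest" is a hypothetical master pool name.
# Commands are only echoed; swap run() for eval on a box you are
# willing to crash.
MASTER=crashtest
SLAVE=crashtestslave
run() { echo "$@"; }

i=0
while [ $i -lt 3 ]; do
    run "zfs snapshot -r $MASTER@stress$i"
    run "zfs send -R $MASTER@stress$i | zfs recv -vFd $SLAVE"
    i=$((i + 1))
done
run "zpool export $MASTER"   # reportedly harmless on its own
run "zpool export $SLAVE"    # panics after the send/recv loop above
```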
Regards, Thomas From owner-freebsd-fs@FreeBSD.ORG Sat Jul 11 02:30:03 2009 Return-Path: Delivered-To: freebsd-fs@hub.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 64B061065670 for ; Sat, 11 Jul 2009 02:30:03 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 5289A8FC12 for ; Sat, 11 Jul 2009 02:30:03 +0000 (UTC) (envelope-from gnats@FreeBSD.org) Received: from freefall.freebsd.org (gnats@localhost [127.0.0.1]) by freefall.freebsd.org (8.14.3/8.14.3) with ESMTP id n6B2U3fk090383 for ; Sat, 11 Jul 2009 02:30:03 GMT (envelope-from gnats@freefall.freebsd.org) Received: (from gnats@localhost) by freefall.freebsd.org (8.14.3/8.14.3/Submit) id n6B2U3w8090380; Sat, 11 Jul 2009 02:30:03 GMT (envelope-from gnats) Date: Sat, 11 Jul 2009 02:30:03 GMT Message-Id: <200907110230.n6B2U3w8090380@freefall.freebsd.org> To: freebsd-fs@FreeBSD.org From: Anonymous Cc: Subject: Re: kern/129148: [zfs] [panic] panic on concurrent writing & rollback X-BeenThere: freebsd-fs@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Anonymous List-Id: Filesystems List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 11 Jul 2009 02:30:03 -0000 The following reply was made to PR kern/129148; it has been noted by GNATS. From: Anonymous To: bug-followup@FreeBSD.org Cc: Subject: Re: kern/129148: [zfs] [panic] panic on concurrent writing & rollback Date: Sat, 11 Jul 2009 06:28:03 +0400 This panic makes rollback feature really unusable. I usually do: # cd /usr/src (svn checkout sources) # zfs snapshot q/usr/src@blah # zcat ~/some_big_patch.bz2 | patch -Efsp0 -F0 ... # zfs rollback q/usr/src@blah # zcat ~/another_big_patch.bz2 | patch -Efsp0 -F0 BANG! 
panics here Here is recent one for FreeBSD 8.0-BETA1 #0: Sat Jul 4 03:55:14 UTC 2009 root@almeida.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC i386 --- panic begins here --- $ qemu -no-kqemu -echr 3 -nographic \ -hda /dev/zvol/h/home/luser/freebsd-i386 \ -hdb /dev/zvol/h/home/luser/freebsd-i386-zpool [...] # zpool create q ad1 # sh crash.sh cannot open 'q/test': dataset does not exist load: 0.90 cmd: sh 66 [runnable] 2.53r 0.28u 1.92s 29% 1808k Fatal trap 12: page fault while in kernel mode cpuid = 0; apic id = 00 fault virtual address = 0x4c fault code = supervisor read, page not present instruction pointer = 0x20:0xc087c9c3 stack pointer = 0x28:0xc89f5790 frame pointer = 0x28:0xc89f57b0 code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, def32 1, gran 1 processor eflags = interrupt enabled, IOPL = 0 current process = 66 (sh) [thread pid 66 tid 100041 ] Stopped at _sx_xlock+0x43: movl 0x10(%ebx),%eax db> show all locks Process 66 (sh) thread 0xc238db40 (100041) exclusive lockmgr zfs (zfs) r = 0 (0xc2613270) locked @ /usr/src/sys/kern/vfs_subr.c:880 exclusive lockmgr zfs (zfs) r = 0 (0xc2613594) locked @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:856 db> show lockedvnods Locked vnodes 0xc261353c: tag zfs, type VDIR usecount 2, writecount 0, refcount 2 mountedhere 0 flags (VV_ROOT) lock type zfs: EXCL by thread 0xc238db40 (pid 66) 0xc2613218: tag zfs, type VREG usecount 0, writecount 0, refcount 1 mountedhere 0 flags (VI_DOOMED) lock type zfs: EXCL by thread 0xc238db40 (pid 66) db> show all pcpu Current CPU: 0 cpuid = 0 dynamic pcpu = 0x6aed54 curthread = 0xc238db40: pid 66 "sh" curpcb = 0xc89f5d90 fpcurthread = 0xc238db40: pid 66 "sh" idlethread = 0xc2156b40: pid 11 "idle: cpu0" APIC ID = 0 currentldt = 0x50 spin locks held: db> bt Tracing pid 66 tid 100041 td 0xc238db40 _sx_xlock(3c,0,c24ba14d,70f,c2617ae0,...) at _sx_xlock+0x43 dmu_buf_update_user(0,c2617ae0,0,0,0,...) 
at dmu_buf_update_user+0x35 zfs_znode_dmu_fini(c2617ae0,c24c2fed,1114,110b,c26d5000,...) at zfs_znode_dmu_fini+0x43 zfs_freebsd_reclaim(c89f5858,1,0,c2613218,c89f587c,...) at zfs_freebsd_reclaim+0xc0 VOP_RECLAIM_APV(c24c65a0,c89f5858,0,0,c261328c,...) at VOP_RECLAIM_APV+0xa5 vgonel(c261328c,0,c0c6567e,386,0,...) at vgonel+0x1a4 vnlru_free(c0f16ef0,0,c0c6567e,3a1,c2da47ac,...) at vnlru_free+0x2d5 getnewvnode(c24c0cfc,c237778c,c24c65a0,c89f58fc,c25baa80,...) at getnewvnode+0x4a zfs_znode_cache_constructor(c25bbe00,2,c24c1357,2fd,c2db8880,...) at zfs_znode_cache_constructor+0x2e zfs_znode_alloc(c26d5498,0,c24c1357,2fd,c89f5978,...) at zfs_znode_alloc+0x35 zfs_mknode(c2617bc8,c89f5a60,c2d97000,c2152080,0,...) at zfs_mknode+0x286 zfs_freebsd_create(c89f5ac8,c89f5ae0,0,0,c89f5ba8,...) at zfs_freebsd_create+0x722 VOP_CREATE_APV(c24c65a0,c89f5ac8,c89f5bd4,c89f5a60,0,...) at VOP_CREATE_APV+0xa5 vn_open_cred(c89f5ba8,c89f5c5c,1a4,0,c2152080,...) at vn_open_cred+0x200 vn_open(c89f5ba8,c89f5c5c,1a4,c238f310,14,...) at vn_open+0x3b kern_openat(c238db40,ffffff9c,28304378,0,602,...) at kern_openat+0x118 kern_open(c238db40,28304378,0,601,1b6,...) at kern_open+0x35 open(c238db40,c89f5cf8,c,c0c5ee4b,c0d3c0ac,...) 
at open+0x30 syscall(c89f5d38) at syscall+0x2a3 Xint0x80_syscall() at Xint0x80_syscall+0x20 --- syscall (5, FreeBSD ELF32, open), eip = 0x281d8fe3, esp = 0xbfbfe6cc, ebp = 0xbfbfe768 --- db> ps pid ppid pgrp uid state wmesg wchan cmd 69 0 0 0 SL tq->tq_d 0xc257141c [zil_clean] 68 67 60 0 S+ tx->tx_s 0xc27599e0 zfs 67 60 60 0 S+ wait 0xc25de7f8 sh 66 60 60 0 R+ CPU 0 sh 60 19 60 0 S+ wait 0xc25ded48 sh 57 0 0 0 SL tq->tq_d 0xc25715a4 [zil_clean] 54 0 0 0 SL zio->io_ 0xc265ab94 [txg_thread_enter] 53 0 0 0 SL tx->tx_q 0xc27599e8 [txg_thread_enter] 52 0 0 0 RL [vdev:worker ad1] 51 0 0 0 SL tq->tq_d 0xc2571668 [spa_zio] 50 0 0 0 SL tq->tq_d 0xc257172c [spa_zio] 49 0 0 0 SL tq->tq_d 0xc25717f0 [spa_zio] 48 0 0 0 SL tq->tq_d 0xc25718b4 [spa_zio] 47 0 0 0 SL tq->tq_d 0xc2571978 [spa_zio] 46 0 0 0 SL tq->tq_d 0xc2571a3c [spa_zio] 45 0 0 0 SL tq->tq_d 0xc2571b00 [spa_zio] 44 0 0 0 SL tq->tq_d 0xc2571bc4 [spa_zio] 43 0 0 0 SL tq->tq_d 0xc2571bc4 [spa_zio] 42 0 0 0 SL tq->tq_d 0xc2571bc4 [spa_zio] 41 0 0 0 SL tq->tq_d 0xc2571bc4 [spa_zio] 40 0 0 0 SL tq->tq_d 0xc2571bc4 [spa_zio] 39 0 0 0 SL tq->tq_d 0xc2571bc4 [spa_zio] 38 0 0 0 SL tq->tq_d 0xc2571bc4 [spa_zio] 37 0 0 0 SL tq->tq_d 0xc2571bc4 [spa_zio] 36 0 0 0 SL tq->tq_d 0xc2571c88 [spa_zio] 35 0 0 0 SL tq->tq_d 0xc2571c88 [spa_zio] 34 0 0 0 SL tq->tq_d 0xc2571c88 [spa_zio] 33 0 0 0 SL tq->tq_d 0xc2571c88 [spa_zio] 32 0 0 0 SL tq->tq_d 0xc2571c88 [spa_zio] 31 0 0 0 SL tq->tq_d 0xc2571c88 [spa_zio] 30 0 0 0 SL tq->tq_d 0xc2571c88 [spa_zio] 29 0 0 0 SL tq->tq_d 0xc2571c88 [spa_zio] 28 0 0 0 SL tq->tq_d 0xc2571d4c [spa_zio] 27 0 0 0 SL tq->tq_d 0xc2571e10 [spa_zio] 26 0 0 0 SL tq->tq_d 0xc2571ed4 [spa_zio] 25 0 0 0 RL [l2arc_feed_thread] 24 0 0 0 RL [arc_reclaim_thread] 23 0 0 0 RL [vaclean] 22 0 0 0 SL tq->tq_d 0xc2572048 [system_taskq] 19 1 19 0 Ss+ wait 0xc2376550 sh 18 0 0 0 SL flowclea 0xc0daa4a4 [flowcleaner] 17 0 0 0 RL [softdepflush] 16 0 0 0 RL [vnlru] 15 0 0 0 RL [syncer] 14 0 0 0 RL [bufdaemon] 9 0 0 0 SL pgzero 
0xc0f23314  [pagezero]
   8     0     0    0  SL     psleep    0xc0f22f3c  [vmdaemon]
   7     0     0    0  RL                           [pagedaemon]
   6     0     0    0  SL     waiting_  0xc0f18d5c  [sctp_iterator]
   5     0     0    0  SL     ccb_scan  0xc0d76fd4  [xpt_thrd]
  13     0     0    0  SL     -         0xc0daa4a4  [yarrow]
   4     0     0    0  SL     -         0xc0da8264  [g_down]
   3     0     0    0  SL     -         0xc0da8260  [g_up]
   2     0     0    0  SL     -         0xc0da8258  [g_event]
  12     0     0    0  WL     (threaded)            intr
    100029  I  [swi0: uart]
    100028  I  [irq7: ppc0]
    100027  I  [irq12: psm0]
    100026  I  [irq1: atkbd0]
    100025  I  [irq11: ed0]
    100024  I  [irq15: ata1]
    100023  I  [irq14: ata0]
    100022  I  [irq9: acpi0]
    100021  I  [swi6: task queue]
    100020  I  [swi6: Giant taskq]
    100018  I  [swi5: +]
    100013  I  [swi2: cambio]
    100006  I  [swi3: vm]
    100005  I  [swi4: clock]
    100004  I  [swi1: netisr 0]
  11     0     0    0  RL                           [idle: cpu0]
   1     0     1    0  SLs    wait      0xc2154d48  [init]
  10     0     0    0  SL     audit_wo  0xc0f21f80  [audit]
   0     0     0    0  SLs    (threaded)            kernel
    100019  D  -      0xc2220d40  [thread taskq]
    100017  D  -      0xc2221100  [kqueue taskq]
    100016  D  -      0xc2221140  [acpi_task_2]
    100015  D  -      0xc2221140  [acpi_task_1]
    100014  D  -      0xc2221140  [acpi_task_0]
    100010  D  -      0xc2138940  [firmware taskq]
    100000  D  sched  0xc0da8320  [swapper]

--- panic ends here ---

From owner-freebsd-fs@FreeBSD.ORG Sat Jul 11 14:08:49 2009
Return-Path:
Delivered-To: freebsd-fs@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id B1D35106564A; Sat, 11 Jul 2009 14:08:49 +0000 (UTC) (envelope-from serenity@exscape.org)
Received: from ch-smtp01.sth.basefarm.net (ch-smtp01.sth.basefarm.net [80.76.149.212]) by mx1.freebsd.org (Postfix) with ESMTP id 2D2BC8FC15; Sat, 11 Jul 2009 14:08:49 +0000 (UTC) (envelope-from serenity@exscape.org)
Received: from c83-253-252-234.bredband.comhem.se ([83.253.252.234]:52658 helo=mx.exscape.org) by ch-smtp01.sth.basefarm.net with esmtp (Exim 4.69) (envelope-from ) id 1MPdFd-0000Yg-3F; Sat, 11 Jul 2009 16:08:31 +0200
Received: from [192.168.1.5] (macbookpro [192.168.1.5]) (using TLSv1 with cipher AES128-SHA (128/128 bits)) (No client certificate requested) by
mx.exscape.org (Postfix) with ESMTPSA id 0A8405ECFC; Sat, 11 Jul 2009 16:08:29 +0200 (CEST)
Message-Id:
From: Thomas Backman
To: Kip Macy
In-Reply-To: <3c1674c90907101227ueab78eem6f8c5c7fdf0337cc@mail.gmail.com>
Content-Type: text/plain; charset=US-ASCII; format=flowed; delsp=yes
Content-Transfer-Encoding: 7bit
Mime-Version: 1.0 (Apple Message framework v935.3)
Date: Sat, 11 Jul 2009 16:08:26 +0200
References: <72163521-40BF-4764-8B74-5446A88DFBF8@exscape.org> <45291598-D091-4E90-B968-22E59BEB3846@exscape.org> <3c1674c90907101227ueab78eem6f8c5c7fdf0337cc@mail.gmail.com>
X-Mailer: Apple Mail (2.935.3)
X-Originating-IP: 83.253.252.234
X-Scan-Result: No virus found in message 1MPdFd-0000Yg-3F.
X-Scan-Signature: ch-smtp01.sth.basefarm.net 1MPdFd-0000Yg-3F d1f60bc2a28a5c5ed1e432016c4e6079
Cc: freebsd-fs@freebsd.org, FreeBSD current
Subject: Re: Reproducible ZFS panic, w/ script (Was: "New" ZFS crash on FS (pool?) unmount/export)
X-BeenThere: freebsd-fs@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Filesystems
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 11 Jul 2009 14:08:50 -0000

On Jul 10, 2009, at 21:27, Kip Macy wrote:
> "zfs export" does a forced unmount. We may not be properly handling
> dangling references.
>
> -Kip

A bit more digging:

[root@chaos ~]# bash zfs_crash.sh initial
[root@chaos ~]# bash zfs_crash.sh stress   ## with the unmount part (line 107) **commented out**

I then let the above run for, say, 20 seconds to create a bunch of snapshots (ignoring errors; in my own script I added a random number to the snapshot name to avoid collisions), and then:

[root@chaos ~]# zpool export crashtestmaster
[root@chaos ~]# zfs list
NAME                         USED  AVAIL  REFER  MOUNTPOINT
crashtestslave              20.3M  40.7M    20K  /crashtestslave/crashtestslave
crashtestslave/test_cloned  19.8M  40.7M  19.8M  /crashtestslave/crashtestslave/test_cloned
crashtestslave/test_orig        0  40.7M  19.8M  /crashtestslave/crashtestslave/test_orig
tank                        5.67G  59.3G    18K  none
tank/root                    616M  59.3G   224M  /
tank/...
[root@chaos ~]# zfs unmount crashtestslave/test_orig
[root@chaos ~]# zfs unmount crashtestslave/test_cloned
[root@chaos ~]# zfs unmount crashtestslave
... panic here.

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0xc
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff803a5682
stack pointer           = 0x28:0xffffff803ea09980
frame pointer           = 0x28:0xffffff803ea099b0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 5099 (zfs)

0xffffff002ac4a938: tag zfs, type VDIR
    usecount 1, writecount 0, refcount 1 mountedhere 0xffffff00068be8d0
    flags ()
    lock type zfs: EXCL by thread 0xffffff0006f13390 (pid 5099)

BT:
...
#9  0xffffffff805edc42 in trap (frame=0xffffff803ea098d0) at /usr/src/sys/amd64/amd64/trap.c:345
#10 0xffffffff805d36a7 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:223
#11 0xffffffff803a5682 in propagate_priority (td=0xffffff0027174ab0) at /usr/src/sys/kern/subr_turnstile.c:194
#12 0xffffffff803a64ec in turnstile_wait (ts=Variable "ts" is not available.
) at /usr/src/sys/kern/subr_turnstile.c:738
#13 0xffffffff80355101 in _mtx_lock_sleep (m=0xffffff002ca6d9f8, tid=18446742974314394512, opts=Variable "opts" is not available.
) at /usr/src/sys/kern/kern_mutex.c:447
#14 0xffffffff803f7893 in vfs_msync (mp=0xffffff00068be8d0, flags=1) at /usr/src/sys/kern/vfs_subr.c:3179
#15 0xffffffff803f0c7e in dounmount (mp=0xffffff00068be8d0, flags=0, td=Variable "td" is not available.
) at /usr/src/sys/kern/vfs_mount.c:1263
#16 0xffffffff803f1568 in unmount (td=0xffffff0006f13390, uap=0xffffff803ea09c00) at /usr/src/sys/kern/vfs_mount.c:1174
#17 0xffffffff805ed4cf in syscall (frame=0xffffff803ea09c90) at /usr/src/sys/amd64/amd64/trap.c:984
#18 0xffffffff805d3930 in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:364
#19 0x0000000800f4b9ac in ?? ()
Previous frame inner to this frame (corrupt stack?)

NOT the same backtrace as before (nothing after dounmount() matches the zpool export panic), and this time it came from zfs unmount, not zpool export. I tried it again and got yet another backtrace(!) - though both "end" (or begin, depending on your view) with propagate_priority(), turnstile_wait() and _mtx_lock_sleep(). Here's the second one, which happened while doing the same as above - initial, stress, and then manually zfs unmounting them. "zfs unmount crashtestslave" (the root fs) is what panics yet again:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0xc
fault code              = supervisor read data, page not present
instruction pointer     = 0x20:0xffffffff803aa722
stack pointer           = 0x28:0xffffff8000025a60
frame pointer           = 0x28:0xffffff8000025a90
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 12 (swi4: clock)
...
#8  0xffffffff805f1fcd in trap_fatal (frame=0xffffff80000259b0, eva=Variable "eva" is not available.
) at /usr/src/sys/amd64/amd64/trap.c:847
#9  0xffffffff805f2e22 in trap (frame=0xffffff80000259b0) at /usr/src/sys/amd64/amd64/trap.c:345
#10 0xffffffff805d87c7 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224
#11 0xffffffff803aa722 in propagate_priority (td=0xffffff00296ce390) at /usr/src/sys/kern/subr_turnstile.c:194
#12 0xffffffff803ab58c in turnstile_wait (ts=Variable "ts" is not available.
) at /usr/src/sys/kern/subr_turnstile.c:738
#13 0xffffffff8035a1c1 in _mtx_lock_sleep (m=0xffffffff808a1de0, tid=18446742974234830624, opts=Variable "opts" is not available.
) at /usr/src/sys/kern/kern_mutex.c:447
#14 0xffffffff8037ea92 in softclock (arg=Variable "arg" is not available.
) at /usr/src/sys/kern/kern_timeout.c:376
#15 0xffffffff803417b0 in intr_event_execute_handlers (p=Variable "p" is not available.
) at /usr/src/sys/kern/kern_intr.c:1165
#16 0xffffffff80342d1e in ithread_loop (arg=0xffffff000231e6a0) at /usr/src/sys/kern/kern_intr.c:1178
#17 0xffffffff8033ebb8 in fork_exit (callout=0xffffffff80342c90, arg=0xffffff000231e6a0, frame=0xffffff8000025c80) at /usr/src/sys/kern/kern_fork.c:842
#18 0xffffffff805d8c9e in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:561
#19 0x0000000000000000 in ?? ()
#20 0x0000000000000000 in ?? ()
#21 0x0000000000000001 in ?? ()
#22 0x0000000000000000 in ?? ()
#23 0x0000000000000000 in ?? ()
#24 0x0000000000000000 in ?? ()
#25 0x0000000000000000 in ?? ()

Note that the active process is *not* zfs this time.

Regards,
Thomas
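The "stress" phase of the reproduction described above (snapshot in a loop with a random suffix to avoid name collisions, then export the pool) can be sketched roughly as follows. This is a hypothetical reconstruction, not the zfs_crash.sh from the thread: only the pool name and the $RANDOM-suffix idea come from the messages above; the loop structure, iteration count, and the run()/DRYRUN guard are assumptions added here so the sketch can be read (or dry-run) safely.

```shell
#!/bin/sh
# Hypothetical sketch of the snapshot-stress loop; NOT the actual
# zfs_crash.sh. Pool name and the random-suffix trick are taken from
# the thread; everything else is assumed.
POOL=crashtestmaster
DRYRUN=${DRYRUN:-1}   # default: only print commands; set to 0 on a
                      # disposable test system to actually run them

run() {
    if [ "$DRYRUN" -eq 1 ]; then
        echo "$@"      # dry run: show the command instead of running it
    else
        "$@"
    fi
}

i=0
while [ "$i" -lt 5 ]; do
    # counter plus a $RANDOM suffix avoids snapshot-name collisions
    snap="${POOL}@stress_${i}_${RANDOM}"
    run zfs snapshot "$snap"
    i=$((i + 1))
done
run zpool export "$POOL"
```

The real script evidently does more than this (there is a slave pool with cloned datasets, and an unmount step at its line 107), none of which is reproduced here.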