From: Colin Moller
Date: Mon, 18 Aug 2008 04:26:39 -0700
To: current@freebsd.org
Cc: swank@storefront.com, colin@storefront.com, elo@storefront.com
Subject: zpool import hanging on unexpectedly-rebooted machine
Message-ID: <48A95C6F.2010002@lefty.tv>

Hey all,

I've got an interestingly frustrating problem on my hands with our 7.0-STABLE boxes running ZFS. The box in question is a Sun X4500 running amd64 with 16GB of RAM and 46x1TB disks in RAIDZ1 (the other two disks are for the OS).
Uname for the box is:

    FreeBSD sf-nas1-c160a.storefront.com 7.0-STABLE FreeBSD 7.0-STABLE #1:
    Sat May 31 14:54:22 PDT 2008
    root@sf-nas1-c160a.storefront.com:/usr/obj/usr/src/sys/X4500 amd64

The box has been running relatively reliably for some months now, but our hosting provider decided to reboot it on us without asking. After the box came back, it had lost /boot/zfs/zpool.cache, so I needed to reimport the only zpool on the machine (named zfsdata).

Running zpool import gives me the output I'm expecting: a single zpool called zfsdata with a status of ONLINE, and all the disks showing up. However, when I run zpool import -f on that pool, the zpool command simply hangs, with no disk and no CPU activity.

I've run truss on the zpool import, and the last thing I see happening is:

    open("/dev/ad96",O_RDONLY,030115000) = 6 (0x6)
    ioctl(6,DIOCGIDENT,0xffff9480) = 0 (0x0)
    close(6) = 0 (0x0)

After turning on vfs.zfs.debug, I also see this on the console:

    zfs_ereport_post:293[1]: time=1219057172.795893475 ereport_version=0
    class=fs.zfs.checksum zfs_scheme_version=0 pool=zfsdata
    pool_guid=316648131406719055 pool_context=2
    vdev_guid=7326417523786577584 vdev_type=disk vdev_path=/dev/ad12
    vdev_devid=ad:GTF000PAHX5TMF parent_guid=6708978418893991394
    parent_type=raidz zio_err=0 zio_offset=89290496000 zio_size=512
    zio_object=132 zio_level=0 zio_blkid=244

I also get a boatload of messages like these for each disk in the zpool:

    GEOM: ad46: GPT rejected -- may not be recoverable.
    GEOM: ad48: corrupt or invalid GPT detected.

This has been happening since we installed the machine, though, so I'm not sure it's related. The other Thumper we have, which is configured identically, also complains about the GPT. I'm assuming these messages appear because ZFS uses its own on-disk format that GEOM doesn't understand yet.

Help! I've got several terabytes of data sitting there that I'd really like to get my hands on. How can I get past this deadlock?

I'm willing to rebuild kernel and world on the box to get (potentially) newer ZFS code, but going to -CURRENT is really not an option at the moment.

Thanks!

Colin

--
Colin Moller
colin@lefty.tv