From: Artem Belevich
To: Benjamin Lutz
Cc: freebsd-fs, Andre Seidelt, Dirk Hoefle
Date: Fri, 8 Nov 2013 11:07:27 -0800
Subject: Re: Ghost ZFS pool prevents mounting root fs

On Fri, Nov 8, 2013 at 2:53 AM, Benjamin Lutz wrote:
> Hello,
>
> I have a server here that after trying to reboot during the 9.2 update
> process refuses to mount the root file system, which is a ZFS (tank).
>
> The error message given is:
> Trying to mount root from zfs:tank []...
> Mounting from zfs:tank failed with error 5.
>
> Adding a bit more verbosity by setting vfs.zfs.debug=1 gives one
> additional crucial bit of information that probably explains why, it tries
> to find the disk /dev/label/disk7, but no such disk exists.

I ran into the same issue recently.
http://lists.freebsd.org/pipermail/freebsd-fs/2013-November/018496.html

> Can you tell me how to resolve the situation, i.e. how to make the ghost
> pool go away? I'd rather not recreate the pool or move the data to another
> system, since it's around 16TB and would take forever.

It should be doable, but the usual "YMMV", "proceed at your own risk" and
"here there be dragons" warnings apply.

[snip]
> root@:~ # zdb -l /dev/da1
> --------------------------------------------
> LABEL 0
> --------------------------------------------
> failed to unpack label 0
> --------------------------------------------
> LABEL 1
> --------------------------------------------
> failed to unpack label 1
> --------------------------------------------
> LABEL 2
> --------------------------------------------
>     version: 28
>     name: 'tank'
>     state: 2
>     txg: 61
>     pool_guid: 4570073208211798611
>     hostid: 1638041647
>     hostname: 'blackhole'
>     top_guid: 5554077360160676751
>     guid: 11488943812765429059
>     vdev_children: 1
>     vdev_tree:
>         type: 'raidz'
>         id: 0
>         guid: 5554077360160676751
>         nparity: 3
>         metaslab_array: 30
>         metaslab_shift: 37
>         ashift: 12
>         asize: 16003153002496
>         is_log: 0
>         create_txg: 4
>         children[0]:
>             type: 'disk'
>             id: 0
>             guid: 7103686668495146668
>             path: '/dev/label/disk0'
>             phys_path: '/dev/label/disk0'
>             whole_disk: 1
>             create_txg: 4

The ghost labels are at the end of /dev/da1 (and probably of all the other
drives that used to be part of that pool). In my case I ended up manually
zeroing out the first sector of the offending labels.

ZFS places two copies of the label at 512K and 256K from the end of the
pool slice. See the ZFS on-disk specification here:
http://maczfs.googlecode.com/files/ZFSOnDiskFormat.pdf

It's fairly easy to find with:

# dd if=/dev/da1 bs=1m iseek={disk size in mb -1} count=1 | hexdump -C | grep version

Once you know where exactly it is, deleting it is simple. Watch out for dd
typos or, perhaps, use some sort of disk editor to make sure you're not
overwriting the wrong data.

It's a fairly risky operation, as you have to make sure you don't nuke
anything else by accident. If the disk portion with the labels is currently
unallocated, then things are relatively safe. If it's currently used, then
you'll need to figure out whether it's safe to overwrite those labels
directly or find another way to do it. I.e.
if the area with the labels is currently used for some other filesystem, you
may be able to get rid of the label by filling up that filesystem with data,
which would hopefully overwrite the labels with something else. If the
labels are within the area that is part of the current pool, you are
probably safe, as that area is either unused or has not been used by ZFS
yet. In my case the ghost labels were in the neighbourhood of the labels of
the current pool, and nuking them produced zero errors on scrub.

Once you've established that manual label nuking is what you want, here's
the recipe:

* Make sure risking your data is *really* worth it. Consider erasing the
  drives one by one and letting raidz repair the pool if you have any doubts.

Now that that's out of the way, let's nuke them.

* Offline one of the drives with the ghost labels, or do the operation on an
  unmounted pool (I booted from an mfsBSD CD).

Make sure that it is the right sector you're writing to (i.e. it's the label
with the wrong disks):

* dd if=/dev/daN bs=512 iseek={sector} count=10 | hexdump -C

Nuke the ghost! Note: you only want to write *one* sector. Make sure you
don't forget to edit count if you use shell history and reuse the command
above.

* dd if=/dev/zero of=/dev/daN bs=512 oseek={sector that has 'version' word in the label} count=1

* Make sure "zdb -l /dev/daN" no longer shows the ghost label.
* Online the disk.
* Scrub the pool. In case you made a mistake and wrote to the wrong place,
  that may save your pool. I did the scrub only after I had erased the label
  on the first drive, to make sure it didn't damage anything vital.
* Repeat for all other disks with ghost labels.
* Run the scrub after all ghost labels have been erased. Just in case.

Good luck.

--Artem
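
P.S. A rough sketch of the arithmetic behind the sector numbers for the dd
commands above, in case it helps narrow down where to look. It only prints
numbers and never touches a disk. The function name and the constants are
mine, and it rests on assumptions that are not checked here: that the old
pool used the whole device, that ZFS rounds the usable size down to a
256 KiB multiple before placing the two trailing labels, and that the
name/value area (where 'version' shows up) starts 16 KiB into each label,
as the on-disk specification describes. Treat the output as a starting
point for hexdump, not as the answer.

```shell
#!/bin/sh
# Sketch only: prints candidate sector numbers, performs no disk I/O.
# Assumptions (not verified by this script): whole-device pool, usable
# size rounded down to a 256 KiB multiple, and the name/value pairs
# starting 16 KiB into each label.

LABEL_SIZE=262144   # one vdev label is 256 KiB
NVLIST_SKIP=16384   # 8 KiB blank space + 8 KiB boot block header
SECTOR=512          # matches bs=512 in the dd commands

# trailing_label_sectors SIZE_IN_BYTES
# For labels 2 and 3 prints:
#   label <n> <first sector of label> <sector where name/value pairs start>
trailing_label_sectors() {
    size=$1
    # Round the device size down to a whole number of labels.
    aligned=$(( size / LABEL_SIZE * LABEL_SIZE ))
    for n in 2 3; do
        # Label 2 sits 512K from the aligned end, label 3 sits 256K from it.
        off=$(( aligned - (4 - n) * LABEL_SIZE ))
        echo "label $n $(( off / SECTOR )) $(( (off + NVLIST_SKIP) / SECTOR ))"
    done
}

# Example with a hypothetical 1 GiB device.
trailing_label_sectors 1073741824
```

Feed it the device size in bytes (on FreeBSD, diskinfo should report the
media size) and then read a few sectors around each candidate with the
read-only dd | hexdump command from the recipe before writing anything.
If the ghost pool lived on a slice or partition of a different size, the
labels will be somewhere else and only the hexdump search will find them.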