From: Wesley Morgan
To: Dan Cojocar
Cc: freebsd-fs@freebsd.org
Date: Sun, 8 Feb 2009 12:26:07 -0500 (EST)
Subject: Re: zfs replace disk has failed

On Sun, 8 Feb 2009, Dan Cojocar wrote:

> On Sun, Feb 8, 2009 at 12:04 AM, Wesley Morgan wrote:
>> On Tue, 3 Feb 2009, Dan Cojocar wrote:
>>
>>> Hello all,
>>> In a mirror(ad1,ad2) configuration one of my disk(ad1) had failed,
>>> after replacing the failed disk with a new one using:
>>> zpool replace tank ad1
>>> I have noticed that the replace is taking too long and that the system
>>> is not responding, after restart the new disk was not recognized any
>>> more in bios :(, I have tested also in another box and the disk
>>> was not recognized there too.
>>> I have installed a new one on the same location (ad1 I think). Then
>>> the zpool status has reported something like this (this is from memory
>>> because I have made many changes back then, I don't remember exactly
>>> if the online disk was ad1 or ad2):
>>>
>>> zpool status
>>>   pool: tank
>>>  state: DEGRADED
>>>  scrub: none requested
>>> config:
>>>
>>>         NAME                        STATE     READ WRITE CKSUM
>>>         tank                        DEGRADED     0     0     0
>>>           mirror                    DEGRADED     0     0     0
>>>             replacing               UNAVAIL      0   387     0  insufficient replicas
>>>               10193841952954445329  REMOVED      0     0     0  was /dev/ad1/old
>>>               9318348042598806923   FAULTED      0     0     0  was /dev/ad1
>>>             ad2                     ONLINE       0     0     0
>>> At this stage I was thinking that if I will attach the new disk (ad1)
>>> to the mirror I will get sufficient replicas to detach
>>> 9318348042598806923 (this one was the disk that has failed the second
>>> time), so I did an attach, after the resilvering process has completed
>>> with success, I had:
>>> zpool status
>>>   pool: tank
>>>  state: DEGRADED
>>>  scrub: none requested
>>> config:
>>>
>>>         NAME                        STATE     READ WRITE CKSUM
>>>         tank                        DEGRADED     0     0     0
>>>           mirror                    DEGRADED     0     0     0
>>>             replacing               UNAVAIL      0   387     0  insufficient replicas
>>>               10193841952954445329  REMOVED      0     0     0  was /dev/ad1/old
>>>               9318348042598806923   FAULTED      0     0     0  was /dev/ad1
>>>             ad2                     ONLINE       0     0     0
>>>             ad1                     ONLINE       0     0     0
>>> And I'm not able to detach 9318348042598806923 :(, and another bad
>>> news is that if I try to access something under /tank the operation is
>>> hanging, eg: if I do a ls /tank is freezing and if I do in another
>>> console: zpool status which was working before ls, now it's freezing
>>> too.
>>> What should I do next?
>>> Thanks,
>>> Dan
>>
>> ZFS seems to fall over on itself if a disk replacement is interrupted and
>> the replacement drive goes missing.
>>
>> By attaching the disk, you now have a 3-way mirror.
>> The two possibilities for you would be to roll the array back to a
>> previous txg, which I'm not at all sure would work, or to create a fake
>> device the same size as the array devices and put a label on it that
>> emulates the missing device, and you can then cancel the replacement.
>> Once the replacement is cancelled, you should be able to remove the
>> nonexistent device. Note, that the labels are all checksummed with
>> sha256 so it's not a simple hex edit (unless you can calculate
>> checksums by hand also!).
>>
>> If you send me the first 512k of either ad1 or ad2 (off-list of course), I
>> can alter the labels to be the missing guids, and you can use md devices and
>> sparse files to fool zpool.
>>
>
> Hello Wesley,
> This was a production server so I had to restore the mirror from the backup.
> Can you explain a bit how can someone alter the labels of a disk in a pool?
> Thanks,
> Dan
>

As far as I know there is no tool available to interactively edit a label,
although since the source code that defines the labels and the data within
them is available, it should be possible to write one. Devices in the same
pool should all have nearly identical labels, differing only in the actual
guid for the device itself. In my situation, I simply altered the guid with
a hex editor, borrowed the zfs sha256 code to write the correct checksum to
the label, and then, using gvirstor (md probably would have worked as well),
was able to cancel the failed replacement.
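
For the sparse-file half of the workaround, here is a rough sketch in
Python. It only creates a stand-in file the same size as a pool device and
writes a saved label region (the "first 512k" mentioned above) at its
start. The guid edit and the sha256 checksum rewrite are not shown, and the
helper name, path, and sizes are hypothetical, not what I actually ran.

```python
import os

LABEL_REGION = 512 * 1024  # the "first 512k" of the device mentioned above


def make_stand_in(path, device_size, label_region=b""):
    """Create a sparse file of device_size bytes with label_region at offset 0.

    Hypothetical helper: label_region is assumed to be an already edited
    copy of the start of a real pool device (new guid, checksum fixed).
    """
    if len(label_region) > device_size:
        raise ValueError("label region is larger than the stand-in device")
    with open(path, "wb") as f:
        # Setting the length without writing data leaves a hole, so the
        # file takes almost no disk space on filesystems with sparse support.
        f.truncate(device_size)
        f.seek(0)
        f.write(label_region)
    return os.path.getsize(path)
```

On FreeBSD the resulting file could then presumably be attached as a memory
disk with something like "mdconfig -a -t vnode -f <file>" and handed to
zpool. That is an assumption about how the md route would look; as noted
above, I used gvirstor instead.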