Date:      Sat, 22 Aug 2020 04:06:03 -0700
From:      David Christensen <dpchrist@holgerdanske.com>
To:        freebsd-questions@freebsd.org
Subject:   Re: adding disk to zfs mirror after removal of disk
Message-ID:  <9274f688-2897-8d0b-d799-100316684b06@holgerdanske.com>
In-Reply-To: <20200822050431.GA17289@bastion.zyxst.net>
References:  <20200821230206.GA56267@bastion.zyxst.net> <ab5857f4-cdc7-b589-1b39-5d2fb550ad5b@holgerdanske.com> <20200822050431.GA17289@bastion.zyxst.net>

On 2020-08-21 16:02, tech-lists wrote:
 > I seem to have broken the zpool on a mirror setup. One of the disks showed
 > errors, so detached it and ran a dd rw over the entire disk to make the
 > hardware remap bad blocks. I then attached it to the pool and it
 > resilvered.
 > Problem is now, the pool doesn't show its pooltype ie mirror-0, there's
 > nothing where mirror-0 was, just the two disks.
 >
 > How to fix?


On 2020-08-21 22:04, tech-lists wrote:
> Hi, thanks for looking at this

YW.  :-)


> # zpool list wd
> NAME   SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
> wd    7.25T   490G  6.77T        -         -     0%     6%  1.00x  ONLINE  -
> 
> # zpool status wd
>    pool: wd
>   state: ONLINE
>    scan: scrub repaired 0 in 0 days 01:00:27 with 0 errors on Fri Aug 21 11:38:48 2020
> config:
> 
>          NAME                           STATE     READ WRITE CKSUM
>          wd                             ONLINE       0     0     0
>            diskid/DISK-WD-WCC131097443  ONLINE       0     0     0
>            ada3                         ONLINE       0     0     0
> 
> errors: No known data errors
> 
> The disks are 2 x Western Digital Black 4TB. The disk that was removed from
> the mirror is ada3. The diskid/ string is aka ada2. Ideally I'd like its
> name changed back from that string to ada2 as well.


Thanks for the information.  Trouble-shooting based upon facts greatly 
improves the chances of success.  :-)


I should have said this before -- first, back up all of your data.


I assume you do not have the console sessions from when you created the
pool and from when you made changes to the pool and/or drive.
Thankfully, I think we can fix your pool without them.


I use a version control system (CVS) and create a project for every 
computer.  Inside each project, I maintain a plain text "administrative 
log file" where I write notes to myself of what I'm doing, where, when, 
and why.  I copy and paste important console sessions into the log 
files.  I also check-in any system configuration files I create or 
modify.  All of this effort provides invaluable information for 
historical research and future use.  If you're not comfortable with a 
version control system, putting stuff on a USB drive (and backing it up 
frequently) is better than nothing.


Please run this command (and post the console session):

     # freebsd-version ; uname -a


As for fixing your pool, research all of the following commands and make 
sure you understand everything before you start.  Save (and post) your 
console sessions.  You do have backups, right?


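Your status output lists both disks directly under the pool, which means
ada3 is now a second top-level virtual device (vdev) striped with the
first, not half of a mirror.  The first step is to remove the top-level
vdev ada3 (see [1]):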

     # zpool remove wd ada3


Watch the pool status for removal progress:

     # zpool status wd
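
If you would rather not re-run the status command by hand, here is a
minimal sketch of a check you could loop on.  The helper name
device_in_pool is my own invention, not a ZFS command, and it only
inspects the first column of whatever "zpool status" text it is given.
I have not run it against a pool in the middle of a removal, so treat
it as a starting point:

```shell
# Hypothetical helper (not part of ZFS): reads `zpool status` text on
# stdin and exits 0 if a row whose first column equals DEVICE is present,
# 1 otherwise.
device_in_pool() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }'
}

# Intended use, as root (assumption: ada3 disappears from the config
# listing once the removal has finished):
#   while zpool status wd | device_in_pool ada3; do sleep 60; done
```

The loop simply sleeps until ada3 no longer shows up as a row in the
listing.  Look at the full status output yourself before moving on.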


Once ada3 has been removed from the pool, I would check it thoroughly 
with WD Data Lifeguard Diagnostics (see [2]).  Unfortunately, the 
current version only runs on Windows.  Run all available tests.  Then 
zero the entire drive.  Then run all the tests again.  If everything 
passes, you're good.  If anything fails, look for a "fix" option, run 
that, and do the tests, zero, and tests again.  Only proceed with this 
drive if and when all of the tests, zero, tests run without error.  If 
not, get a new drive.


If you do not have a Windows computer, you need one.


There is an older bootable DOS version of WD DLD that may or may not 
work for you (see [3]).


If you can't do WD DLD, fake it with dd(1) and smartctl(8).
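
A rough sketch of what that fallback might look like follows.  It
assumes the drive is still ada3 and is no longer part of any pool, and
that smartctl(8) is installed (it comes from smartmontools, not the
base system).  The helper name wipe_plan is hypothetical.  It only
prints the commands so you can review them first, because the dd(1)
line destroys everything on the target drive:

```shell
# Hypothetical helper: print (do NOT run) a test / zero / test sequence
# for one disk, given its device name without the /dev/ prefix.
wipe_plan() {
    disk="/dev/$1"
    cat <<EOF
smartctl -t long $disk
smartctl -l selftest $disk
dd if=/dev/zero of=$disk bs=1m
smartctl -t long $disk
smartctl -a $disk
EOF
}

# Print the plan for ada3; check the device name before running anything.
wipe_plan ada3
```

The long self-test runs inside the drive and "smartctl -t long" returns
immediately, so wait for the test to finish before reading the
self-test log or starting dd(1).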


The final step is to attach the wiped drive to the existing device,
which turns that single-disk vdev back into a mirror (see [4]).  I
don't know how you obtained the "diskid/..." name, but do the same on
the wiped drive.  Substitute this new name for $WIPED_DRIVE in the
following command:

     # zpool attach wd diskid/DISK-WD-WCC131097443 $WIPED_DRIVE


Watch the pool status for re-silvering progress:

     # zpool status wd
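
Since the original complaint was that "mirror-0" had vanished from the
listing, a quick way to confirm it is back is to look for a mirror row
in the status output.  The sketch below is only an illustration; the
helper name has_mirror is made up, and it assumes mirror vdevs are
listed as "mirror-N" in the first column:

```shell
# Hypothetical helper: reads `zpool status` text on stdin and exits 0 if
# any row's first column looks like "mirror-<number>", 1 otherwise.
has_mirror() {
    awk '$1 ~ /^mirror-[0-9]+$/ { found = 1 } END { exit !found }'
}

# Intended use, as root:
#   zpool status wd | has_mirror && echo "wd has a mirror vdev"
```

Fed the status output you posted, this check fails, which matches the
problem you described.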


David



References:


[1] https://docs.oracle.com/cd/E37838_01/html/E61017/remove-devices.html

[2] https://support.wdc.com/downloads.aspx?DL

[3] https://support.wdc.com/downloads.aspx?p=2

[4] https://docs.oracle.com/cd/E19253-01/819-5461/gcfhe/index.html


p.s. -- Take a look at this guide for ideas on ZFS naming.  The concepts 
of unique pool names, unique top-level dataset names, and datasets named 
per backup policies are very useful.  I have implemented the first, but 
have conflated the latter two (I am undecided if this was a good idea). 
Use it to formulate your game plan:

     https://b3n.org/zfs-hierarchy/




