Date: Mon, 30 Mar 2020 20:06:56 +0200
From: Lukasz <FreeBSD@chroot.pl>
To: freebsd-questions@freebsd.org
Subject: Re: replace disk in zpool - solved
Message-ID: <bfdac41d-ec4c-a965-5aa9-fd2da46c21ee@chroot.pl>
In-Reply-To: <20200325081814.GK35528@mithril.foucry.net>
References: <d329c84a-8777-1eca-787c-dad9e0eae752@chroot.pl>
 <18a94704-5411-3b44-a525-2ae50121a467@holgerdanske.com>
 <f6297dfe-e0c4-12ef-523c-1944a9c735ff@chroot.pl>
 <4a8d409e-ecac-77c8-3ad9-025aefdfb4ef@holgerdanske.com>
 <20200325081814.GK35528@mithril.foucry.net>

Hello,
this behavior was due to errors in zpool.

Regards,
Lukasz

On 3/25/20 09:18, Jacques Foucry via freebsd-questions wrote:
> On Tuesday 24 March 2020 at 16:47:10 (-0700), David Christensen wrote:
>> On 2020-03-24 14:15, Lukasz wrote:
>>> Ohh… I forgot to mention:
>>> it's 12.1-p3
>>>
>>> # zpool status -v mypool
>>>   pool: mypool
>>>  state: DEGRADED
>>> status: One or more devices has experienced an error resulting in data
>>>         corruption. Applications may be affected.
>>> action: Restore the file in question if possible. Otherwise restore the
>>>         entire pool from backup.
>>>    see: http://illumos.org/msg/ZFS-8000-8A
>>>   scan: resilvered 180G in 0 days 16:00:55 with 2 errors on Sun Mar 22
>>>         05:18:46 2020
>>> config:
>>>
>>>         NAME                              STATE     READ WRITE CKSUM
>>>         mypool                            DEGRADED     0     0     2
>>>           raidz1-0                        DEGRADED     0     0     4
>>>             diskid/DISK-WD-WMC1F0521131   ONLINE       0     0     0
>>>             replacing-1                   DEGRADED     0     0     0
>>>               15838717335844820448        UNAVAIL      0     0     0  was /dev/diskid/DISK-WD-WCC130964640
>>>               diskid/DISK-K4JG5D2B        ONLINE       0     0     0
>>>             ada6                          ONLINE       0     0     0
>>>             ada1                          ONLINE       0     0     0
>>>             diskid/DISK-WD-WCC130650055   ONLINE       0     0     0
>>>
>>> errors: Permanent errors have been detected in the following files:
>>>         mypool/XXXXXXXXXXXX
>>>
>>> Yes, I did exactly as you wrote - removed the failed drive, installed a
>>> replacement drive, and issued a 'zpool replace' command.
>>> I also tried this:
>>> I disabled running services in that pool, unmounted and mounted it again.
>>> I even exported/imported that pool.
>>> It has no readonly property.
>>> Of course I have a backup.
>>
>>
>> My guess is that resilvering is stuck because ZFS has encountered data
>> corruption. This could be caused by drive(s), cable(s), and/or data port(s)
>> (motherboard or expansion card).
>>
>>
>> What was the failure mode of the bad drive? Did you test it in any other
>> machines?
>>
>>
>> Are there any items of concern in the SMART reports for the current set of
>> drives? Please post anything that looks questionable.
>>
>>
>> Unplug and plug all of your drive power and data cables. Make sure they
>> seat well. If unsure about a data cable, replace it with a new, locking
>> cable. I have experienced too many problems with red SATA cables. Few, if
>> any, are marked with their rated speed (I did mark some StarTech SATA III
>> cables). So, I stocked up on various lengths and configurations of Cable
>> Matters SATA III cables. They are black, marked "6G", and have locking
>> connectors. Now, whenever I am in a system case, I replace most every red
>> SATA cable just to be safe.
>>
>>
>> It appears that you have Western Digital hard drives. Download Data
>> Lifeguard Diagnostic (DLG) for DOS, burn it to a USB flash drive, boot it,
>> and test all of your drives. Please post the results:
>>
>> https://support.wdc.com/downloads.aspx?p=2
>
> If you permit some advice, ALWAYS (when it's possible) buy and use disks from
> different brands (mix Seagate, WD, etc.) in order to avoid the same series and
> the same MTBF.
>
> I know this is too late in this case, but keep this in mind.
>
> I know this will not help in this case, please excuse my intervention if it's
> inappropriate.
>