Date:      Sun, 3 May 2020 16:05:25 +0200
From:      "Ireneusz Pluta/wp.pl" <ipluta@wp.pl>
To:        Trond Endrestøl <trond.endrestol@ximalas.info>
Cc:        freebsd-questions@freebsd.org
Subject:   Re: How to get rid of an unavailable pool?
Message-ID:  <092e9379-37cf-f839-e4e4-eeb1e8821f1a@wp.pl>
In-Reply-To: <alpine.BSF.2.22.395.2005020959190.91211@enterprise.ximalas.info>
References:  <32264a1f-3bcf-9d74-603d-c201bffd256c@wp.pl> <alpine.BSF.2.22.395.2005020959190.91211@enterprise.ximalas.info>

On 2020-05-02 at 10:03, Trond Endrestøl wrote:
> On Sat, 2 May 2020 06:15+0200, Ireneusz Pluta wrote:
>
>> Hi group,
>>
>> (Sorry if this post appears twice. The first one, initially sent from another
>> email account, does not seem to appear.)
>>
>> I have (or rather had) a pool like this:
>>
>> $ sudo zpool status -v t
>>    pool: t
>>   state: UNAVAIL
>> status: One or more devices are faulted in response to IO failures.
>> action: Make sure the affected devices are connected, then run 'zpool clear'.
>>     see: http://illumos.org/msg/ZFS-8000-HC
>>    scan: none requested
>> config:
>>
>>          NAME                     STATE     READ WRITE CKSUM
>>          t                        UNAVAIL      0     0     0
>>            mirror-0               UNAVAIL      0     0     0
>>              4304281762335857859  REMOVED      0     0     0  was /dev/da5
>>              1909766900844089131  REMOVED      0     0     0  was /dev/da10
>>
>> errors: Permanent errors have been detected in the following files:
>>
>>          <metadata>:<0x0>
>>          <metadata>:<0x1b>
>>          t:<0x0>
>>
>> That was a temporary test pool. I forgot to destroy or at least export the
>> pool before pulling the da5 and da10 drives out of the server's drive bay.
>> Now it can't be exported or destroyed; the respective zpool operations
>> just hang. How can I get rid of this pool, preferably without a reboot? The
>> da5 and da10 drives can no longer be put back, as they have already been
>> moved elsewhere and are now part of another pool.
>>
>> I guess the pool got stuck while /etc/periodic/security/100.chksetuid was
>> running, when the find operation in it tried to traverse into the pool's
>> mountpoint.
>>
>> The system is FreeBSD 11.2.
>>
>> Thanks
>>
>> Irek
> The pool might still be listed in /boot/zfs/zpool.cache. The only way
> I can think of to get rid of the old pool is to delete this file and
> reboot. If you have more pools than your root pool, you should reboot
> to single-user mode, mount the root fs read-write, import the
> remaining pools, and either exit the single-user shell or reboot.

Trond,

thank you for your advice.

Yes, that state was unrecoverable without a reboot. I also found this short thread:
https://www.databaseusers.com/article/5971869/Cannot+export+%27backup%27%3A+pool+I+O+is+currently+suspended
Its last post helped me a lot in understanding what was going on under the hood, and why.

So I followed the procedure carefully, taking special care to first stop important applications
and unmount other big and valuable datasets. A forced hard reset was necessary because the reboot
command just froze. I made one exception to the procedure: I did not delete /boot/zfs/zpool.cache,
to avoid dropping into single-user mode and importing my pools manually (I felt very uncomfortable
about doing that remotely, with that crappy IPMIView console redirection). The system booted
cleanly with all pools imported. The UNAVAIL pool was imported too, but it was not mounted, so
there was no chance of any I/O being attempted on it. The first thing I did after logging in was
`zpool destroy t`, which worked cleanly.
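
As a rough sketch of the check-then-destroy step described above (not the exact commands run on
this system), the helper below picks out pools reported as UNAVAIL from `zpool status` output. The
function name `unavail_pools` is hypothetical, and the sketch assumes `zpool status` still responds
after the reboot, as `zpool status -v t` did earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of ZFS): read `zpool status` text on stdin
# and print the name of every pool whose state line says UNAVAIL.
unavail_pools() {
    awk '
        $1 == "pool:"  { name = $2 }                      # remember current pool
        $1 == "state:" && $2 == "UNAVAIL" { print name }  # report it if UNAVAIL
    '
}

# Intended use, as root, right after the reboot:
#   zpool status | unavail_pools
# and then, for each name printed, after checking it really is the stale pool:
#   zpool destroy <name>
```

For the status output quoted at the top of this thread, the helper prints `t`. The destroy step is
deliberately left as a manual command, since destroying the wrong pool is not recoverable.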

Before doing all that, I reproduced the state and rehearsed the procedure on a virtual machine.

Thanks again

Irek



