Date:      Tue, 14 Jan 2014 17:31:26 +0100
From:      Ulysse 31 <ulysse31@gmail.com>
To:        freebsd-fs@freebsd.org
Subject:   zpool import taking weeks ...
Message-ID:  <CAFSDvD2hLNN_CupeVKu02bLLUS1EFuTdZZw5Z9xoiNFZ0p4_fg@mail.gmail.com>

Hi all,

I hope someone has some advice or tips to help solve my current problem:
A few weeks ago, our backup server (FreeBSD 9.1 with ZFS v28 on a
20 TB raidz, 15 TB used, 64 GB RAM) had a serious hang: a periodic
sync (zfs send/recv) between a remote server and the backup server
went wrong, which led to a "zfs recv" running on the server and a
"zfs rollback" on the same dataset. This partially hung the machine,
so it was rebooted.
At boot time, the import took a long time, and "out of swap"
messages appeared on the screen after 48 hours.
We decided to reboot again into a live CD (zfsguru), to which we
added a USB drive as swap storage, and then launched the import of
the pool with:

zpool import -N -F <poolname>

During the first 48 hours, the import took nearly all the memory plus
779 MB of swap. Since it is a zfsguru CD, I only have two terminals
available (ALT+F1 and ALT+F2). The import was running on the second,
and on the first I could monitor the memory/CPU usage.
On the first terminal, for some reason, when I launched top, it quit
right after the first screen refresh, so at first I was checking the
machine with "top | head -n 24".
After some days, I entered the following command on the first
terminal: "while true; do top | head -n 24; sleep 5; done". It
worked ... for 2 minutes, and then the terminal hung ...
I can still check that the import is running by pressing "CTRL+T" on
the second terminal, where zpool import is running, but the output is
not really helpful beyond showing that it is still running.
I get something like:
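
One way to keep such a polling loop from depending on the console is to
append each sample to a file rather than printing it. The following is
only a minimal sketch under assumptions: the function name "monitor",
the log path, and the interval/count values are hypothetical and not
part of the setup described here.

```shell
# monitor LOGFILE INTERVAL COUNT CMD [ARGS...]
# Runs CMD COUNT times, INTERVAL seconds apart, appending a timestamp and
# the first 24 lines of its output to LOGFILE (hypothetical helper).
monitor() {
    logfile=$1
    interval=$2
    count=$3
    shift 3
    i=0
    while [ "$i" -lt "$count" ]; do
        # Timestamp each sample so gaps in the log are visible later.
        date '+%Y-%m-%d %H:%M:%S' >> "$logfile"
        # Same truncation as "top | head -n 24", but written to the file.
        "$@" 2>&1 | head -n 24 >> "$logfile"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

For example, "monitor /tmp/import-top.log 5 720 top" would take one
sample every 5 seconds for about an hour. The bounded count is a
deliberate choice so that the loop ends on its own instead of running
unattended forever.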

load: 0.00 cmd: zpool 20299 [tx->tx_sync_done_cv)] 612062,13r 0.04u 0.29s 0% 48k

Sometimes the "load: 0.00" goes up to around 0.20, then comes back to
0.00, where it stays most of the time.
It has now been running for more than a week. From what I have read,
the only thing I can do is wait ... If someone has tips or ideas, I
would be really happy ^^'
The storage uses one zpool with multiple datasets with dedup enabled
(I know it EATS RAM :s )
In the live CD's dmesg, at the beginning of the import, I could read
something like: "Warning: can't open objset " followed by the name of
the dataset that crashed.
I don't mind losing this particular dataset, but some of the others
are ... well, important.

This is the first time I have used the zfsguru live CD. At boot I set
the root password in order to log in via ssh if needed; after losing
one of the terminals, I realized that zfsguru is configured to allow
only the "ssh" user to log in over ssh, not root (stupid me ...).
Thanks to all for your kind help.

Cheers.


-- 
Ulysse31


