Date:      Mon, 27 Dec 2010 00:04:17 -0500
From:      jhell <jhell@DataIX.net>
To:        Jean-Yves Avenard <jyavenard@gmail.com>
Cc:        freebsd-stable@freebsd.org
Subject:   Re: New ZFSv28 patchset for 8-STABLE: Kernel Panic
Message-ID:  <4D181E51.30401@DataIX.net>
In-Reply-To: <AANLkTimNv1%2BfL2KUrxLoTO2oQ=ziOO6raWT5TMZJkH4f@mail.gmail.com>
References:  <AANLkTinxvU_QuAd6SG1hig7-YeC8tCdwAmwgL1AXfHNN@mail.gmail.com> <AANLkTimNv1%2BfL2KUrxLoTO2oQ=ziOO6raWT5TMZJkH4f@mail.gmail.com>


On 12/26/2010 23:17, Jean-Yves Avenard wrote:
> Responding to myself again :P
> 
> On 27 December 2010 13:28, Jean-Yves Avenard <jyavenard@gmail.com> wrote:
>> tried to force a zpool import
>>
>> got a kernel panic:
>> panic: solaris assert: weight >= space && weight <= 2 * space, file:
>> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c,
>> line: 793
>>
>> cpuid = 5
>> KDB: stack backtrace:
>> #0: 0xffffffffff805f64be at kdb_backtrace
>> #1: panic+0x187
>> #2: metaslab_weight+0xe1
>> #3: metaslab_sync_done+0x21e
>> #4: vdev_sync_done
>> #5: spa_sync+0x6a2
>> #6: txg_sync_thread+0x147
>> #7: fork_exit+0x118
>> #8: fork_trampoline+0xe
>>
>> uptime 2m25s..
>>
> 
> Command used to import in FreeBSD was:
> zpool import -fF -R / pool
> which told me that zil was missing, and to use -m
> 
> I booted openindiana (which is the only distribution I could find with
> a live CD supporting zpool v28)
> 
> Doing a zpool import actually made it show that the pool had
> successfully been repaired by the command above.
> It did think that the pool was in use (and it was, as I didn't do a
> zpool export).
> 
> So I ran zpool import -f pool in openindiana, and luckily, all my
> files were there. Not sure if anything was lost...
> 
> in openindiana, I then ran zpool export and rebooted into FreeBSD.
> 
> I ran zpool import there, and got the same behaviour as originally:
> the zpool import hangs and I can't interrupt it with anything. I'm
> left with no option but rebooting.
> 
> Back into openindiana, tried to remove the log drive, but no luck.
> Always end up with the message:
> cannot remove log: no such device in pool
> 
> Googling that error, it seems to be a common issue when trying to
> remove a ZIL; usually, although that message is displayed, the log
> drive is actually removed.
> Not in my case..
> 
> So I tried something brave:
> In Open Indiana
> zpool export pool
> 
> rebooted the PC, disconnected the SSD drive I had used, and rebooted
> into openindiana
> ran zpool import -fF -R / pool (complained that log device was
> missing) and again zpool import -fF -m -R / pool
> 
> zpool status showed the log device as unavailable this time.
> 
> ran zpool remove pool log hex_number_showing_in_place
> 
> It showed the error "cannot remove log: no such device in pool",
> but zpool status showed that everything was all right.
> 
> zpool export pool , then reboot into FreeBSD
> 
> zpool import this time didn't hang and successfully imported my pool.
> All data seems to be there.
> 
> 
> Summary: v28 is still buggy when it comes to removing the log
> device... And once something is screwed up, the zpool utility becomes
> useless, as it just hangs.
> 
> So better have an OpenIndiana live CD handy to repair things :(
> 
> But I won't be trying to remove the log device again for a long time!
> At least the data can be recovered when it happens..
> 
> Could it be that this is related to the v28 patch I used
> (http://people.freebsd.org/~mm/patches/zfs/v28/stable-8-zfsv28-20101223-nopython.patch.xz)
> and I should have stuck to the standard one?
> 

Before anything else, can you try the following (in FreeBSD):

1) Set vfs.zfs.recover=1 at the loader prompt (OK set vfs.zfs.recover=1)
2) Boot into single-user mode without opensolaris.ko and zfs.ko loaded
3) ( mount -w / ) to make sure you can remove and also write a new
zpool.cache as needed
4) Remove /boot/zfs/zpool.cache
5) kldload both zfs and opensolaris, i.e. ( kldload zfs ) should do the trick
6) Verify that vfs.zfs.recover=1 is set, then ( zpool import pool )
7) Give it a little time; monitor activity using Ctrl+T.
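
If it helps, here are those steps as one session (a sketch only; the
pool name "pool" is taken from your messages, and the tunable/module
names are the stock FreeBSD ones):

```shell
# At the loader prompt, before booting single-user:
#   OK set vfs.zfs.recover=1
#   OK boot -s
# Then, in single-user mode with the ZFS modules NOT yet loaded:
mount -w /                   # remount root read-write so zpool.cache can be rewritten
rm /boot/zfs/zpool.cache     # drop the stale cache file
kldload zfs                  # pulls in opensolaris.ko as a dependency
sysctl vfs.zfs.recover       # verify it reports 1
zpool import pool            # attempt the import; press Ctrl+T to watch progress
```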

You should have your pool back in working condition after this. The
reason why oi_127 can't work with your pool is that it cannot see
FreeBSD's generic labels. The only way to work around this for oi_127
would be to either point it directly at the replacing device or to use
actual slices or partitions for your slogs and other such devices.
Use adaNsN, gpt, or gptid labels when working with your pool if you
plan on using other OS's for recovery efforts.
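
For example (a hypothetical sketch; the device name ada2 and the label
slog0 are placeholders, adjust for your setup), putting the slog on a
labeled GPT partition instead of a bare disk looks like:

```shell
# Create a GPT scheme on the SSD and add one labeled freebsd-zfs partition:
gpart create -s gpt ada2
gpart add -t freebsd-zfs -l slog0 ada2
# Attach the partition to the pool as a log device by its GPT label,
# which other operating systems can also recognize:
zpool add pool log gpt/slog0
```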


Regards,

-- 

 jhell,v
