Date: Tue, 10 Apr 2012 18:49:05 -0700
From: Rumen Telbizov <telbizov@gmail.com>
To: Andriy Gapon <avg@freebsd.org>
Cc: freebsd-stable@freebsd.org
Subject: Re: ZFS: can't read MOS
Message-ID: <CAENR%2B_V1yzmMBrPCjCQf1L6u9U%2B1U=qEzBN5GMbC1PLe7oLk2Q@mail.gmail.com>
In-Reply-To: <4F841AA8.3030602@FreeBSD.org>
References: <CAENR%2B_X6gb5TB01i3FTfq_zD=RyFUGfLAWwA56SNm6Gqf_49iw@mail.gmail.com> <4F841AA8.3030602@FreeBSD.org>

Thanks for your answers, guys.

Daniel:

I know it's uncommon to use ZFS on a hardware raid block device, but with
this machine I tried pretty much every other option before I got here.
Initially I was using an LSI 9211-8i HBA and raidz2, but after I lost more
than half of the disks a couple of times and had to rebuild the machine, I
gave up. Of course I realize it was probably a hardware issue, most likely
due to some incompatibility between the SAS expander and the HBA (both LSI
though).

Even before that I was using another LSI RAID controller which exported
every disk as an individual raid0 of 1 disk, with raidz2 on top of it. The
problem with this setup was that performance sucked. I guess it might have
had something to do with the hardware raid, since I've seen other cards
(3ware) behaving poorly in this kind of setup. And no, those LSI cards don't
seem to have any real pass-through mode. The JBOD thing is really just a
raid0 of 1 disk.

That's pretty much how I got to where I am, which turned out to be,
ironically, the most stable setup on that specific box! Until the power
outage. So yeah, I know what you're advising me, but unfortunately that's
where I am coming from. Having said that, I do have another 2 machines which
are single chassis, 36 disks, and both use 9211-8i HBAs with a straight
raidz2, and they've been perfectly fine. So it's only this one box where I
had to resort to hardware raid.

Nevertheless, thanks for the advice. I don't think I'll go down that road,
since that maneuver would require dropping some snapshots (not entirely
impossible) and, more importantly, I'd have to leave the array with no
redundancy, which I fear most since I have 48 disks! I'd need to take as
many disks as possible -- which is 6 x 2 == 12 -- and this would leave all
my current 6 groups without any redundancy whatsoever.

What I do intend to do, if everything else fails, is simply put 2 new disks
(inside the chassis, maybe old SSDs) in and move the bootfs zfs filesystem
onto them, leave the main pool as simple storage and forget about booting
from it. Way easier, safer and quicker. That's my last resort though.

Andriy:

Thanks a lot for pointing out that script. I saw it somewhere during my
search but my 8.2-STABLE doesn't have it. I took the files from svn and
managed to compile and run the script. I'll finish with it tomorrow and
I'll update the list with my findings. Having a better understanding of
*exactly* what is causing the problem is what I want first and foremost.
Hopefully this script will help me.

Thanks to both of you guys.

Cheers,
Rumen Telbizov

On Tue, Apr 10, 2012 at 4:34 AM, Andriy Gapon <avg@freebsd.org> wrote:

> on 09/04/2012 21:50 Rumen Telbizov said the following:
> > Hello everyone,
> >
> > I have a ZFS FreeBSD 8.2-STABLE (Aug 30, 2011) that I am having issues with
> > and could use some help.
> >
> > In a nutshell, this machine has been running fine for about a year and a
> > half but after a recent power
> > outage (complete colo blackout) I can't boot off the ZFS pool any more.
> > Here's the error I get (attached screenshot as well):
> >
> > ZFS: i/o error - all block copies unavailable
> > ZFS: can't read MOS
> > ZFS: unexpected object set type 0
> > ZFS: unexpected object set type 0
> >
> > FreeBSD/x86 boot
> > Default: zroot:/boot/kernel/kernel
> > boot: ZFS: unexpected object set type 0
> >
> > I've been searching the net high and low for an actual solution but all the
> > threads end up nowhere.
> > I hope I can get some clue here. Thanks in advance.
>
> Not sure if the following could be of any help to you but
> ${SRC}/tools/tools/zfsboottest utility can help diagnosing and debugging such
> issues from userland (without requiring a reboot).
>
> See also a small nitpick below.
>
> > Here's the relevant hardware configuration of this box (serves as a backup
> > box).
> >
> >    - SuperMicro 4U + another 4U totalling 48 x 2TB disks
> >    - Hardware raid LSI 9261-8i holding both shelves giving 1 mfid0 device
> >      to the OS
> >    - Hardware raid 60 -- 6 x 8 raid6 groups
> >    - ZFS with gptzfsboot installed on the "single" mfid0 device. Partition
> >      table is:
> >
> > [root@mfsbsd /zroot/etc]# gpart show -l
> > =>          34  140554616765  mfid0  GPT  (65T)
> >             34           128      1  (null)  (64k)
> >            162      33554432      2  swap  (16G)
> >       33554594  140521062205      3  zroot  (65T)
> >
> >    - boot device is: vfs.root.mountfrom="zfs:zroot" (as per loader.conf)
> >    - zpool status is:
> >
> > [root@mfsbsd /zroot/etc]# zpool status
> >   pool: zroot
> >  state: ONLINE
> >   scan: scrub canceled on Mon Apr 9 09:48:14 2012
> > config:
> >
> >         NAME        STATE     READ WRITE CKSUM
> >         zroot       ONLINE       0     0     0
> >           mfid0p3   ONLINE       0     0     0
> >
> > errors: No known data errors
> >
> >    - zpool get all:
> >
> > [root@mfsbsd /zroot/etc]# zpool get all zroot
> > NAME   PROPERTY       VALUE                SOURCE
> > zroot  size           65T                  -
> > zroot  capacity       36%                  -
> > zroot  altroot        -                    default
> > zroot  health         ONLINE               -
> > zroot  guid           3339338746696340707  default
> > zroot  version        28                   default
> > *zroot  bootfs         zroot                local*
> > zroot  delegation     on                   default
> > zroot  autoreplace    off                  default
> > zroot  cachefile      -                    default
> > zroot  failmode       wait                 default
> > zroot  listsnapshots  on                   local
> > zroot  autoexpand     off                  default
> > zroot  dedupditto     0                    default
> > zroot  dedupratio     1.00x                -
> > zroot  free           41.2T                -
> > zroot  allocated      23.8T                -
> > zroot  readonly       off                  -
> >
> > Here's what happened chronologically:
> >
> >    - Savvis Toronto blacked out completely for 31 minutes
> >    - After power was restored this machine came up with the above error
> >    - I managed to PXE boot into mfsbsd successfully and managed to import
> >      the pool and access actual data/snapshots - no problem
> >    - Shortly after another reboot the hardware raid controller complained
> >      that it has lost its configuration and now sees only half of the
> >      disks as foreign good and the rest as foreign bad. BIOS didn't see
> >      any boot device.
> >    - Spent some time on the phone with LSI and managed to restore the
> >      hardware RAID by basically removing any and all configuration, making
> >      disks unconfigured good and recreating the array in exactly the same
> >      way as I created it in the beginning BUT with the important exception
> >      that I did NOT initialize the array.
> >    - After this I was back to square one where I could see all the data
> >      without any loss (via mfsbsd) but cannot boot off the volume any more.
> >    - First thing I tried was to restore the boot loader without any luck:
> >      gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 mfid0p1
> >    - Then out of desperation, took zfsboot, zfsloader, gptzfsboot from
> >      9.0-RELEASE and replaced them in /boot, reinitialized again - no luck
> >    - Currently running zdb -ccv zroot to check for any corruption - I am
> >      afraid this will take forever since I have *23.8T* used space. No
> >      errors yet
> >    - One thing I did notice is that zdb zroot returned the metaslab
> >      information line by line very slowly (10-15 seconds a line). I don't
> >      know if it's related.
> >    - Another thing I tried (saw that in a thread) without any difference
> >      whatsoever was:
> >
> >    # cd src/sys/boot/i386/zfsboot
> >    # make clean; make cleandir
> >    # make obj ; make depend ; make
> >    # cd i386/loader
>
> You probably wanted to do this in i386/zfsloader
>
> >    # make install
> >    # cd /usr/src/sys/boot/i386/zfsboot
> >    # make install
> >    # sysctl kern.geom.debugflags=16
> >    # dd if=/boot/zfsboot of=/dev/da0 count=1
> >    # dd if=/boot/zfsboot of=/dev/da0 skip=1 seek=1024
> >    # reboot
> >
> > At this point I am contemplating how to evacuate all the data from there or
> > better yet put some USB flash to boot from.
> > I could provide further details/execute commands if needed. Any help would
> > be appreciated.
>
> --
> Andriy Gapon
>

--
Rumen Telbizov
http://telbizov.com