Date:      Tue, 29 Jul 2008 09:24:57 -0700
From:      George Hartzell <hartzell@alerce.com>
To:        freebsd-fs@FreeBSD.org, freebsd-current@freebsd.org
Cc:        Beat Gätzi <beat@chruetertee.ch>
Subject:   Re: ZFS patches. [Problem with root on zfs and upgrading]
Message-ID:  <18575.17497.248521.461931@almost.alerce.com>
In-Reply-To: <488E246D.2030508@chruetertee.ch>
References:  <20080727125413.GG1345@garage.freebsd.pl> <488E246D.2030508@chruetertee.ch>

Beat Gätzi writes:
 > Hi,
 >
 > Pawel Jakub Dawidek wrote:
 > > The patch above contains the most recent ZFS version that could be found
 > > in OpenSolaris as of today. Apart for large amount of new functionality,
 > > I belive there are many stability (and also performance) improvements
 > > compared to the version from the base system.
 >
 > Thanks for the great work!
 >
 > > Please test, test, test. If I get enough positive feedback, I may be
 > > able to squeeze it into 7.1-RELEASE, but this might be hard.
 >
 > I have a amd64 box with 8GB RAM running CURRENT-200806 snapshot. I get
 > the latest version of the sources with csup, applied your patch and
 > build the world/kernel.
 > /usr/src and /usr/obj are located on a zfs file system. After "make
 > installkernel" and reboot into single user mode I had to start the zfs
 > file system but it failed:
 >
 > # fsck
 > # mount -a
 > # /etc/rc.d/hostid start
 > Setting hostuuid: ...
 > Setting hostid: ...
 > # /etc/rc.d/zfs start
 > lock order reversal:
 >  1st 0xffffff0004832620 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2053
 >  2nd 0xffffffff80b09da0 kernel linker (kernel linker) @
 > /usr/src/sys/kern/kern_linker.c:693
 > KDB: stack backtrace:
 > db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
 > witness_checkorder() at witness_checkorder+0x609
 > _sx_xlock() at _sx_xlock+0x52
 > linker_file_lookup_set() at linker_file_lookup_set+0xe1
 > linker_file_register_sysctls() at linker_file_register_sysctls+0x20
 > linker_load_module() at linker_load_module+0x919
 > linker_load_dependencies() at linker_load_dependencies+0x1bc
 > link_elf_load_file() at link_elf_load_file+0xa96
 > linker_load_module() at linker_load_module+0x8cf
 > kern_kldload() at kern_kldload+0xac
 > kldload() at kldload+0x84
 > syscall() at syscall+0x1bf
 > Xfast_syscall() at Xfast_syscall+0xab
 > --- syscall (304, FreeBSD ELF64, kldload), rip = 0x80068561c, rsp =
 > 0x7fffffffec88, rbp = 0 ---
 > This module (opensolaris) contains code covered by the
 > Common Development and Distribution License (CDDL)
 > see http://opensolaris.org/os/licensing/opensolaris_license/
 > WARNING: ZFS is considered to be an experimental feature in FreeBSD.
 > ZFS filesystem version 11
 > ZFS storage pool version 11
 > internal error: out of memory
 > internal error: out of memory
 > internal error: out of memory
 > internal error: out of memory
 >
 > Running "zpool list" shows no available pool and the "internal error:
 > out of memory" error message.
 >
 > The same problem occurs in multi-user mode. loader.conf is set to:
 > vm.kmem_size_max="2147483648"
 > vm.kmem_size="2147483648"
 >
 > Increase/remove the kmem_size-values didn't change anything.
 >
 > To solve the problem I had to boot kernel.old and run make
 > installworld/mergemaster. After rebooting with the new kernel the pool
 > was available again and everything work without a problem.
 >
 > Did I do something wrong when I upgraded the server?

I'm being bitten by the problem that bit Beat, but worse.

I'm running a root-on-zfs system, built using variations of Yarema's
tools [ http://yds.coolrat.org/zfsboot.shtml ].  They do a great job of
rounding up and automating all of the little tips and tricks for
putting your root on a zfs filesystem.  You should read and understand
what they're doing, though; you'll probably need to adapt them a bit.

I moved a computer from -STABLE up to -CURRENT via csup and rebuilt
everything to convince myself that the upgrade went well.

Then I applied Pawel's patch (-p0 -E), and:

  make buildworld
  make buildkernel KERNCONF=BLUETOO
  make installkernel KERNCONF=BLUETOO

and rebooted.  I planned to drop down to single user and do the
mergemaster/installworld.
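
A rough sketch of that sequence, for reference.  The "post" half is the
usual order from /usr/src/UPDATING as best I recall it, not anything
specific to Pawel's patch, and the steps function is only a hypothetical
helper that prints the commands rather than running them:

```shell
#!/bin/sh
# Prints the source-upgrade steps in order; it does not execute them.
# "pre"  = run from /usr/src on the running system, then reboot to single user.
# "post" = run from /usr/src in single user mode under the new kernel.
# steps() is a hypothetical helper; BLUETOO is the kernel config named above.
steps() {
    case "$1" in
    pre)
        echo "make buildworld"
        echo "make buildkernel KERNCONF=BLUETOO"
        echo "make installkernel KERNCONF=BLUETOO"
        ;;
    post)
        echo "mergemaster -p"
        echo "make installworld"
        echo "mergemaster"
        ;;
    esac
}

steps pre
steps post
```

The window between the two halves, new kernel with old userland, is
where the trouble described below shows up.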

When I try to boot multi-user, things go south and it's clear that /usr
et al. are missing.

I can boot my new kernel single user and my root gets mounted from my
zpool, but none of my other zfs filesystems are mounted, and when I
try to run "zfs list" or "zpool status" I get the same "out of memory"
message that Beat sees.

The ZFS filesystem and pool are at version 11 (seen scrolling by on
the console).

I suspect that my newer kernel isn't cooperating with the older
userland utilities, which prevents the filesystems from being mounted.

I tried to boot from kernel.old, but I end up at the mountroot prompt
and can't mount my root.  Presumably, since my pool has been
automagically upgraded to version 11, I can no longer mount my root
using kernel.old, so Beat's end-run won't help me.

There's nothing I care about on the machine, just the time it took to
csup and build and such, so if I have to scrag it and start over it's
not the end of the world.

Maybe someone could make a patched copy of /sbin/zfs (and whatever
dependencies it has in /lib, etc.) available, and I could drop them
onto a usb key and use some combination of PATH and LD_LIBRARY_PATH to
use them to get my /usr etc. mounted?
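
Something along these lines is what I have in mind.  It's an untested
sketch: the mount point, the device name, and the sbin/ and lib/ layout
on the key are all guesses, and rescue_env is just a made-up helper
name.

```shell
# Assumed layout: the key is mounted at $RESCUE and holds patched copies
# of the zfs/zpool binaries under sbin/ and their libraries under lib/.
RESCUE=${RESCUE:-/mnt/usb}

# Put the patched binaries and libraries ahead of the installed ones.
rescue_env() {
    PATH="$RESCUE/sbin:$PATH"
    # Only append the old value if there was one, to avoid a trailing ":".
    LD_LIBRARY_PATH="$RESCUE/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export PATH LD_LIBRARY_PATH
}

# In single user mode on the broken box (device name is a guess):
#   mount -t msdosfs /dev/da0s1 "$RESCUE"
#   rescue_env
#   zfs mount -a
```

This only helps if the patched binaries are dynamically linked and
every library they need is present on the key.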

Or I could build up another machine to the same patched point, do the
buildworld and buildkernel, then use that to make a patched bootable
usb drive.  That'll take a while to free up the extra hardware though.

g.
