From: George Hartzell
Date: Tue, 29 Jul 2008 09:24:57 -0700
To: freebsd-fs@FreeBSD.org, freebsd-current@freebsd.org
Cc: Beat Gätzi
Message-ID: <18575.17497.248521.461931@almost.alerce.com>
In-Reply-To: <488E246D.2030508@chruetertee.ch>
References: <20080727125413.GG1345@garage.freebsd.pl> <488E246D.2030508@chruetertee.ch>
Subject: Re: ZFS patches.
 [Problem with root on zfs and upgrading]

Beat Gätzi writes:
 > Hi,
 >
 > Pawel Jakub Dawidek wrote:
 > > The patch above contains the most recent ZFS version that could be found
 > > in OpenSolaris as of today. Apart for large amount of new functionality,
 > > I belive there are many stability (and also performance) improvements
 > > compared to the version from the base system.
 >
 > Thanks for the great work!
 >
 > > Please test, test, test. If I get enough positive feedback, I may be
 > > able to squeeze it into 7.1-RELEASE, but this might be hard.
 >
 > I have a amd64 box with 8GB RAM running CURRENT-200806 snapshot. I get
 > the latest version of the sources with csup, applied your patch and
 > build the world/kernel.
 > /usr/src and /usr/obj are located on a zfs file system. After "make
 > installkernel" and reboot into single user mode I had to start the zfs
 > file system but it failed:
 >
 > # fsck
 > # mount -a
 > # /etc/rc.d/hostid start
 > Setting hostuuid: ...
 > Setting hostid: ...
 > # /etc/rc.d/zfs start
 > lock order reversal:
 > 1st 0xffffff0004832620 ufs (ufs) @ /usr/src/sys/kern/vfs_subr.c:2053
 > 2nd 0xffffffff80b09da0 kernel linker (kernel linker) @
 > /usr/src/sys/kern/kern_linker.c:693
 > KDB: stack backtrace:
 > db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
 > witness_checkorder() at witness_checkorder+0x609
 > _sx_xlock() at _sx_xlock+0x52
 > linker_file_lookup_set() at linker_file_lookup_set+0xe1
 > linker_file_register_sysctls() at linker_file_register_sysctls+0x20
 > linker_load_module() at linker_load_module+0x919
 > linker_load_dependencies() at linker_load_dependencies+0x1bc
 > link_elf_load_file() at link_elf_load_file+0xa96
 > linker_load_module() at linker_load_module+0x8cf
 > kern_kldload() at kern_kldload+0xac
 > kldload() at kldload+0x84
 > syscall() at syscall+0x1bf
 > Xfast_syscall() at Xfast_syscall+0xab
 > --- syscall (304, FreeBSD ELF64, kldload), rip = 0x80068561c, rsp =
 > 0x7fffffffec88, rbp = 0 ---
 > This module (opensolaris) contains code covered by the
 > Common Development and Distribution License (CDDL)
 > see http://opensolaris.org/os/licensing/opensolaris_license/
 > WARNING: ZFS is considered to be an experimental feature in FreeBSD.
 > ZFS filesystem version 11
 > ZFS storage pool version 11
 > internal error: out of memory
 > internal error: out of memory
 > internal error: out of memory
 > internal error: out of memory
 >
 > Running "zpool list" shows no available pool and the "internal error:
 > out of memory" error message.
 >
 > The same problem occurs in multi-user mode. loader.conf is set to:
 > vm.kmem_size_max="2147483648"
 > vm.kmem_size="2147483648"
 >
 > Increase/remove the kmem_size-values didn't change anything.
 >
 > To solve the problem I had to boot kernel.old and run make
 > installworld/mergemaster. After rebooting with the new kernel the pool
 > was available again and everything work without a problem.
 >
 > Did I do something wrong when I upgraded the server?

I'm being bitten by the problem that bit Beat, but worse.

I'm running a root-on-zfs system, built using variations of Yarema's
tools [ http://yds.coolrat.org/zfsboot.shtml ]. They do a great job of
rounding up and automating all of the little tips and tricks about
putting your root on a zfs filesystem, but you should read and
understand what they're doing; you'll probably need to adapt them a
bit.

I moved a computer from -STABLE up to -CURRENT via csup and rebuilt
everything to convince myself that the upgrade went well. Then I
applied Pawel's patch (-p0 -E), and:

  make buildworld
  make buildkernel KERNCONF=BLUETOO
  make installkernel KERNCONF=BLUETOO

and rebooted. I planned to drop down to single user and do the
mergemaster/installworld.

When I try to boot multi user, things go south and it's clear that
/usr et al. are missing.

I can boot my new kernel single user and my root gets mounted from my
zpool, but none of my other zfs filesystems are mounted, and when I
try to run zfs list or zpool status I get the same out of memory
message that Beat sees. The ZFS filesystem and pool are at version 11
(seen scrolling by on the console).

I suspect that my newer kernel isn't cooperating with the older
userland utilities, which prevents the filesystems from being mounted.

I tried to boot from kernel.old, but I end up at the mountroot prompt
and can't mount my root. Presumably since my pool has been
automagically upgraded to version 11 I can no longer mount my root
using kernel.old, so Beat's end-run won't help me.

There's nothing I care about on the machine, just the time it took to
csup and build and such, so if I have to scrag it and start over it's
not the end of the world.

Maybe someone could make a patched copy of /sbin/zfs (and whatever
dependencies it has in /lib, etc...) available, and I could drop them
onto a usb key and use some combination of PATH and LD_LIBRARY_PATH to
use them to get my /usr etc... mounted?
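
Roughly what that PATH / LD_LIBRARY_PATH combination might look like,
as an untested sketch. The rescue_env function name, the /mnt/usb
mount point and the sbin/, bin/, lib/ layout on the key are all
assumptions made up for illustration; nobody has published such a key.

```sh
#!/bin/sh
# Sketch only: prefer patched binaries and libraries from a USB key
# over the stale userland on the root filesystem.
# The default mount point and directory layout are hypothetical.

rescue_env() {
    # $1: directory where the key is mounted (assumed default: /mnt/usb)
    rescue_root=${1:-/mnt/usb}

    # Search the patched tools before the old ones in /sbin and /bin.
    PATH="$rescue_root/sbin:$rescue_root/bin:$PATH"

    # Ask the runtime linker to try the patched libraries first,
    # keeping any existing LD_LIBRARY_PATH entries after them.
    LD_LIBRARY_PATH="$rescue_root/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

    export PATH LD_LIBRARY_PATH
}
```

With the key mounted, running "rescue_env /mnt/usb" and then
"zfs mount -a" from the same shell should pick up the patched zfs
and its libraries, provided the binaries on the key are dynamically
linked and match the patched kernel.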

Or I could build up another machine to the same patched point, do the
buildworld and buildkernel, then use that to make a patched bootable
usb drive. That'll take a while to free up the extra hardware though.

g.