From: Jerome Herman
To: Jeremy Chadwick
Cc: freebsd-stable@freebsd.org
Date: Tue, 26 Jul 2011 14:50:28 +0200
Subject: Re: Making world but no kernel
Message-ID: <4E2EB814.9040704@dichotomia.fr>
In-Reply-To: <20110726114438.GA86683@icarus.home.lan>

On 26/07/2011 13:44, Jeremy Chadwick wrote:
> On Tue, Jul 26, 2011 at 01:04:04PM +0200, Jerome Herman wrote:
>> I would like to know if it is possible to rebuild world, but without
>> upgrading or even compiling the kernel.
>>
>> The problem is such : I am presently working on a FreeBSD station
>> that seems to have quite a lot of problem, notably with fsck.
>> I am starting to wonder whether this BSD station was properly installed,
>> or if some of the system tools were pasted from older FreeBSD setup.
>> Since the machine is in a remote location, I would prefer to avoid
>> full reinstall if possible. Among other things, single user mode is
>> not available.
>>
>> So I was wondering, if I get the full sources with sysinstall, can I
>> make buildworld and then installworld without going through the
>> kernel phase or would this be a bad idea ?
> Is it possible? Yes. Is it a bad idea? Generally yes. World and
> kernel effectively need to be "in sync"; some kernel binary structures
> (particularly for things like libkvm) need to be what userland binaries
> expect them to be. Nobody will be able to provide any support for this
> configuration.

I think kernel and world are already out of sync. This machine is a
pre-installed BSD from an ISP, and I have no clue how it was done. But I
suspect that world was not built or rebuilt properly. I of course got the
sources that match my kernel, and I plan to reinstall world just to make
sure it is in sync with the kernel.

>
> If you're trying to do things "in phases" because of this "fsck
> problem" (see below for more on that), then please be sure that after
> you rebuild world and reinstall world, that you DO NOT empty out
> /usr/obj before rebuilding kernel/reinstalling kernel. The kernel build
> does refer to things in /usr/obj which were built as a result of
> buildworld.

Yes, I know the entire compilation chain is in /usr/obj for make kernel.
So I won't touch it until I have a clearer picture of this box.

>
> All that said: can we please get some deeper insight as to this
> "problems with fsck" you're referring to? I'm of the strong opinion
> that it's better to try and solve the root cause of an issue than do
> "hackish stuff" like the above (though it's not that hackish, you get
> what I mean I hope).
> I don't understand how fsck would cause you a
> problem unless the machine is constantly losing power or has serious
> issues with its storage.

Neither one nor the other. I have a gvinum setup for the data disks. After
a forced reboot due to a power failure, the box would not come up. Booting
from the rescue drive, I realized that it refused to boot because it could
not mount the data partition (/dev/gvinum/data), and this in turn because
fsck would not work on said partition.

So I turned off the daemons, removed /dev/gvinum/data from fstab and booted
again. No problems.

Tried to fsck /dev/gvinum/data and got:

fsck: Could not determine filesystem type

fsck_ufs /dev/gvinum/data got stuck on phase 1 for 8 hours before I
hard-canceled it.

Trying to mount the drive resulted in:

mount: /dev/gvinum/data : Operation not permitted

gvinum list gives the following information:

3 drives:
D c                State: up    /dev/ad7    A: 1/1430799 MB (0%)
D a                State: up    /dev/ad5    A: 1/1430799 MB (0%)
D b                State: up    /dev/ad6    A: 1/1430799 MB (0%)

1 volume:
V data             State: up    Plexes:   2    Size: 2095 GB

2 plexes:
P data.p0        S State: up    Subdisks: 3    Size: 2095 GB
P data.p1        S State: up    Subdisks: 3    Size: 2095 GB

6 subdisks:
S data.p0.s0       State: up    D: a           Size:  698 GB
S data.p0.s1       State: up    D: b           Size:  698 GB
S data.p0.s2       State: up    D: c           Size:  698 GB
S data.p1.s0       State: up    D: c           Size:  698 GB
S data.p1.s1       State: up    D: a           Size:  698 GB
S data.p1.s2       State: up    D: b           Size:  698 GB

Then I did a newfs on the drive, which went well, and I was able to mount
it again without any problem.

Still testing, I decided to umount the drive and run fsck on it. The same
problems came back: plain fsck fails, fsck_ufs gets stuck on phase 1, and
mount returns "Operation not permitted".

Newfs again - no problems.
Mount again - no problems.

Destroyed the gvinum drive, made every disk into a standard UFS drive and
ran fsck on each of them: no problems.

Tried to create FreeBSD partitions with gvinum slices instead of using the
disks directly: same old, same old.
So here I am, starting to think that my disklabel and fsck are not in sync
with my kernel.

>
> Are you sure the problem, for example, isn't with the underlying storage
> device (disk)? If you aren't sure, would you like to verify that's
> not the problem piece? If so, please post some details like:
>
> * dmesg
> * Contents of /etc/fstab
> * sysctl kern.disks
>
> If the disks are backed by ata(4):
>
> * atacontrol list
> * atacontrol cap XXX (where XXX = each disk shown in kern.disks)
>
> If the disks are backed by ada(4) or are SCSI (da(4)):
>
> * camcontrol devlist
> * For ada(4) disks only: camcontrol identify XXX
> * For da(4) disks only: camcontrol inquiry XXX
>
> And regardless of if ata(4), ada(4), or da(4):
>
> * smartctl -a /dev/XXX (where XXX = each disk shown in kern.disks; this
>   will require you install ports/sysutils/smartmontools first)
>
> I can assist with the disk analysis portion in particular.
>
> And with regards to smartctl, please try to ensure the output doesn't
> get munged (forced line wrapping, newlines injected, etc.). It makes it
> more difficult to read. Put the output up on the web if you're worried
> about this.
>
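
For reference, the checklist quoted above could be gathered in one pass
with something like the sketch below. This is only a rough helper, not
anything from the thread: the function name diag_commands is made up, the
driver type (ata, ada or da) and the disk names are given by hand rather
than detected, and "cat /etc/fstab" is just one way to show the fstab
contents. It only prints the commands, so nothing is run until you decide
to.

```sh
#!/bin/sh
# Hypothetical helper: print (not run) the diagnostic commands from the
# checklist above for a given driver type and list of disks.
# Usage: diag_commands ata|ada|da disk...
diag_commands() {
    driver=$1
    shift

    # Requested regardless of the driver.
    echo "dmesg"
    echo "cat /etc/fstab"
    echo "sysctl kern.disks"

    # Bus-level listing, once per machine.
    case $driver in
        ata)    echo "atacontrol list" ;;
        ada|da) echo "camcontrol devlist" ;;
        *)      echo "unknown driver: $driver" >&2; return 1 ;;
    esac

    # Per-disk details, then SMART data for every disk.
    for disk in "$@"; do
        case $driver in
            ata) echo "atacontrol cap $disk" ;;
            ada) echo "camcontrol identify $disk" ;;
            da)  echo "camcontrol inquiry $disk" ;;
        esac
        echo "smartctl -a /dev/$disk"
    done
}
```

With the disks from the gvinum listing this would be called as
"diag_commands ata ad5 ad6 ad7". Piping the result through sh and
redirecting it to a file (diag_commands ata ad5 ad6 ad7 | sh > diag.txt 2>&1)
keeps the smartctl output from being rewrapped by a mail client.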