Date:      Sat, 23 Jan 2010 00:48:25 -0600
From:      Billy Newsom <billy@nlcc.us>
To:        freebsd-questions@freebsd.org
Subject:   Re: How to troubleshoot a frozen boot sequence
Message-ID:  <4B5A9BB9.2070801@nlcc.us>
In-Reply-To: <795fc2b81001221030n321c994cv9fd3c76b981fead0@mail.gmail.com>
References:  <4B59E61B.3090504@nlcc.us> <795fc2b81001221030n321c994cv9fd3c76b981fead0@mail.gmail.com>

Nathan Vidican wrote:
 > To me, it sounds like you have two issues to deal with here:
 >
 > #1 - booting off of the twed0 disk, what is your systems' BIOS currently
 > set to boot from, from the way you describe it's almost as if the system
 > is booting from ad0 - in which case yes, you will have to put a valid
 > boot config onto twed0

I think I have run into the old, common "SCSI vs. IDE" battle (the FreeBSD
Handbook still talks about it). Even though I set the drive controller
(twe = the 3ware SATA controller) as my first boot drive in the BIOS
(effectively 0x80, as I understand it), FreeBSD does not seem to pay attention
to the BIOS's numerical order. (See my reason below*.) It wants to find things
on ad0 and boot from that drive if it exists.

My supposition is that, since I had both twe0 and ad0 running during my 7.2
install, the right partitioning and MBR code were written to make it boot
AS-IS, but...

When the hardware is not as it is now, it freezes at the boot loader,
attempting to find ad0.

It is either

a. Finding ad0 in fstab and really wishing it was there
or
b. The bootstrap code is physically on ad0 and not on twed0, because
sysinstall never wrote it there.

I think it is b. If b, the boot process may be:

Stage 1: BIOS picks twe0 to be the first drive to attempt a boot.
Stage 2: MBR (boot0) -- located on twe0
Stage 3: boot1 -- located on twed0 (BTX Boot Loader?)
Stage 4: boot2 -- located on ad0 (FreeBSD/i386 bootstrap loader 1.1?)
Stage 5: Boot Loader -- shows menu on twed0s1a
Stage 6: Kernel boots up on twed0s1a

So when I remove ad0 to simulate a backup drive failure, stage 4 tries to run
a bootstrap loader from twed0 that is not there.

Stage 4: boot2 -- missing on twed0, system hangs.

I think this happens because the BTX loader may find and concatenate the BIOS
drives, get confused, and switch the boot to ad0 for just the one stage that
finishes the bootstrap.

I think one solution, next time, is to not install my backup drive until
sysinstall is long done! I think some of this is a sysinstall bug.

* My reason for saying that is my guess that sysinstall saw ad0 as something
important and included it in the boot chain. For example, when I was done
SLICING my drives in sysinstall, the silly thing got the "w" write command and
went out and made some (wrong) decisions under the assumption that ad0 would
NATURALLY (via BIOS) be part of the boot process. So the right code never got
written to twe0 in the right places. Sure, it got the whole kernel, and I told
it to install a standard FreeBSD MBR, but something must be missing on track 0.
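
One way to test the "something missing on track 0" guess is to look at the
first sector of each disk directly. The following is only a rough sketch in
Python, not a FreeBSD tool: the function name inspect_mbr and the dictionary
keys are made up for illustration. It relies only on the standard PC MBR
layout (512-byte sector, four 16-byte slice entries starting at offset 446,
and the 0x55 0xAA signature in the last two bytes). A passing signature check
says the sector looks like an MBR; it does not prove the boot code in it is
correct.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446      # start of the four 16-byte slice entries
FREEBSD_TYPE = 0xA5     # slice type byte used for FreeBSD slices


def inspect_mbr(sector: bytes) -> dict:
    """Summarize a raw 512-byte first sector (hypothetical helper)."""
    if len(sector) < SECTOR_SIZE:
        raise ValueError("need at least 512 bytes of sector data")

    # A usable MBR ends with the bytes 0x55 0xAA at offsets 510 and 511.
    signature_ok = sector[510:512] == b"\x55\xaa"

    slices = []
    for i in range(4):
        entry = sector[TABLE_OFFSET + 16 * i:TABLE_OFFSET + 16 * (i + 1)]
        slice_type = entry[4]
        if slice_type == 0:
            continue  # unused table entry
        # Bytes 8-15 hold the starting LBA and the sector count,
        # both little-endian 32-bit integers.
        start_lba, sectors = struct.unpack_from("<II", entry, 8)
        slices.append({
            "index": i + 1,                # s1..s4 in FreeBSD naming
            "active": entry[0] == 0x80,    # boot flag
            "type": slice_type,
            "start_lba": start_lba,
            "sectors": sectors,
        })

    return {"signature_ok": signature_ok, "slices": slices}


# Example use, assuming a dump saved to a file named twed0.mbr:
#   info = inspect_mbr(open("twed0.mbr", "rb").read())
#   print(info)
```

The input could be a dump taken with something like
"dd if=/dev/twed0 of=twed0.mbr bs=512 count=1" (the output file name is
arbitrary). Comparing the result for twed0 against the one for ad0 would show
whether twed0 has a valid signature and an active slice of type 0xA5 at all.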

 > #2 - you could add the flag 'noauto' to ad0 from within fstab - this
 > will allow the system to boot without mounting the disk (alleviating the
 > dreaded single-user-mode). Use a startup script in /usr/local/etc/rc.d
 > to then mount the disk if available on bootup. I've done similar setups
 > to this before where we were using external USB drives for backup and
 > weren't 100% sure they'd always be connected in the case a server might
 > be rebooted - worst case, you'll end up with it not mounted, but the
 > system will still be up at least.

I will give it a try. I need to do something to correct this second issue for 
certain. My ad0 is a good spare, but it's old.
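
For the fstab side of this, a minimal sketch of the suggested 'noauto' change
follows, written in Python purely for illustration. The function name
add_noauto and its device_prefix parameter are hypothetical. It appends
noauto to the options field of every non-swap entry whose device starts with
the given prefix and leaves everything else alone. Whether the ad0 swap line
also needs attention is not covered here.

```python
def add_noauto(fstab_text: str, device_prefix: str) -> str:
    """Return fstab text with 'noauto' added to matching non-swap entries.

    Hypothetical helper: matches on the device column only and rewrites
    a changed line with tab-separated fields.
    """
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        is_entry = len(fields) >= 4 and not fields[0].startswith("#")
        if (is_entry
                and fields[0].startswith(device_prefix)
                and fields[2] != "swap"):
            options = fields[3].split(",")
            if "noauto" not in options:
                options.append("noauto")
            fields[3] = ",".join(options)
            line = "\t".join(fields)
        out.append(line)
    return "\n".join(out) + "\n"


# Example use against a copy of the file, never the live one:
#   new_text = add_noauto(open("fstab.copy").read(), "/dev/ad0")
```

With my fstab, this would turn the /disk250 line into
"/dev/ad0s1d /disk250 ufs rw,noauto 2 2" and leave the twed0 entries as they
are. The mount itself would then have to come from a startup script, as
suggested above.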

 > --
 > Nathan Vidican
 > nathan@vidican.com <mailto:nathan@vidican.com>
 >
 >
 > On Fri, Jan 22, 2010 at 12:53 PM, Billy Newsom <billy@nlcc.us
 > <mailto:billy@nlcc.us>> wrote:
 >
 >     I am doing a test run on a production server. It has 2 hard drives.
 >
 >     ad0 (mounted on /disk250 in a single slice plus SWAP)
 >     twed0 (mounted on / /var /usr and a SWAP)
 >
 >     The twed0 is a hardware mirror and my main drive.
 >     ad0 is just for backups.
 >
 >     What the issue is, and you probably know where I'm heading. The boot
 >     process freezes if I remove the ad0 (to test a drive failure condition)
 >
 >     It freezes after saying:
 >     BTX boot loader.... etc.
 >
 >     FreeBSD/i386 bootstrap loader 1.1
 >     It spins for a second, then stops... unless I have ad0 in the computer.
 >     /boot/kernel/kernel text=0x7b03a0 data=0xcdee0 /
 >
 >     And it never gets to the boot menu.
 >
 >     So:
 >
 >     1. Should I put a new boot0config on the twed0 drive? If so do I
 >     boot from a CD to do that?
 >
 >     I need to potentially do something also to my disk labels and my
 >     fstab so that I don't boot to single user mode if drive ad0 fails. I
 >     haven't done this exact type of thing before, so I am looking for a
 >     little help.
 >
 >     my fstab:
 >     /dev/ad0s1b    none      swap    sw         0  0
 >     /dev/twed0s1b  none      swap    sw         0  0
 >     /dev/twed0s1a  /         ufs     rw         1  1
 >     /dev/ad0s1d    /disk250  ufs     rw         2  2
 >     /dev/twed0s1e  /tmp      ufs     rw         2  2
 >     /dev/twed0s1f  /usr      ufs     rw         2  2
 >     /dev/twed0s1d  /var      ufs     rw         2  2
 >     /dev/acd0      /cdrom    cd9660  ro,noauto  0  0
 >
 >
 >     I tried to read the MBR from the twed0 drive, and the program
 >     couldn't read it. The one from the ad0 drive is readable and I saved
 >     a copy of it.



