Date:      Fri, 30 Jul 2004 18:23:20 -0700 (PDT)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        Alexander Kabaev <kan@freebsd.org>
Cc:        freebsd-current@freebsd.org
Subject:   Re: boot2 -- Round 2
Message-ID:  <200407310123.i6V1NKHf085934@apollo.backplane.com>
References:  <20040730212843.GA33955@parodius.com> <20040731002713.GA6709@freefall.freebsd.org>

    I had a similar problem with boot2 displaying 0:ad(0,<garbage>) [garbage].

    The problem turned out to be boot1's fault.  boot1 is apparently
    responsible for clearing boot2's BSS, but it does not have a clear
    idea of where boot2's BSS is or, especially, how large it is.  boot1
    only clears to the end of the segment, and I'm not even sure that it
    calculates the *start* address properly.

    Now this might not be the same problem.  I've completely reorganized most
    of the boot code in DragonFly, and the mismatched BSS in our boot1/boot2
    might have been caused by something I did.  But the error was that boot1
    was not clearing enough of the BSS, which left a number of boot2's
    globals as garbage on startup, which in turn caused boot2 to believe
    that the partition and load path had already been set (to garbage).
    Hence the displayed garbage.
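
    To illustrate the failure mode, here is a minimal sketch in C.  The
    struct, its field names and the helper functions are hypothetical
    stand-ins, not boot2's actual globals; the only assumption is that the
    code treats a nonzero flag as "already set up".

```c
#include <string.h>

/* Hypothetical stand-in for boot2's disk state; not the real layout. */
struct dsk_state {
    unsigned unit;
    unsigned part;
    int init;           /* nonzero means "already set up" */
};

/* Simulate a BSS region that was never zeroed: fill it with stale bytes. */
void fill_stale(struct dsk_state *d)
{
    memset(d, 0xA5, sizeof *d);
}

/* Simulate a BSS region that was properly cleared before startup. */
void fill_cleared(struct dsk_state *d)
{
    memset(d, 0, sizeof *d);
}

/* Returns 1 if the code would trust the existing state, 0 otherwise. */
int state_looks_initialized(const struct dsk_state *d)
{
    return d->init != 0;
}
```

    With stale memory the check reports the state as initialized, so the
    defaults are never filled in and whatever bytes happen to be there get
    displayed.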

    The fix I made in DragonFly was to move the BSS clearing out of boot1
    and into i386/btx/lib/btxcsu.s where the size of the BSS is known.
    This required some surgery, however, because it bloated boot2 past
    its size limit (but also made boot1 smaller).
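
    The real change is in assembly in btxcsu.s and is not reproduced here,
    but the idea can be sketched in C under some assumptions.  The function
    name clear_bss is hypothetical, and in actual startup code the two
    bounds would come from linker-provided symbols whose names depend on
    the linker script.

```c
#include <stddef.h>

/*
 * Zero the half-open range [start, end).  Written as a plain loop because
 * startup code of this kind cannot assume a C library is available.
 * The caller, not this function, knows where the BSS begins and ends.
 */
void clear_bss(unsigned char *start, unsigned char *end)
{
    while (start < end)
        *start++ = 0;
}
```

    The point of doing this in the client's own startup code is that both
    bounds are known there at link time, so nothing has to be guessed by an
    earlier boot stage.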

    Actually if someone over in FreeBSD land is interested in cleaning up
    your boot code, I would recommend starting with DragonFly's and then
    making it work with FreeBSD again (which shouldn't be too difficult).
    Among other things I reorganized *ALL* the hardwired origins into a
    single header file and it is now possible to change most of them
    at will and still get something that works out of it.  The FreeBSD
    boot code has some historical issues which could cause interference
    with certain BIOSes, such as using 0x1000 as the top of the transfer
    stack and using other similarly nasty addresses that it probably
    shouldn't be using.  (That said, I still can't get either the FreeBSD
    or the DFly boot code to boot my Shuttle AMD64 boxes if the mouse is not
    plugged in.  The BIOS gets ultra confused over the amount of BIOS memory
    available and trashes the memory table... but boots Linux just fine.)

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>


:On Fri, Jul 30, 2004 at 02:28:43PM -0700, Jeremy Chadwick wrote:
:> So, in regards to the committed fix:
:> 
:> This seemed to fix the issue on one of my boxes (the one which was
:> flat-out panic'ing, not the one which was reporting 0:ad(0,`) as the
:> default slice to load /boot/loader from).  I'll refer to the one which
:> panic'd as "Box A" while the one which is doing the backtick as "Box B".
:> 
:> After pulling cvs down last night and rebuilding world+kernel+boot
:> blocks, running disklabel -B ad0s1, all on Box B, I found the machine
:> once again spitting out "Invalid partition", trying to load loader(8)
:> off of 0:ad(0,`) instead of 0:ad(0,a).  I double-checked boot2/Makefile
:> to see if -fno-unit-at-a-time was in place -- and it was.
:> 
:> I've tried using /boot/boot off of Box A and applying it to Box B using
:> disklabel -B -b /boot/box_b/boot ad0s1 to no avail.
:> 
:> It seems almost as if the boot2 code is broken in such a way that it
:> resembles an "off-by-one" error (ASCII 0x60 == `, ASCII 0x61 == a).
:> Why it's picking ` is beyond me...
:> 
:> Can someone shed some light as to how I can go about debugging this,
:> as well as mention how I can temporarily work around this?  Box B
:> happens to run mysqld, and is suffering from some issues mentioned on
:> freebsd-threads (re: machine randomly hard-locking), so it definitely
:> needs to be able to boot back up on its own without my intervention.
:> 
:> Thanks!
:Hi,
:
:I guess I would like to get your /boot/boot. The one I got simply works
:on all boxes in my home :(.
:
:As another option, you can try an alternative patch which was proposed
:by Tim Robbins. Since the problem was apparently caused by my going back to
:a static memcpy implementation, I am currently working on using the builtin
:memcpy as it was used before. I will post it later, after I've done some
:more testing, if things look good.
:
:--
:Alexander Kabaev
:
:======== Begin quote ==============
:
:After a few hours of head-scratching, I've tracked down the problem with
:boot2 and -funit-at-a-time, and come up with a patch that makes it work:
:
:==== //depot/user/tjr/freebsd-tjr/src/sys/boot/i386/boot2/boot2.c#7 - /home/tim/p4/src/sys/boot/i386/boot2/boot2.c ====
:@@ -139,7 +139,16 @@
: static int xgetc(int);
: static int getc(int);
: 
:-static void memcpy(void *, const void *, int);
:+/*
:+ * GCC 3.4 with -funit-at-a-time (implied by -Os) may use a non-standard
:+ * calling convention for static functions, using registers to pass arguments
:+ * instead of the stack. However, GCC may emit calls to memcpy() when a
:+ * program copies a struct with the assignment operator, and the code it
:+ * emits to call memcpy() uses the standard convention, not the register
:+ * convention. This means we must declare our memcpy() implementation "__used"
:+ * to disable the register calling convention.
:+ */
:+static void memcpy(void *, const void *, int) __used;
: static void
: memcpy(void *dst, const void *src, int len)
: {
:
:
:I think this is a bug in GCC; it should emit a warning if it's about to emit
:code to call memcpy(), but finds that memcpy() has a prototype that conflicts
:with the assumptions it makes.
:
:
:Tim
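
    For readers unfamiliar with the pattern in the quoted patch, here is a
    small sketch, assuming GCC or a compatible compiler.  The struct and
    the function names are hypothetical.  The copy routine is called
    copy_bytes rather than memcpy so that the example builds in a hosted
    environment, and __attribute__((used)) is written out directly; as far
    as I know that attribute is what FreeBSD's __used macro expands to, but
    treat that as an assumption.

```c
#include <stddef.h>

/* Hypothetical struct; stands in for any struct copied by assignment. */
struct dsk {
    unsigned drive, type, unit, slice, part;
};

/*
 * The "used" attribute tells the compiler the function must be emitted
 * even if it looks unreferenced.  The quoted patch relies on it to keep
 * the standard calling convention for a static function.
 */
static void copy_bytes(void *, const void *, int) __attribute__((used));

static void
copy_bytes(void *dst, const void *src, int len)
{
    const char *s = src;
    char *d = dst;

    while (len--)
        *d++ = *s++;
}

/* Struct assignment: the compiler may lower this to a call to memcpy(). */
void dsk_assign(struct dsk *dst, const struct dsk *src)
{
    *dst = *src;
}

/* The same copy made through an explicit call. */
void dsk_copy(struct dsk *dst, const struct dsk *src)
{
    copy_bytes(dst, src, (int)sizeof *dst);
}
```

    Both functions produce the same result.  The difference is that the
    call in dsk_assign, if one is generated, is emitted by the compiler
    itself rather than written in the source.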


