From owner-freebsd-current@FreeBSD.ORG Sat Jul 31 00:28:36 2004
Date: Sat, 31 Jul 2004 00:27:13 +0000
From: Alexander Kabaev
To: freebsd-current@freebsd.org
Message-ID: <20040731002713.GA6709@freefall.freebsd.org>
References: <20040730212843.GA33955@parodius.com>
In-Reply-To: <20040730212843.GA33955@parodius.com>
Subject: Re: boot2 -- Round 2
List-Id: Discussions about the use of FreeBSD-current

On Fri, Jul 30, 2004 at 02:28:43PM -0700, Jeremy Chadwick wrote:
> So, in regards to the committed fix:
>
> This seemed to fix the issue on one of my boxes (the one which was
> flat-out panic'ing, not the one which was reporting 0:ad(0,`) as the
> default slice to load /boot/loader from).  I'll refer to the one which
> panic'd as "Box A" while the one which is doing the backtick as "Box B".
>
> After pulling cvs down last night and rebuilding world+kernel+boot
> blocks, running disklabel -B ad0s1, all on Box B, I found the machine
> once again spitting out "Invalid partition", trying to load loader(8)
> off of 0:ad(0,`) instead of 0:ad(0,a).  I double-checked boot2/Makefile
> to see if -fno-unit-at-a-time was in place -- and it was.
>
> I've tried using /boot/boot off of Box A and applying it to Box B using
> disklabel -B -b /boot/box_b/boot ad0s1 to no avail.
>
> It seems almost as if the boot2 code is broken in such a way that it
> resembles an "off-by-one" error (ASCII 0x60 == `, ASCII 0x61 == a).
> Why it's picking ` is beyond me...
>
> Can someone shed some light as to how I can go about debugging this,
> as well as mention how I can temporarily work around this?  Box B
> happens to run mysqld, and is suffering from some issues mentioned on
> freebsd-threads (re: machine randomly hard-locking), so it definitely
> needs to be able to boot back up on its own without my intervention.
>
> Thanks!

Hi,

I guess I would like to get your /boot/boot. The one I have simply works
on all boxes in my home :(.

As another option, you can try an alternative patch which was proposed
by Tim Robbins. Since the problem was apparently caused by my going back
to a static memcpy implementation, I am currently working on using the
builtin memcpy as it was used before. I will post it later, after I've
done some more testing, if things look good.
-- 
Alexander Kabaev

======== Begin quote ==============

After a few hours of head-scratching, I've tracked down the problem with
boot2 and -funit-at-a-time, and come up with a patch that makes it work:

==== //depot/user/tjr/freebsd-tjr/src/sys/boot/i386/boot2/boot2.c#7 - /home/tim/p4/src/sys/boot/i386/boot2/boot2.c ====
@@ -139,7 +139,16 @@
 static int xgetc(int);
 static int getc(int);
 
-static void memcpy(void *, const void *, int);
+/*
+ * GCC 3.4 with -funit-at-a-time (implied by -Os) may use a non-standard
+ * calling convention for static functions, using registers to pass arguments
+ * instead of the stack. However, GCC may emit calls to memcpy() when a
+ * program copies a struct with the assignment operator, and the code it
+ * emits to call memcpy() uses the standard convention, not the register
+ * convention. This means we must declare our memcpy() implementation "__used"
+ * to disable the register calling convention.
+ */
+static void memcpy(void *, const void *, int) __used;
 static void
 memcpy(void *dst, const void *src, int len)
 {

I think this is a bug in GCC; it should emit a warning if it's about to emit
code to call memcpy(), but finds that memcpy() has a prototype that
conflicts with the assumptions it makes.

Tim
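
For anyone following along, here is a minimal hosted-C sketch of the situation the patch comment describes. It is an illustration only, not the boot2 code: the struct and the names copy_bytes, copy_by_assignment and copy_by_helper are made up for this example, and the helper is deliberately not called memcpy so it does not clash with the C library in a normal build. Whether a compiler really emits a memcpy() call for the struct assignment depends on the compiler, target and options.

```c
/* Hypothetical struct; big enough that a compiler may copy it via memcpy(). */
struct disk_state {
	unsigned int drive;
	unsigned int slice;
	unsigned int part;
	char buf[64];
};

/*
 * Stand-in for boot2's private memcpy(). The name copy_bytes is an
 * assumption of this sketch. __attribute__((used)) is the GCC attribute
 * that FreeBSD's __used macro is understood to expand to; per the patch
 * comment, it keeps GCC from switching a static function to a register
 * calling convention.
 */
static void copy_bytes(void *, const void *, int) __attribute__((used));
static void
copy_bytes(void *dst, const void *src, int len)
{
	const char *s = src;
	char *d = dst;

	while (len-- > 0)
		*d++ = *s++;
}

/* Struct assignment: the compiler may lower this to a memcpy() call. */
void
copy_by_assignment(struct disk_state *dst, const struct disk_state *src)
{
	*dst = *src;
}

/* Explicit byte copy through the helper, for comparison. */
void
copy_by_helper(struct disk_state *dst, const struct disk_state *src)
{
	copy_bytes(dst, src, (int)sizeof(*dst));
}
```

In boot2 the compiler-generated call from the assignment and the static definition have to agree on how arguments are passed; the sketch only shows the two copy paths side by side, and in a hosted build both produce the same result.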