Date:      Tue, 12 Nov 2002 10:18:03 -0800 (PST)
From:      David Wolfskill <david@catwhisker.org>
To:        current@freebsd.org
Subject:   Weird error during "make installworld" [executable becomes "data"]
Message-ID:  <200211121818.gACII3oB052659@bunrab.catwhisker.org>

OK; this is a bit strange, and I've come up with a circumvention (read
"really ugly bloody hack"), but my real concern is that this may be a
manifestation or symptom of something broken in some subtle way.  The
note is rather long-winded; sorry about that, but I didn't see a better
way to do this.

Background:  I track -CURRENT (and -STABLE) daily on both an SMP "build
machine" and on my laptop.  Other than differences imposed by the
different hardware types, and the fact that the build machine is normally
run headless (with a serial console), the machines are set up fairly
similarly -- in particular, I mount /tmp on /dev/md10 on both machines.
Even so, I have never seen this problem on the build machine, but I have
been seeing it regularly on the laptop for the past week or so.

Here's an excerpt of the typescript from the "make installworld":

---%<----- snip! ------------------------------------------
>>> Installing everything..
...
===> share/examples
...
if [ -L /usr/share/examples/bootforth ]; then  rm -f /usr/share/examples/bootforth;  fi
if [ -L /usr/share/examples/cvsup ]; then  rm -f /usr/share/examples/cvsup;  fi
...
if [ -L /usr/share/examples/startslip ]; then  rm -f /usr/share/examples/startslip;  fi
if [ -L /usr/share/examples/sunrpc ]; then  rm -f /usr/share/examples/sunrpc;  fi
if [ -L /usr/share/examples/worm ]; then  rm -f /usr/share/examples/worm;  fi
mtree -deU   -f /usr/src/share/examples/../../etc/mtree/BSD.usr.dist -p /usr   : not found                                                                     : not found                                                                     : not found                                                                     : not found                                                                     : not found                                                                     : not found                                                                     : not found                                                                     : not found                                                                     : not found                                                                     : not found                                                                     : not found                                                                     : not found                   !
                                                  : not found                                                                     : not found
/tmp/install.O6fzOZh4/mtree: 62: Syntax error: "(" unexpected
*** Error code 2
---%<----- snip! ------------------------------------------

Now, watch this:

g1-9(5.0-C)[4] file /tmp/install.O6fzOZh4/* |grep data
/tmp/install.O6fzOZh4/mtree:    data
g1-9(5.0-C)[5] 


Huh??!?

The symptom seems to affect a small number of the executables that are
stuffed away in /tmp/install.* for execution during the "make
installworld".  They start out OK (yes, I verified this), but at some
point during the installworld, "file" stops identifying them as
executables and identifies them as mere "data".  This does not appear to
be merely a matter of file modes and flags:

g1-9(5.0-C)[5] ls -lo /tmp/install.O6fzOZh4/mtree
-r-xr-xr-x  1 root  wheel  - 27128 Nov 12 09:37 /tmp/install.O6fzOZh4/mtree

Rather, I suspect that the (swap-backed) image is getting corrupted in
some way:

g1-9(5.0-C)[6] hd -n 32 !$
hd -n 32 /tmp/install.O6fzOZh4/mtree
00000000  00 00 00 00 00 00 00 00  00 00 01 00 00 00 00 00  |................|
00000010  00 00 00 00 02 00 00 00  00 00 00 00 00 00 03 00  |................|
00000020
g1-9(5.0-C)[7] hd -n 32 /tmp/install.O6fzOZh4/mv
00000000  7f 45 4c 46 01 01 01 09  00 00 00 00 00 00 00 00  |.ELF............|
00000010  02 00 03 00 01 00 00 00  c0 80 04 08 34 00 00 00  |........À...4...|
00000020
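
The comparison above comes down to whether the first four bytes are the
ELF magic (7f 45 4c 46).  A minimal sketch of that check as a Bourne shell
function follows.  The function names are made up for illustration, and it
assumes every file in the directory is supposed to be an ELF executable,
so a shell script placed there would be reported as well.

```shell
#!/bin/sh
# is_elf FILE: succeed if FILE begins with the ELF magic 7f 45 4c 46.
# (Hypothetical helper name; not part of the installworld machinery.)
is_elf() {
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d '[:space:]')
    [ "$magic" = "7f454c46" ]
}

# report_non_elf DIR: print each regular file in DIR lacking the magic.
report_non_elf() {
    for f in "$1"/*; do
        if [ -f "$f" ] && ! is_elf "$f"; then
            echo "$f"
        fi
    done
}
```

For example, "report_non_elf /tmp/install.O6fzOZh4" would be expected to
list the clobbered mtree shown above and to stay silent about mv.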

The circumvention has had a 100% success rate in 3 tries so far.  It
consists of starting up the following in another window once
the "make installworld" is under way:

	while (1)
	  file /tmp/install.*/* | grep data && date && break; sleep 5
	end
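
A variation that might help pin down *when* a file first changes, rather
than only that "file" has started calling it data, is to record checksums
once the directory is populated and compare against them on each pass.
The following is only a sketch in Bourne shell: the function names and the
baseline file are hypothetical, and it assumes the POSIX cksum utility is
good enough to notice the corruption.

```shell
#!/bin/sh
# snapshot DIR: print one cksum line (CRC, size, path) per regular file.
# (Hypothetical helper names; nothing here is part of installworld.)
snapshot() {
    for f in "$1"/*; do
        if [ -f "$f" ]; then
            cksum "$f"
        fi
    done
}

# changed_since BASELINE DIR: print the current cksum lines that do not
# appear verbatim in BASELINE, i.e. files that are new or have changed.
changed_since() {
    snapshot "$2" | grep -Fxv -f "$1"
}
```

One possible use would be "snapshot /tmp/install.O6fzOZh4 > baseline" soon
after the installworld starts, with the baseline kept somewhere off the
swap-backed /tmp, and then "changed_since baseline /tmp/install.O6fzOZh4
&& date" inside a loop like the one above.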


Just *why* that appears to be effective is not something I can even
guess right now, but the fact that it is effective suggests to me that
there is something subtly broken somewhere that could bite us badly.

Sometimes it's mtree that gets clobbered; less often, it's zic.  These are
the only two victims I recall at present.

I may experiment with tweaking the part of installworld that creates the
/tmp/install* directory to make the directory & its contents immutable
as a different possible circumvention.  (In my case, since /tmp is
only swap-backed, I have a fair degree of confidence that anything
put there really is ephemeral, regardless of flags.)

In the case in question, I just blew away the old /tmp/install.O6fzOZh4
directory, fired up the "make installworld" again, then started the
above-cited loop.  And the "make installworld" has now finished,
apparently successfully.

And yes, I know that the "make installworld" should be done in single-
user mode.  I tried that; it does not appear to help.  I *think* I
also saw the symptom one time when I tried doing the installworld in
single-user mode without creating a separate swap-backed /tmp -- though
I confess I am not certain of that as of this writing.

I don't recall seeing similar symptoms being mentioned by anyone else.
Is it plausible that there's something weird about this laptop (Dell
Inspiron 5000e) that might contribute to these symptoms?

Thanks,
david
-- 
David H. Wolfskill				david@catwhisker.org
I have no confidence in results obtained through the use of Microsoft products.




