Date:      Wed, 17 Apr 2013 15:56:03 -0700
From:      Jeremy Chadwick <jdc@koitsu.org>
To:        Brooks Davis <brooks@FreeBSD.org>
Cc:        svn-src-stable@FreeBSD.org, svn-src-all@FreeBSD.org, src-committers@FreeBSD.org, svn-src-stable-9@FreeBSD.org
Subject:   Re: svn commit: r249549 - in stable/9/sys: amd64/conf i386/conf
Message-ID:  <20130417225603.GA13720@icarus.home.lan>
In-Reply-To: <20130417194706.GA30583@lor.one-eyed-alien.net>
References:  <201304161609.r3GG9SID009937@svn.freebsd.org> <20130416161919.GA80626@icarus.home.lan> <20130417125433.GC30222@caravan.chchile.org> <20130417193538.GB9331@icarus.home.lan> <20130417194706.GA30583@lor.one-eyed-alien.net>

On Wed, Apr 17, 2013 at 02:47:06PM -0500, Brooks Davis wrote:
> On Wed, Apr 17, 2013 at 12:35:38PM -0700, Jeremy Chadwick wrote:
> > On Wed, Apr 17, 2013 at 02:54:33PM +0200, Jeremie Le Hen wrote:
> > > Hi Jeremy,
> > > 
> > > On Tue, Apr 16, 2013 at 09:19:19AM -0700, Jeremy Chadwick wrote:
> > > > 
> > > > Now that this has been enabled by default, I should warn folks of a
> > > > caveat that I found in the buildworld/buildkernel framework.  It's
> > > > easiest to explain like this:
> > > > 
> > > > 1. Install FreeBSD 9.x, svn checkout of stable/9, etc...
> > > > 2. Add WITHOUT_CDDL=true to /etc/src.conf
> > > > 3. Rebuild + install kernel/world per src/Makefile procedure
> > > > 4. Remove WITHOUT_CDDL=true from /etc/src.conf
> > > > 5. rm -fr /usr/obj/*
> > > > 6. Rebuild world
> > > > 7. Rebuild kernel -- fails, stating "ctfconvert: not found".
> > > > 
> > > > For whatever reason the buildkernel bits make the assumption that
> > > > ctfconvert exists on the system (presumably in $PATH or possibly at a
> > > > hard-coded path), when ideally it should try to use the recently-built
> > > > version in /usr/obj first.
> > > 
> > > I've tested this in a freshly installed 9.1-RELEASE jail and I haven't
> > > been bitten by the bug you describe.
> > > 
> > > ctfconvert(1) seems to be installed by default in 9.1-RELEASE, which is
> > > probably why the problem didn't occur.  I can easily verify this in the
> > > jail:
> > > 
> > > % root@test9:/usr/src # ls -l /usr/bin/ctfconvert /usr/bin/vi /usr/bin/tail 
> > > % -r-xr-xr-x  1 root  wheel  371536 Dec  4 09:33 /usr/bin/ctfconvert
> > > % -r-xr-xr-x  1 root  wheel   19848 Apr 17 06:28 /usr/bin/tail
> > > % -r-xr-xr-x  6 root  wheel  346432 Apr 17 06:28 /usr/bin/vi
> > > 
> > > 
> > > Do you have a theory about why you've got the problem while I haven't?
> > > FYI, it seems 9.0-RELEASE also has ctfconvert(1):
> > > http://svnweb.freebsd.org/base/release/9.0.0/cddl/usr.bin/ctfconvert/
> > > 
> > > My guess is that this might happen if you don't have /usr/bin/ctfconvert.
> > > I've just removed it and am trying to build the kernel again.
> > 
> > I will spend some time to figure out exactly how to reproduce this.
> > 
> > Going from recent memory (~2 weeks ago), I encountered it on my VPS box
> > (which does run ntpd, just FYI):
> > 
> > 1. Initially installed with 9.1-RELEASE,
> > 2. Upgraded to stable/9 (using svn),
> > 3. WITHOUT_CDDL=true and WITHOUT_ZFS=true added to /etc/src.conf
> > 4. world/kernel rebuilt/reinstalled/etc. (this includes make delete-old,
> >    as per instructions in src/Makefile -- which would delete
> >    /usr/bin/ctfconvert)
> > 5. Fast forward many months
> > 6. Removed WITHOUT_CDDL=true from src.conf
> > 7. Encountered the above issue ("ctfconvert: not found") during
> >    buildkernel
> > 8. Rebuilt kernel again -- same error
> > 9. Removed WITHOUT_ZFS=true from src.conf
> > 10. Rebuilt kernel again -- worked
> > 
> > This could mean WITHOUT_ZFS=true has some bearing on this situation, but
> > I don't see how/why, as WITHOUT_CDDL is supposed to be the "trigger" for
> > ctf* utilities.  I did poke around the Makefiles and framework a bit
> > but didn't have any epiphanies.
> > 
> > Like I said -- I'll try to reproduce the exact scenario.
> 
> Looking at Makefile.inc1 around line 1164 (on HEAD), ctfconvert and
> ctfmerge are bootstrap tools only when they don't exist at all on the
> host.  The code there should be expanded to bootstrap for cases where
> the installed ones are known to be broken (virtually all prior versions
> given recent fixes) as well as when they aren't present on the host
> system.

I'm able to reproduce the issue I described with 100% reliability.
Jeremie, I'm not sure why you couldn't reproduce it.

Below are my notes.  This was done on a VMware Workstation VM instance,
and I took VM snapshots along the way, so I can "roll back" to almost
any phase/step listed below (in case someone wants me to verify what's
on the filesystem or use "script" to save a log somewhere or run "make
-DA" or something along those lines).

1. Installed 9.1-RELEASE.

2. Added ntpdate_enable and ntpd_enable to /etc/rc.conf, and ran
/etc/rc.d/ntpdate start ; /etc/rc.d/ntpd start.

3. Added the following to /etc/src.conf:

WITHOUT_CDDL=true
WITHOUT_CLANG=true
WITHOUT_INET6=true
WITHOUT_IPFILTER=true
WITHOUT_LIB32=true
WITHOUT_KERBEROS=true
WITHOUT_PAM_SUPPORT=true
WITHOUT_SENDMAIL=true
WITHOUT_ZFS=true
WITH_OPENSSH_NONE_CIPHER=true

4. Installed subversion via ports/pkg_add -r and pulled down a fresh
copy of stable/9 (src, r249561) and ports.

5. Verified that /usr/bin/ctfconvert and other CTF utilities exist on
system (obviously part of 9.1-RELEASE).

6. Followed instructions in src/Makefile.  Shown here:

#  1.  `cd /usr/src'       (or to the directory containing your source tree).
#  2.  `make buildworld'
#  3.  `make buildkernel KERNCONF=YOUR_KERNEL_HERE'     (default is GENERIC).
#  4.  `make installkernel KERNCONF=YOUR_KERNEL_HERE'   (default is GENERIC).
#       [steps 3. & 4. can be combined by using the "kernel" target]
#  5.  `reboot'        (in single user mode: boot -s from the loader prompt).
#  6.  `mergemaster -p'
#  7.  `make installworld'
#  8.  `make delete-old'
#  9.  `mergemaster'            (you may wish to use -i, along with -U or -F).
# 10.  `reboot'
# 11.  `make delete-old-libs' (in case no 3rd party program uses them anymore)

Had to do "mergemaster -p" prior to installkernel, but Brooks knows
about this and it's not relevant to the issue.

7. Verified the "make delete-old" phase deleted /usr/bin/ctfconvert and
other CTF utilities; this is normal.

8. Removed WITHOUT_CDDL=true from /etc/src.conf

9. rm -fr /usr/obj/* && cd /usr/src && make -j2 buildworld.

10. Verification of everything at this point -- ctf* utilities are
clearly in /usr/obj, but are not in /usr/bin (because of delete-old):

root@testbox:/usr/src # find /usr/obj -name "ctf*" -type f -perm 0755 -ls
1783448      728 -rwxr-xr-x    1 root             wheel              371738 Apr 17 15:29 /usr/obj/usr/src/cddl/usr.bin/ctfconvert/ctfconvert
1783453       64 -rwxr-xr-x    1 root             wheel               28706 Apr 17 15:29 /usr/obj/usr/src/cddl/usr.bin/ctfdump/ctfdump
1783472      168 -rwxr-xr-x    1 root             wheel               82711 Apr 17 15:29 /usr/obj/usr/src/cddl/usr.bin/ctfmerge/ctfmerge
root@testbox:/usr/src # find /usr/bin -name "ctf*" -type f -perm 0755 -ls
root@testbox:/usr/src #

11. make -j2 buildkernel fails:

*** [aac_disk.o] Error code 127
cc -c -O2 -frename-registers -pipe -fno-strict-aliasing  -std=c99 -g -Wall -Wredundant-decls -Wnested-externs -Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline -Wcast-qual  -Wundef -Wno-pointer-sign -fformat-extensions  -Wmissing-include-dirs -fdiagnostics-show-option   -nostdinc  -I. -I/usr/src/sys -I/usr/src/sys/contrib/altq -D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common -finline-limit=8000 --param inline-unit-growth=100 --param large-function-growth=1000  -fno-omit-frame-pointer -mcmodel=kernel -mno-red-zone -mno-mmx -mno-sse -msoft-float  -fno-asynchronous-unwind-tables -ffreestanding -fstack-protector -Werror  /usr/src/sys/cam/cam.c
ctfconvert -L VERSION -g cam.o
ctfconvert: not found
*** [cam.o] Error code 127
ctfconvert -L VERSION -g aac_cam.o
ctfconvert: not found
*** [aac_cam.o] Error code 127
2 errors
*** [all] Error code 2
1 error
*** [modules-all] Error code 2
2 errors
*** [buildkernel] Error code 2
1 error
*** [buildkernel] Error code 2
1 error
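
For anyone repeating the check from step 10, here is a minimal sh sketch
that reports, for each CTF tool, whether it is installed and whether a
freshly built copy exists.  The function name ctf_status and its two
optional arguments are made up for illustration; the obj layout
(<objdir>/cddl/usr.bin/<tool>/<tool>) is assumed from the find output
above and may differ on other trees.

```shell
#!/bin/sh
# ctf_status [BINDIR] [OBJDIR]  (hypothetical helper, not part of the src tree)
# For each CTF tool, report whether an executable copy exists in BINDIR
# (installed) and under OBJDIR (built by buildworld).
# Assumption: built tools live at OBJDIR/cddl/usr.bin/<tool>/<tool>,
# as in the find output from step 10.
ctf_status() {
    bindir=${1:-/usr/bin}
    objdir=${2:-/usr/obj/usr/src}
    for tool in ctfconvert ctfdump ctfmerge; do
        installed=no
        built=no
        [ -x "$bindir/$tool" ] && installed=yes
        [ -x "$objdir/cddl/usr.bin/$tool/$tool" ] && built=yes
        echo "$tool installed=$installed built=$built"
    done
}

# With no arguments this checks /usr/bin and /usr/obj/usr/src.
ctf_status "$@"
```

On a system in the state described in step 10, each tool would be
expected to show as built but not installed.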


I'll be doing the following to see exactly where the failure happens,
since parallel make sometimes produces confusing output:

rm -fr /usr/obj/* && make -j2 buildworld && make buildkernel

I doubt the parallelism has anything to do with the issue, however -- it
seems clear-cut to me that buildkernel assumes ctfconvert is in one's
$PATH, which is not true on a system that was built with
WITHOUT_CDDL=true and that you're now trying to move *to* having CDDL.
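
To make the lookup order being argued for concrete, here is a rough sh
sketch: prefer the copy built under /usr/obj, and only then fall back to
$PATH.  This is not how Makefile.inc1 does (or would do) it; the function
name resolve_tool and the obj layout are assumptions for illustration
only.

```shell
#!/bin/sh
# resolve_tool NAME [OBJDIR]  (hypothetical helper, not part of the src tree)
# Print the path to NAME, preferring a copy built under OBJDIR over
# whatever is on $PATH.  Returns 127 if neither exists, the same status
# the shell gives for "not found" in the buildkernel log.
# Assumption: built tools live at OBJDIR/cddl/usr.bin/NAME/NAME.
resolve_tool() {
    name=$1
    objdir=${2:-/usr/obj/usr/src}
    candidate="$objdir/cddl/usr.bin/$name/$name"
    if [ -x "$candidate" ]; then
        echo "$candidate"
        return 0
    fi
    if found=$(command -v "$name" 2>/dev/null); then
        echo "$found"
        return 0
    fi
    echo "$name: not found" >&2
    return 127
}

# Example: look for ctfconvert in the default obj tree, then in $PATH.
resolve_tool ctfconvert || echo "resolve_tool exited with status $?"
```

This only illustrates the search order; it does not change what
buildkernel itself runs.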

-- 
| Jeremy Chadwick                                   jdc@koitsu.org |
| UNIX Systems Administrator                http://jdc.koitsu.org/ |
| Mountain View, CA, US                                            |
| Making life hard for others since 1977.             PGP 4BD6C0CB |


