Date:      Mon, 18 Jun 2018 18:27:27 -0400
From:      Li-Wen Hsu <lwhsu@freebsd.org>
To:        Mark Millard <marklmi@yahoo.com>
Cc:        Bryan Drewery <bdrewery@freebsd.org>, FreeBSD Current <freebsd-current@freebsd.org>,  FreeBSD Toolchain <freebsd-toolchain@freebsd.org>
Subject:   Re: A head buildworld race visible in the ci.freebsd.org build history
Message-ID:  <CAKBkRUxAfXi81yw93ejcJVpXQ0JetaACFtuS8tFprQvMeWx75A@mail.gmail.com>
In-Reply-To: <BCD47660-EE57-490C-90E8-83FC3B720B09@yahoo.com>
References:  <74EAD684-0E0B-453A-B746-156777CF604A@yahoo.com> <1884103f-d1fb-aca6-2edd-062e11d05617@FreeBSD.org> <BCD47660-EE57-490C-90E8-83FC3B720B09@yahoo.com>

On Mon, Jun 18, 2018 at 5:04 PM Mark Millard via freebsd-toolchain
<freebsd-toolchain@freebsd.org> wrote:
>
> On 2018-Jun-18, at 12:42 PM, Bryan Drewery <bdrewery at FreeBSD.org> wrote:
>
> > On 6/15/2018 10:55 PM, Mark Millard wrote:
> >> In watching ci.freebsd.org builds I've seen a notable
> >> number of one time failures, such as (example from
> >> powerpc64):
> >>
> >> --- all_subdir_lib/libufs ---
> >> ranlib -D libufs.a
> >> ranlib: fatal: Failed to open 'libufs.a'
> >> *** [libufs.a] Error code 70
> >>
> >> where the next build works despite the change being
> >> irrelevant to whatever ranlib complained about.
> >>
> >> Other builds failed similarly:
> >>
> >> --- all_subdir_lib/libbsm ---
> >> ranlib -D libbsm_p.a
> >> ranlib: fatal: Failed to open 'libbsm_p.a'
> >> *** [libbsm_p.a] Error code 70
> >>
> >> and:
> >>
> >> --- kerberos5/lib__L ---
> >> ranlib -D libgssapi_spnego_p.a
> >> --- libgssapi_spnego.a ---
> >> ranlib -D libgssapi_spnego.a
> >> --- libgssapi_spnego_p.a ---
> >> ranlib: fatal: Failed to open 'libgssapi_spnego_p.a'
> >> *** [libgssapi_spnego_p.a] Error code 70
> >>
> >> and so on.
> >>
> >>
> >> It is not limited to powerpc64. For example, for aarch64
> >> there are:
> >>
> >> --- libpam_exec.a ---
> >> building static pam_exec library
> >> ar -crD libpam_exec.a `NM='nm' NMFLAGS=''  lorder pam_exec.o  | tsort -q`
> >> ranlib -D libpam_exec.a
> >> ranlib: fatal: Failed to open 'libpam_exec.a'
> >> *** [libpam_exec.a] Error code 70
> >>
> >> and:
> >>
> >> --- all_subdir_lib/libusb ---
> >> ranlib -D libusb.a
> >> ranlib: fatal: Failed to open 'libusb.a'
> >> *** [libusb.a] Error code 70
> >>
> >> and:
> >>
> >> --- all_subdir_lib/libbsnmp ---
> >> ranlib: fatal: Failed to open 'libbsnmp.a'
> >> --- all_subdir_lib/ncurses ---
> >> --- all_subdir_lib/ncurses/panelw ---
> >> --- panel.pico ---
> >> --- all_subdir_lib/libbsnmp ---
> >> *** [libbsnmp.a] Error code 70
> >>
> >>
> >> Even amd64 gets such:
> >>
> >> --- libpcap.a ---
> >> ranlib -D libpcap.a
> >> ranlib: fatal: Failed to open 'libpcap.a'
> >> *** [libpcap.a] Error code 70
> >>
> >> and:
> >>
> >>
> >> --- libkafs5.a ---
> >> ranlib: fatal: Failed to open 'libkafs5.a'
> >> --- libkafs5_p.a ---
> >> ranlib: fatal: Failed to open 'libkafs5_p.a'
> >> --- cddl/lib__L ---
> >> /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/lua/lbaselib.c:60:26: note: include the header <ctype.h> or explicitly provide a declaration for 'toupper'
> >> --- kerberos5/lib__L ---
> >> *** [libkafs5_p.a] Error code 70
> >>
> >> make[5]: stopped in /usr/src/kerberos5/lib/libkafs5
> >> --- libkafs5.a ---
> >> *** [libkafs5.a] Error code 70
> >>
> >> and:
> >>
> >>
> >> --- lib__L ---
> >> ranlib -D libclang_rt.asan_cxx-i386.a
> >> ranlib: fatal: Failed to open 'libclang_rt.asan_cxx-i386.a'
> >> *** [libclang_rt.asan_cxx-i386.a] Error code 70
> >>
> >>
> >> (Notice the variability in which .a files the ranlib runs fail on.)
> >>
> >>
> >>
> >>
> >>
> >
> >
> > I looked at this a few days ago and don't believe it's actually a build
> > race. I think there is something wrong with the ar/ranlib on that system
> > or something else. I've found no evidence of concurrent building of the
> > .a files in question.
>
>
> Looking at a bunch of the failures, spanning multiple
> FreeBSD-head-*-build types of builds, I see only:
>
> NODE_LABELS     bhyve_host butler1.nyi.freebsd.org jailer jailer_fast
> NODE_NAME       butler1.nyi.freebsd.org
>
> for the failures that I looked at.
>
> So your "on that system" might well be correct.

Thanks for the insight. The build is done in an 11.1-R jail on a
-CURRENT host.  butler1.nyi is running r333388 (as a canary) while the
other builders are mostly running r328278.  I upgraded a few others and
they seem to reproduce the issue, so I have now downgraded all the
build slaves to r328278 until we find the root cause.
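
For anyone who wants to repeat the "which node did it fail on" check
over more of the build history, here is a minimal sketch. It assumes
each console log has been saved as text with the job's NODE_NAME line
included in it (an assumption; the failing logs quoted above do not
show where that line comes from). The helper names summarize() and
failures_by_node() are hypothetical, made up for illustration only.

```python
import re
from collections import Counter

# Error text copied from the ranlib failures quoted in this thread.
RANLIB_FAILURE = re.compile(r"ranlib: fatal: Failed to open '([^']+)'")
# Assumed layout: a "NODE_NAME <host>" line somewhere in the saved log.
NODE_NAME = re.compile(r"^NODE_NAME\s+(\S+)", re.MULTILINE)


def summarize(log_text):
    """Return (node name or None, archives that ranlib failed to open)."""
    node = NODE_NAME.search(log_text)
    archives = RANLIB_FAILURE.findall(log_text)
    return (node.group(1) if node else None, archives)


def failures_by_node(logs):
    """Count ranlib open failures per node over an iterable of log texts."""
    counts = Counter()
    for text in logs:
        node, archives = summarize(text)
        if archives:
            counts[node or "unknown"] += len(archives)
    return counts


if __name__ == "__main__":
    # Sample assembled from lines quoted earlier in this thread.
    sample = (
        "NODE_NAME       butler1.nyi.freebsd.org\n"
        "--- all_subdir_lib/libufs ---\n"
        "ranlib -D libufs.a\n"
        "ranlib: fatal: Failed to open 'libufs.a'\n"
        "*** [libufs.a] Error code 70\n"
    )
    print(failures_by_node([sample]))
```

If every failure lands on hosts running the newer revision and none on
the r328278 ones, that points at the host rather than at a race in the
build itself.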

Li-Wen

--
Li-Wen Hsu <lwhsu@FreeBSD.org>
https://lwhsu.org


