Date:      Wed, 26 Jun 2013 13:31:49 +0200
From:      Michael Gmelin <freebsd@grem.de>
To:        Dimitry Andric <dim@FreeBSD.org>
Cc:        Brooks Davis <brooks@FreeBSD.org>, David Chisnall <theraven@freebsd.org>, "freebsd-ports@freebsd.org Ports" <freebsd-ports@freebsd.org>, Matthias Andree <mandree@FreeBSD.org>
Subject:   Re: Are ports supposed to build and run on 10-CURRENT?
Message-ID:  <20130626133149.4835f14a@bsd64.grem.de>
In-Reply-To: <51CAADB8.7090603@FreeBSD.org>
References:  <20130613031535.4087d7f9@bsd64.grem.de> <EF830CD7-00F1-4628-8515-76133BBE85E7@FreeBSD.org> <C1CC40FC-4489-4164-96B7-5E1A25DCB37F@FreeBSD.org> <20130626015508.426ab5b9@bsd64.grem.de> <51CAADB8.7090603@FreeBSD.org>

On Wed, 26 Jun 2013 11:00:40 +0200
Dimitry Andric <dim@FreeBSD.org> wrote:

> On 2013-06-26 01:55, Michael Gmelin wrote:
> ...
> > The problem is that static initialization happens in the expected
> > order (same translation unit), but termination does *not* happen in
> > the reverse order of initialization, which - according to the C++
> > standard section 3.6.3 should be guaranteed:
> >
> > "If the completion of the constructor or dynamic initialization of
> > an object with static storage duration is sequenced before that of
> > another, the completion of the destructor of the second is sequenced
> > before the initiation of the destructor of the first."
> >
> > The following conditions have to be met in order to show the
> > problem:
> >
> > 1. Two static objects defined in the translation unit containing
> >     main
> > 2. Definition of one of the underlying objects is in a separate
> >     source which is used to build a shared library
> > 3. Both the translation unit containing main and the one forming the
> >     shared library are compiled using -fPIC (or -fpic).
> 
> Strange, if you compile the main program without -fPIC, the testcase
> works correctly.  Normally, there should never be a need to compile a
> normal executable with -fPIC, but it should not break anything either.

Yep, strange indeed - my test cases didn't use -fPIC at first, so it
took a while to figure it out. It seems to be some sort of
combined link/runtime problem, since the same executable built on 10
runs fine on 9.1-RELEASE when copied over. I tried replacing various
system libraries with their versions from 9.1 in a jail to see if I
could make it run on 10, but without success.

By the way, the same code built on 9.1 using clang 3.1 or clang 3.3
runs fine on 10 as well, so the only case that does NOT work is built
on 10 and run on 10 using clang. Also, when I link copies of main.o and
libout.so that have been built on 10 on 9.1 using clang33, the problem
doesn't happen either. So it appears that the problem is introduced
when linking the executable if one of the objects is position
independent, and then only surfaces on 10.
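
The ordering guarantee under discussion can be illustrated in a single
file. This is only a sketch, not the test case from this thread: it does
not involve a shared library or -fPIC, so it is not expected to
reproduce the failure. The names Tracked, Reporter, is_reverse_order,
g_constructed and g_destroyed are made up for illustration, and a C++11
compiler is assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Event logs. Defined first in this translation unit, so they are
// constructed first and destroyed last. (Namespace-scope objects have
// static storage duration with or without the `static` keyword.)
std::vector<std::string> g_constructed;
std::vector<std::string> g_destroyed;

// True if `destroyed` is exactly `constructed` in reverse.
bool is_reverse_order(const std::vector<std::string>& constructed,
                      const std::vector<std::string>& destroyed) {
    return constructed.size() == destroyed.size() &&
           std::equal(constructed.begin(), constructed.end(),
                      destroyed.rbegin());
}

// Hypothetical helper: logs its own construction and destruction.
struct Tracked {
    std::string name;
    explicit Tracked(const char* n) : name(n) {
        assert(n != nullptr);
        g_constructed.push_back(name);
    }
    ~Tracked() { g_destroyed.push_back(name); }
};

// Defined before `first` and `second`, so it should be destroyed after
// both of them; its destructor reports the verdict at program exit.
struct Reporter {
    ~Reporter() {
        std::printf("destruction order %s\n",
                    is_reverse_order(g_constructed, g_destroyed)
                        ? "ok" : "VIOLATED");
    }
};

Reporter g_reporter;
Tracked first("first");
Tracked second("second");
```

To try it, add an empty `int main() {}` and build with something like
`c++ -std=c++11`. On a conforming toolchain the reporter should see
"second" destroyed before "first".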

Looking at the binaries using elfdump, it's obvious that there are quite
a few differences in the generated code of the example between the
versions, mostly offsets and sizes, but a few things are actually new in
the version linked on 10:

entry: 29
	st_name: crt_noinit_tag
	st_value: 0x400230
	st_size: 24
	st_info: STT_OBJECT STB_LOCAL
	st_shndx: 2

entry: 30
	st_name: finalizer
	st_value: 0x4007f0
	st_size: 85
	st_info: STT_FUNC STB_LOCAL
	st_shndx: 13

...
entry: 51
	st_name: __preinit_array_start
	st_value: 0x600c94
	st_size: 0
	st_info: STT_NOTYPE STB_LOCAL
	st_shndx: 18

entry: 52
	st_name: __fini_array_end
	st_value: 0x600c94
	st_size: 0
	st_info: STT_NOTYPE STB_LOCAL
	st_shndx: 18
...
entry: 55
	st_name: __preinit_array_end
	st_value: 0x600c94
	st_size: 0
	st_info: STT_NOTYPE STB_LOCAL
	st_shndx: 18

entry: 56
	st_name: __fini_array_start
	st_value: 0x600c94
	st_size: 0
	st_info: STT_NOTYPE STB_LOCAL
	st_shndx: 18

entry: 57
	st_name: __init_array_end
	st_value: 0x600c94
	st_size: 0
	st_info: STT_NOTYPE STB_LOCAL
	st_shndx: 18

entry: 58
	st_name: __init_array_start
	st_value: 0x600c94
	st_size: 0
	st_info: STT_NOTYPE STB_LOCAL
	st_shndx: 18

Based on this I would *speculate* that the problem first appeared when
r232832 was committed [1] and there's something wrong with the order in
which fini_array is filled by the linker (or traversed later).
Unfortunately I don't even have an instance/VM of 10 ready in which I
could experiment with hacking that deep into the system, and I don't
really have the expertise to do this efficiently anyway, so at this
point I'll have to leave it up to you.
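
For reference, the ELF convention behind the speculation above is that
init_array entries are called first to last and fini_array entries last
to first. The sketch below only models that traversal with an ordinary
array of function pointers. It is not FreeBSD's startup code, and
hook_fn, g_hooks, run_forward, run_reverse and trace_of are hypothetical
names. Note that in the dump above the start and end symbols of each
array share the address 0x600c94, so the arrays appear to be empty in
this executable. The empty range is handled below as well.

```cpp
#include <cassert>
#include <string>

typedef void (*hook_fn)();

std::string g_trace;  // records which hooks ran, in call order

void hook_a() { g_trace += 'a'; }
void hook_b() { g_trace += 'b'; }
void hook_c() { g_trace += 'c'; }

// Stand-in for a linker-filled table delimited by start/end symbols.
hook_fn const g_hooks[] = { hook_a, hook_b, hook_c };
hook_fn const* const g_hooks_start = g_hooks;
hook_fn const* const g_hooks_end = g_hooks + 3;

// Call entries first to last (the order used for init_array).
void run_forward(hook_fn const* start, hook_fn const* end) {
    assert(start <= end);
    for (hook_fn const* p = start; p != end; ++p) (*p)();
}

// Call entries last to first (the order used for fini_array).
void run_reverse(hook_fn const* start, hook_fn const* end) {
    assert(start <= end);
    for (hook_fn const* p = end; p != start; ) (*--p)();
}

// Clears the trace, runs the table in the requested direction and
// returns the resulting trace.
std::string trace_of(bool reverse) {
    g_trace.clear();
    if (reverse) run_reverse(g_hooks_start, g_hooks_end);
    else run_forward(g_hooks_start, g_hooks_end);
    return g_trace;
}
```

If the table were filled or walked in the wrong direction, the reverse
trace would no longer mirror the forward one, which is the kind of
mismatch suspected here.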

> 
> I am not sure what causes this, but I will investigate.  For now, the
> workaround is to compile only shared objects with -fPIC.

For many ports this is not a real option I'm afraid (simply too much
work to factor out the use of CXXFLAGS in various places). I assume
it's best to wait for a solution of the underlying problem.

> 
> -Dimitry


[1] http://svnweb.freebsd.org/base?view=revision&revision=232832



-- 
Michael Gmelin


