Date:      Sun, 23 Jun 2013 08:09:05 +0200
From:      Michael Gmelin <freebsd@grem.de>
To:        Dimitry Andric <dim@FreeBSD.org>
Cc:        "freebsd-ports@freebsd.org Ports" <freebsd-ports@freebsd.org>, Matthias Andree <mandree@FreeBSD.org>
Subject:   Re: Are ports supposed to build and run on 10-CURRENT?
Message-ID:  <20130623080905.17f77d71@bsd64.grem.de>
In-Reply-To: <C1CC40FC-4489-4164-96B7-5E1A25DCB37F@FreeBSD.org>
References:  <20130613031535.4087d7f9@bsd64.grem.de> <EF830CD7-00F1-4628-8515-76133BBE85E7@FreeBSD.org> <C1CC40FC-4489-4164-96B7-5E1A25DCB37F@FreeBSD.org>


On Sat, 22 Jun 2013 00:27:53 +0200
Dimitry Andric <dim@FreeBSD.org> wrote:

> On Jun 21, 2013, at 22:07, Dimitry Andric <dim@freebsd.org> wrote:
> > On Jun 13, 2013, at 03:15, Michael Gmelin <freebsd@grem.de> wrote:
> ...
> >> - system clang + std=c++11 + system libc++: Build fails, due to 
> >> a dependency (databases/db5) not building with those flags. It
> >> looks like a problem in libc++ to me, but I didn't have much time
> >> to investigate. It might be one of those things that might just go
> >> away after a while.
> > 
> > No, db5 does not build because it is redefining a C++11 standard
> > library identifier, atomic_init().  It should probably prefix all
> > its internal defines with 'db_', to avoid collisions.  The
> > db-5.3.21/src/dbinc/atomic.h file is already patched by our port to
> > avoid one such collision, but it is probably necessary to do this
> > again for any other identifiers that are used either by C++, or are
> > compiler builtins.
> 
> Attached is a diff to fix the db5 port, so it correctly builds with
> CXXFLAGS?=-std=c++11 -stdlib=libc++.  Matthias, could you please have
> a look at it?
> 
> After db5 (compiled with libc++) was installed, I retried devel/ice
> again, but the Slice/keyword test failed in exactly the same way.  If
> you comment out the "delete factoryTable" line in
> cpp/src/Ice/FactoryTableInit.cpp, the Slice/keyword test does succeed.
> 
> Many other tests after that also go well, and the next failure I get
> was:
> 
>   *** running tests 39/83
> in /usr/work/usr/ports/devel/ice/work/Ice-3.5.0/cpp/test/Ice/udp ***
> configuration: Default *** test started: 06/22/13 00:20:11
>   starting server #1... ok
>   starting server #2... Traceback (most recent call last):
>     File
> "/usr/work/usr/ports/devel/ice/work/Ice-3.5.0/cpp/test/Ice/udp/run.py",
> line 41, in <module> serverProc.append(TestUtil.startServer(server,
> "%d" % i , adapter="McastTestAdapter")) File
> "/usr/work/usr/ports/devel/ice/work/Ice-3.5.0/scripts/TestUtil.py",
> line 1396, in startServer return spawnServer(cmd, env = env, adapter
> = adapter, count = count, echo = echo,lang=config.lang,mx=config.mx)
> File
> "/usr/work/usr/ports/devel/ice/work/Ice-3.5.0/scripts/TestUtil.py",
> line 1131, in spawnServer server.expect("%s ready\n" % adapter) File
> "/usr/work/usr/ports/devel/ice/work/Ice-3.5.0/scripts/Expect.py",
> line 394, in expect raise e Expect.TIMEOUT: timeout exceeded in match
> pattern: "McastTestAdapter ready\n" buffer: "ControlAdapter ready
> Network.cpp:1701: Ice::SocketException: socket exception: Address
> already in use "
> 
>   ('test
> in /usr/work/usr/ports/devel/ice/work/Ice-3.5.0/cpp/test/Ice/udp
> failed with exit status', 256)
> 
> Whatever the source of this problem is, it is not very likely that is
> caused by a compiler or C++ library issue, but more likely some
> unexpected API change in sockets.
> 
> -Dimitry

Hi Dimitry,

I've been able to analyze the issue further and developed a fix (make
factoryTable's lifetime dependent on the lifetime of the generated _Init
classes). I filed a bug report upstream:

http://www.zeroc.com/forums/bug-reports/6030-ice-3-5-0-slice2cpp-generated-code-relies-static-initialization-order-crash.html#post26001

The second issue is a little trickier. Since I was building the port in
a jail, the UDP-based unit test didn't run, so I didn't notice it
failing. The failure is therefore not limited to CURRENT and, as it
turns out, is caused by a kernel bug.

Digging a little deeper into the problem, I noticed that there seems to
be a multicast problem in the kernel. Usually, setting SO_REUSEADDR on
a multicast address should implicitly set SO_REUSEPORT as well. Staring
at the kernel code, this only seems to work for the socket currently
being created; the cached flag of already listening multicast sockets
is not retrieved correctly, which makes me believe that the cached flag
is not initialized correctly when the socket is created. I think that
this problem might have been introduced a while ago in or after
revision 227428:

http://svnweb.freebsd.org/base?view=revision&revision=227428

Which MFC'd 227207

Which says:

"Cache SO_REUSEPORT socket option in inpcb-layer in order to avoid
inp_socket->so_options dereference when we may not acquire the lock on
the inpcb."

I did some testing using a small test program that listened to a
multicast address and noticed:
- First invocation SO_REUSEADDR, second invocation SO_REUSEADDR
  => fail
- First invocation SO_REUSEADDR, second invocation SO_REUSEPORT
  => fail
- First invocation SO_REUSEPORT, second invocation SO_REUSEADDR
  => success
- First invocation SO_REUSEPORT, second invocation SO_REUSEPORT
  => success
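
For readers without the attachment, the kind of helper such a test
program needs can be sketched as below. This is not the attached
multicast.c; the function name open_reuse_socket and its parameters are
hypothetical. It sets one reuse option before bind(), which is the
ordering the four cases above exercise.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper (an assumption, not the attached test program):
 * create a UDP socket, set one reuse option *before* bind(), then bind
 * to addr:port. optname is SO_REUSEADDR or SO_REUSEPORT.
 * Returns the bound descriptor, or -1 on any failure.
 */
int
open_reuse_socket(const char *addr, unsigned short port, int optname)
{
	struct sockaddr_in sin;
	int fd, on = 1;

	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	if (inet_pton(AF_INET, addr, &sin.sin_addr) != 1)
		return (-1);		/* not a valid IPv4 address */

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return (-1);

	if (setsockopt(fd, SOL_SOCKET, optname, &on, sizeof(on)) < 0) {
		perror("setting socket option");
		close(fd);
		return (-1);
	}
	if (bind(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
		perror("binding datagram socket");
		close(fd);
		return (-1);
	}
	return (fd);
}
```

Calling this from two separate processes with the same multicast group
and port (for example 239.255.0.1 and 12345, both arbitrary values) and
varying optname corresponds to one row of the list above.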

My gut feeling tells me that the code used to set the correct values
for inp_flags2 only works when setsockopt is called after the socket has
been bound, and testing verified that this is actually the case. Adding
an additional setsockopt call after calling bind makes things work as
expected. For comparison, I also tested in a VM running ancient 7.x,
where things worked as expected.
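
The workaround described above (repeating the setsockopt call once the
socket is bound) might look roughly like this. The function name
bind_then_reassert_reuse is hypothetical, and this is a sketch of the
idea rather than the code from the attached program.

```c
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Hypothetical sketch: set SO_REUSEADDR before bind() as usual, then
 * set it again after bind(). Per the observation above, the second
 * call is the one that makes later binds to the same multicast
 * address succeed on affected kernels.
 * Returns 0 on success, -1 on failure (errno is left set).
 */
int
bind_then_reassert_reuse(int fd, const struct sockaddr_in *sin)
{
	int on = 1;

	/* Conventional place for the option: before bind(). */
	if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0)
		return (-1);

	if (bind(fd, (const struct sockaddr *)sin, sizeof(*sin)) < 0)
		return (-1);

	/* Workaround: repeat the call now that the socket is bound. */
	if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0)
		return (-1);

	return (0);
}
```

On a kernel without the problem the second call should be redundant, so
a sketch like this is only a stopgap for application code.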

So... if this was just about this port, the obvious fix would be
patching devel/ice to use SO_REUSEPORT explicitly for multicast (which
actually makes that unit test run ok), but this seems more like an
issue that really needs to be fixed on a system level. Should I open a
PR, or can you take it from here?
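
For illustration, a port-level fix of the kind mentioned above could be
sketched as follows. This is not the actual devel/ice patch; the
function name set_reuse_for_addr is hypothetical, and the sketch
assumes IPv4 only.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Hypothetical sketch: always set SO_REUSEADDR, and additionally set
 * SO_REUSEPORT explicitly when the address to be bound is an IPv4
 * multicast address, instead of relying on the kernel to imply it.
 * Returns 0 on success, -1 on failure (errno is left set).
 */
int
set_reuse_for_addr(int fd, const struct in_addr *addr)
{
	int on = 1;

	if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0)
		return (-1);

#ifdef SO_REUSEPORT
	/* IN_MULTICAST() expects the address in host byte order. */
	if (IN_MULTICAST(ntohl(addr->s_addr)) &&
	    setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &on, sizeof(on)) < 0)
		return (-1);
#endif

	return (0);
}
```

The helper would be called before bind(); unicast addresses keep the
plain SO_REUSEADDR behaviour.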

Cheers,
Michael

p.s. - I attached the C program I used for testing and demonstrating
the problem:

[user@bsd64 /tmp]$ cc -o multicast multicast.c
[user@bsd64 /tmp]$ ./multicast reuseaddr &
[1] 9040
[user@bsd64 /tmp]$ ./multicast reuseaddr  
binding datagram socket: Address already in use
[user@bsd64 /tmp]$ killall multicast
...
[user@bsd64 /tmp]$ ./multicast reuseport &
[1] 9093
[user@bsd64 /tmp]$ ./multicast reuseaddr &
[2] 9098
[user@bsd64 /tmp]$ ./multicast reuseaddr &
[3] 9107
[user@bsd64 /tmp]$ binding datagram socket: Address already in use
[user@bsd64 /tmp]$ killall multicast
...
[user@bsd64 /tmp]$ ./multicast reuseaddrafter &
[1] 9132
[user@bsd64 /tmp]$ ./multicast reuseaddrafter &
[2] 9135
[user@bsd64 /tmp]$ ./multicast reuseaddrafter &
[3] 9136
[user@bsd64 /tmp]$ ./multicast reuseaddrafter &
[4] 9137
[user@bsd64 /tmp]$ ./multicast reuseport &
[5] 9140
[user@bsd64 /tmp]$ ./multicast reuseaddr &
[6] 9181
[user@bsd64 /tmp]$ ./multicast reuseaddr &
[7] 9184
[user@bsd64 /tmp]$ binding datagram socket: Address already in use


-- 
Michael Gmelin
