Date:      Mon, 30 Apr 2012 12:34:32 -0700
From:      Jason Evans <jasone@freebsd.org>
To:        Gustau Pérez i Querol <gperez@entel.upc.edu>
Cc:        avilla@freebsd.org, FreeBSD current <freebsd-current@freebsd.org>
Subject:   Re: RFC: jemalloc: qdbus sigsegv in malloc_init
Message-ID:  <2D080258-652B-4EFA-8F6F-6ECA3CA4404B@freebsd.org>
In-Reply-To: <4F9E9E06.4070004@entel.upc.edu>
References:  <4F9E9E06.4070004@entel.upc.edu>

On Apr 30, 2012, at 7:13 AM, Gustau Pérez i Querol wrote:
> The KDE team is seeing some strange problems with the new version
> (4.8.1) of devel/dbus-qt4 on current. It works on stable. I also
> suspect that the problem described below is affecting the experimental
> cinnamon port (an alternative to gnome3, a possible replacement for
> gnome2).
>
> The problem happens on both i386 and amd64 with an empty
> /etc/malloc.conf and a simple /etc/make.conf. Everything was compiled
> with base gcc (no clang). The kernel was compiled with no debug support,
> but it can be enabled if needed. There are reports from
> avilla@freebsd.org of the same behavior with a clang-compiled world and
> kernel and with MALLOC_PRODUCTION=yes.
>
> When qdbus starts, it segfaults. The backtrace of the problem with
> r234769 can be found here: http://pastebin.com/ryBXtqGF. When starting
> the qdbus daemon by hand in an X+twm session, we see it call calloc many
> times and, after a fixed number of calls, segfault. The segfault occurs
> in rb_gen (a quite large macro defined in
> $SRC_BASE/contrib/jemalloc/include/jemalloc/internal/rb.h).
>
> If the daemon is started by hand, I'm able to skip all the calls qdbus
> makes to calloc up to the one causing the segfault. At that point, in
> rb_gen, we don't know exactly what is going on or how to debug the
> macro. Ktrace dumps are available, but we were unable to find anything
> new in them.
>
> With old versions of current before the jemalloc imports (as of March
> 30th) the daemon segfaulted at malloc.c:2426. With revisions from
> April 20th to 24th (I can be more precise; it was during the jemalloc
> imports) the daemon segfaulted in malloc_init. Backtraces are available
> if needed, and if necessary I can go back to those revisions and
> recompile world+kernel to see how it behaves.
>
> Any help from freebsd-current@ (perhaps Jason Evans can help us) will
> be appreciated. Any additional info, like source revisions, can be
> provided. I would like to stress that the experimental devel/dbus-qt4
> works fine with recent stable.

The crash is happening in page run management, so there is some pretty
bad memory corruption going on by the time of the crash.  If I
understand you correctly, you have reproduced the crash on a system that
does *not* have MALLOC_PRODUCTION defined, which means that none of the
assertions in jemalloc caught the problem.

Adrian Chadd made the excellent suggestion of trying valgrind; it's
likely to point out the problem almost immediately.  If that doesn't
work, the utrace functionality in malloc may help you figure out what
activity has occurred by the time of the crash, and give you a better
understanding of what happened to memory around the address that is
involved in the crash.
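
A rough sketch of the two suggestions above, under some assumptions: a
FreeBSD system with the valgrind port installed, a libc jemalloc that
honors the utrace option through MALLOC_CONF, and qdbus in PATH.  The
function name trace_malloc and the file name malloc.ktrace are made up
for illustration.

```shell
# Sketch only; assumes FreeBSD, jemalloc with utrace support, and qdbus in PATH.
# trace_malloc and malloc.ktrace are hypothetical names chosen for this example.

# Record the allocator's utrace(2) events for one run of a program,
# then print them with kdump.
trace_malloc() {
    prog=$1
    out=${2:-malloc.ktrace}
    # "utrace:true" asks jemalloc to log each allocation event via utrace(2);
    # "ktrace -t u" records only those userland trace points.
    MALLOC_CONF="utrace:true" ktrace -t u -f "$out" "$prog"
    kdump -f "$out"
}

# First thing to try: run the daemon under valgrind's default memcheck tool.
#   valgrind qdbus
# If that does not help, capture the allocation history leading up to the crash.
#   trace_malloc qdbus
```

The last allocation records before the crash, compared with the faulting
address from the backtrace, may show which region was touched most
recently.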

Jason


