From: Jason Evans
Date: Mon, 7 May 2012 21:46:16 -0700
To: Steve Wills
Cc: current@FreeBSD.org
Subject: Re: <jemalloc>: jemalloc_arena.c:182: Failed assertion: "p[i] == 0"
References: <20120421185402.GH1743@albert.catwhisker.org> <7AD8956D-AD18-4CAB-9953-06E00185A7DA@freebsd.org>

On May 7, 2012, at 12:19 PM, Steve Wills wrote:

>> On Apr 21, 2012, at 11:54 AM, David Wolfskill wrote:
>>> After applying Dimitry Andric's patches to contrib/jemalloc and replacing
>>> /usr/bin/as with one built last Sunday, I was finally(!)
>>> able to rebuild head as of 234536:
>>>
>>> FreeBSD freebeast.catwhisker.org 10.0-CURRENT FreeBSD 10.0-CURRENT #797
>>> 234536M: Sat Apr 21 10:23:33 PDT 2012
>>> root@freebeast.catwhisker.org:/usr/obj/usr/src/sys/GENERIC i386
>>>
>>> However, as I was copying a /usr/obj hierarchy via tar -- e.g.:
>>>
>>> root@freebeast:/common/home/david # (cd /var/tmp && rm -fr obj && mkdir obj) && (cd /usr && tar cpf - obj) | (cd /var/tmp && tar xpf -)
>>>
>>> it ran for a while, then:
>>>
>>> <jemalloc>: jemalloc_arena.c:182: Failed assertion: "p[i] == 0"
>>> Abort (core dumped)
>>> root@freebeast:/common/home/david # echo $?
>>> 134
>>> root@freebeast:/common/home/david # ls -lTio *.core
>>> ls: No match.
>>> root@freebeast:/common/home/david #
>>>
>>> So ... no core file, apparently.
>>>
>>> freebeast(10.0-C)[2] find /usr/src/contrib/jemalloc -type f -name jemalloc_arena.c
>>> freebeast(10.0-C)[3]
>>>
>>> No file named "jemalloc_arena.c", either.
>>>
>>> But contrib/jemalloc/src/arena.c contains a function,
>>> arena_chunk_validate_zeroed():
>>>
>>> 175 static inline void
>>> 176 arena_chunk_validate_zeroed(arena_chunk_t *chunk, size_t run_ind)
>>> 177 {
>>> 178     size_t i;
>>> 179     UNUSED size_t *p = (size_t *)((uintptr_t)chunk + (run_ind << LG_PAGE));
>>> 180
>>> 181     for (i = 0; i < PAGE / sizeof(size_t); i++)
>>> 182         assert(p[i] == 0);
>>> 183 }
>>>
>>> Thoughts?
>>
>> I received a similar report yesterday in the context of filezilla, but
>> didn't get as far as reproducing it. I think the problem is in
>> chunk_alloc_dss(), which dangerously claims that newly allocated memory is
>> zeroed. It looks like I formalized this bad assumption in early 2010,
>> though the bug existed before that. It's a bigger deal now because sbrk()
>> is preferred over mmap(), so the bug has languished for a couple of years.
>> I'll get a fix committed today (and revert the order of preference
>> between sbrk() and mmap()).
>>
>> By the way, I wonder why not everyone hits this (I don't).
>
> I just now hit the same issue while using ports tinderbox. It was calling
> tar during the "makeJail" tinderbox subcommand and gave the same error as
> in the subject. Funny thing is I had run the same command (on a different
> "jail") right before this and didn't get the error. What's the status of
> this? Should I set MALLOC_PRODUCTION=yes in /etc/make.conf, rebuild world
> and forget about it?

How recent is your system?  This problem should have been fixed by
r234569, so if you're still seeing problems after that revision, there's
another problem we need to figure out.  (By the way, it's possible for an
application to trigger this assertion, but unlikely.)

Thanks,
Jason
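
For readers following the thread, the check that failed and the safer provider behaviour discussed in it can be sketched in C. This is only a minimal illustration under stated assumptions, not the change committed in r234569: the SKETCH_ macros and both functions are hypothetical names, and the 4 KiB page size is assumed (jemalloc derives PAGE from LG_PAGE). The first function mirrors the loop in arena_chunk_validate_zeroed() but returns a result instead of asserting. The second models a provider that cannot vouch for the contents of the memory it returns, so it clears the page whenever the caller asks for zeroed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed page size for this sketch only (4 KiB). */
#define SKETCH_LG_PAGE 12
#define SKETCH_PAGE ((size_t)1 << SKETCH_LG_PAGE)

/*
 * Return true if every word of the page starting at `page` is zero.
 * Same scan as the quoted validation loop, minus the assert().
 * `page` must be suitably aligned for size_t access.
 */
static bool
sketch_page_is_zeroed(const void *page)
{
	const size_t *p = (const size_t *)page;
	size_t i;

	for (i = 0; i < SKETCH_PAGE / sizeof(size_t); i++) {
		if (p[i] != 0)
			return (false);
	}
	return (true);
}

/*
 * Hypothetical provider for memory whose previous contents are unknown.
 * On entry, *zero says whether the caller needs zeroed memory. Rather
 * than assuming the page is already zero, clear it explicitly, so that
 * *zero == true on return is a fact and not a guess.
 */
static void *
sketch_provide_page(void *page, bool *zero)
{
	if (*zero)
		memset(page, 0, SKETCH_PAGE);
	return (page);
}
```

A caller would fill a page-sized, word-aligned buffer with junk, pass it to sketch_provide_page() with *zero set to true, and then expect sketch_page_is_zeroed() to return true. The design point is that the "zeroed" claim is made only after the provider has done the work to guarantee it.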