From: Peter Steele
To: "freebsd-hackers@freebsd.org"
Date: Fri, 26 Mar 2010 07:33:10 -0500
Subject: Puzzling stack trace
Message-ID: <7B9397B189EB6E46A5EE7B4C8A4BB7CB3B5AACBE@MBX03.exg5.exghost.com>

I'm reposting this here since it's a pretty low-level discussion. Hopefully someone here can explain what's going on.

We had an app crash and the resulting core dump produced a very puzzling stack trace:

#0  0x00000008011d438c in thr_kill () from /lib/libc.so.7
#1  0x00000008012722bb in abort () from /lib/libc.so.7
#2  0x00000008011fb70c in malloc_usable_size () from /lib/libc.so.7
#3  0x00000008011fbb95 in malloc_usable_size () from /lib/libc.so.7
#4  0x00000008011fdaea in _malloc_thread_cleanup () from /lib/libc.so.7
#5  0x00000008011fdc86 in _malloc_thread_cleanup () from /lib/libc.so.7
#6  0x00000008011fc8e9 in malloc_usable_size () from /lib/libc.so.7
#7  0x00000008011fccc7 in malloc_usable_size () from /lib/libc.so.7
#8  0x00000008011ffe8f in malloc () from /lib/libc.so.7
#9  0x000000080127374b in memchr () from /lib/libc.so.7
#10 0x000000080125e6e9 in __srget () from /lib/libc.so.7
#11 0x00000008012352dd in vsscanf () from /lib/libc.so.7
#12 0x0000000801220087 in fscanf () from /lib/libc.so.7

This trace resulted from a call to fscanf(), as follows:

    char buffer[21];
    fscanf(in, "%20s", buffer);

We've verified that the data being read was correct, and clearly the buffer in which fscanf() is storing the string it reads is valid (i.e., it's not NULL). So what would lead this fscanf() call into calling abort()? Everything seems to be in order.
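
For reference, the read above can be reduced to a small self-contained sketch. This is only an illustration of the pattern, not our application code: the helper name read_token() is hypothetical, and the return-value check is added here for the sketch. It shows that "%20s" stores at most 20 characters plus the terminating NUL, so a 21-byte buffer is large enough.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not from the crashing application): read one
 * whitespace-delimited token of at most 20 characters from `in`.
 * `out` must point to at least 21 bytes (20 characters + NUL).
 * Returns the token length, or -1 on EOF, read error, or no match.
 */
static int read_token(FILE *in, char *out)
{
    assert(in != NULL && out != NULL);
    out[0] = '\0';
    /* fscanf() returns the number of conversions stored, or EOF. */
    if (fscanf(in, "%20s", out) != 1)
        return -1;
    return (int)strlen(out);
}
```

A token longer than 20 characters is not an error with this format: the conversion stops after 20 characters and the remainder is picked up by the next call.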

What's more puzzling to us is that we've looked for calls to malloc_usable_size() in the libc sources and although the function is defined we can find no direct call to the function in our FBSD 8 sources:

    $ grep -R 'malloc_usable_size' *|grep -v .svn
    libc/stdlib/Symbol.map: malloc_usable_size;
    libc/stdlib/Makefile.inc: malloc.3 realloc.3 malloc.3 reallocf.3 malloc.3 malloc_usable_size.3
    libc/stdlib/malloc.c:malloc_usable_size(const void *ptr)

That's it. Nothing calls this function from what we can tell. Even if something did call it, we don't understand why it would call abort(). It has an assert:

    malloc_usable_size(const void *ptr)
    {
        assert(ptr != NULL);
        return (isalloc(ptr));
    }

but the pointer we pass to fscanf() is clearly not NULL, so what pointer would this function be testing?

It's all very puzzling and we cannot reproduce this failure. We'd like to understand what happened though.
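
To make clear what the public malloc_usable_size() does when called directly, here is a minimal sketch. The helper name usable_size_for() is hypothetical, and the header selection is an assumption about the build platform (FreeBSD declares the function in <malloc_np.h>, glibc in <malloc.h>). It only illustrates the documented behaviour for a valid, non-NULL pointer; it does not explain the frames in the trace above.

```c
#include <assert.h>
#include <stdlib.h>
#ifdef __FreeBSD__
#include <malloc_np.h>  /* FreeBSD declares malloc_usable_size() here */
#else
#include <malloc.h>     /* assumption: glibc-style header elsewhere */
#endif

/*
 * Hypothetical helper: allocate `request` bytes, ask the allocator how
 * many bytes of that block are actually usable, free it, and return
 * that size. Returns 0 if the allocation fails.
 */
static size_t usable_size_for(size_t request)
{
    void *p = malloc(request);
    if (p == NULL)
        return 0;
    /* The pointer is non-NULL here, so the assert in the function holds. */
    size_t usable = malloc_usable_size(p);
    free(p);
    return usable;
}
```

The returned size is at least the requested size and may be larger, since the allocator rounds requests up to its own size classes.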