From owner-freebsd-hackers@FreeBSD.ORG Fri Jan 3 02:53:23 2014 Return-Path: Delivered-To: freebsd-hackers@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id B981932B; Fri, 3 Jan 2014 02:53:23 +0000 (UTC) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.freebsd.org (Postfix) with ESMTP id 97C091D8C; Fri, 3 Jan 2014 02:53:23 +0000 (UTC) Received: from Alfreds-MacBook-Pro.local (50-204-88-5-static.hfc.comcastbusiness.net [50.204.88.5]) by elvis.mu.org (Postfix) with ESMTPSA id 66D1F1A3C19; Thu, 2 Jan 2014 18:53:12 -0800 (PST) Message-ID: <52C62617.7020304@mu.org> Date: Thu, 02 Jan 2014 18:53:11 -0800 From: Alfred Perlstein User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: "Gumpula, Suresh" , Julian Elischer , "freebsd-hackers@freebsd.org" Subject: Re: Reference count race window References: <52C5ED3E.4020805@mu.org> <52C5F8A3.9000902@freebsd.org> <52C61088.3080703@mu.org> In-Reply-To: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: freebsd-hackers@freebsd.org X-Mailman-Version: 2.1.17 Precedence: list List-Id: Technical Discussions relating to FreeBSD List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 03 Jan 2014 02:53:23 -0000 On 1/2/14, 6:38 PM, Gumpula, Suresh wrote: > Hi Alfred, > I agree that there could have been an extra/invalid crfree() which decremented the count and looks valid crhold(acquire) from socket code panic'ed in my case. As per your suggestion if we > replace the assert with if condition in release, we will end up panicing when the actual crfree() happens. But we may not be knowing who crfree'ed() in the first and invalid place. Am I correct? I will try your suggestion. > > Can you please bit more explain your array trick ? I think the simplest thing to do would be to replace crfree with a macro to pass __FILE__ and __LINE__ down into the actual crfree (or you can use builtin_return_address to get the stack address of the caller instead. Then just either have an array of {file,line} tuples or instead just return addresses in the struct. Just extend struct ucred and add this: #define MAX_PREV 10 const char *files[MAX_PREV]; int lines[MAX_PREV]; Then hack your own version of refcount_release(), call it refcount_release2() but have it take a pointer to an integer that it will write the value of the old refcount into. Then you can use that return value like so: if (old_refcount < MAX_PREV) { cred->files[old_refcount] = pointer_to_file_name; cred->lines[old_refcount] = line_number; } Then add the assert. or better yet, turn on INVARIANTS (or maybe try each option in turn as INVARIANTS might hide the bug). When you crash you can then see the last few callers who did free inside the struct. Since the refcount "old_refcount" is atomically manipulated you should see the last few frees that send you negative. -Alfred > > Thanks > Suresh > > > > -----Original Message----- > From: owner-freebsd-hackers@freebsd.org [mailto:owner-freebsd-hackers@freebsd.org] On Behalf Of Alfred Perlstein > Sent: Thursday, January 02, 2014 8:21 PM > To: Gumpula, Suresh; Julian Elischer; freebsd-hackers@freebsd.org > Subject: Re: Reference count race window > > > On 1/2/14, 3:53 PM, Gumpula, Suresh wrote: >>>> Without changing the return-value semantics of refcount_acquire, we >>>> have introduced a panic if we detected a race as below. >>>> static __inline void >>>> refcount_acquire(volatile u_int *count) { >>>> u_int old; >>>> >>>> old = atomic_fetchadd_int(count, 1); >>>> if (old == 0) { >>>> panic("refcount_acquire race condition detected!\n"); >>>> } >>>>>>> so what is the stacktrace of the panic? >> It's from the socket code calling crhold. It's a non debug build( NO INVARIANTS ) >> >> #4 0xffffffff80331d34 in panic (fmt=0xffffffff805c1e60 >> "refcount_acquire race condition detected!\n") at >> ../../../../sys/kern/kern_shutdown.c:1009 >> #5 0xffffffff80326662 in refcount_acquire (count=) at >> ../../../../sys/sys/refcount.h:65 >> #6 crhold (cr=) at >> ../../../../sys/kern/kern_prot.c:1814 >> #7 0xffffffff803aa0d9 in socreate (dom=, >> aso=0xffffff80345c1b00, type=, proto=0, >> cred=0xffffff0017d7aa00, td=0xffffff000b294410) at >> ../../../../sys/kern/uipc_socket.c:441 >> #8 0xffffffff803b2e5c in socket (td=0xffffff000b294410, >> uap=0xffffff80345c1be0) at ../../../../sys/kern/uipc_syscalls.c:201 >> #9 0xffffffff80539ecb in syscall (frame=0xffffff80345c1c80) at >> ../../../../sys/amd64/amd64/trap.c:1260 >> > If it's a non-debug build then how do you know that someone isn't incorrectly lowering the refcount? > > Please try some invariants or at least manually turn on the one KASSERT I mentioned. > > Another trick would be to add a an array of char*+int for the last few places that decremented, you can use the returned refcount as an index to that array to track who may be doing the extra frees. > > -Alfred > > _______________________________________________ > freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-hackers > To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org" >