Date: Fri, 3 Jan 2014 02:38:18 +0000 From: "Gumpula, Suresh" <Suresh.Gumpula@netapp.com> To: Alfred Perlstein <bright@mu.org>, Julian Elischer <julian@freebsd.org>, "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org> Subject: RE: Reference count race window Message-ID: <D29CB80EBA4DEA4D91181928AAF51538438C0F09@SACEXCMBX04-PRD.hq.netapp.com> In-Reply-To: <52C61088.3080703@mu.org> References: <D29CB80EBA4DEA4D91181928AAF51538438C0D8B@SACEXCMBX04-PRD.hq.netapp.com> <52C5ED3E.4020805@mu.org> <52C5F8A3.9000902@freebsd.org> <D29CB80EBA4DEA4D91181928AAF51538438C0DF8@SACEXCMBX04-PRD.hq.netapp.com> <52C61088.3080703@mu.org>
next in thread | previous in thread | raw e-mail | index | archive | help
Hi Alfred, I agree that there could have been an extra/invalid crfree() which d= ecremented the count and looks valid crhold(acquire) from socket code p= anic'ed in my case. As per your suggestion if we=20 replace the assert with if condition in release, we will end up panicing= when the actual crfree() happens. But we may not be knowing who crfree'= ed() in the first and invalid place. Am I correct? I will try your sugge= stion. Can you please bit more explain your array trick ? Thanks Suresh -----Original Message----- From: owner-freebsd-hackers@freebsd.org [mailto:owner-freebsd-hackers@freeb= sd.org] On Behalf Of Alfred Perlstein Sent: Thursday, January 02, 2014 8:21 PM To: Gumpula, Suresh; Julian Elischer; freebsd-hackers@freebsd.org Subject: Re: Reference count race window On 1/2/14, 3:53 PM, Gumpula, Suresh wrote: >>> Without changing the return-value semantics of refcount_acquire, we=20 >>> have introduced a panic if we detected a race as below. >>> static __inline void >>> refcount_acquire(volatile u_int *count) { >>> u_int old; >>> >>> old =3D atomic_fetchadd_int(count, 1); >>> if (old =3D=3D 0) { >>> panic("refcount_acquire race condition detected!\n"); >>> } >>>>>> so what is the stacktrace of the panic? > It's from the socket code calling crhold. It's a non debug build( NO IN= VARIANTS ) > > #4 0xffffffff80331d34 in panic (fmt=3D0xffffffff805c1e60=20 > "refcount_acquire race condition detected!\n") at=20 > ../../../../sys/kern/kern_shutdown.c:1009 > #5 0xffffffff80326662 in refcount_acquire (count=3D<optimized out>) at=20 > ../../../../sys/sys/refcount.h:65 > #6 crhold (cr=3D<optimized out>) at=20 > ../../../../sys/kern/kern_prot.c:1814 > #7 0xffffffff803aa0d9 in socreate (dom=3D<optimized out>,=20 > aso=3D0xffffff80345c1b00, type=3D<optimized out>, proto=3D0,=20 > cred=3D0xffffff0017d7aa00, td=3D0xffffff000b294410) at=20 > ../../../../sys/kern/uipc_socket.c:441 > #8 0xffffffff803b2e5c in socket (td=3D0xffffff000b294410,=20 > uap=3D0xffffff80345c1be0) at ../../../../sys/kern/uipc_syscalls.c:201 > #9 0xffffffff80539ecb in syscall (frame=3D0xffffff80345c1c80) at=20 > ../../../../sys/amd64/amd64/trap.c:1260 > If it's a non-debug build then how do you know that someone isn't incorrect= ly lowering the refcount? Please try some invariants or at least manually turn on the one KASSERT I m= entioned. Another trick would be to add a an array of char*+int for the last few plac= es that decremented, you can use the returned refcount as an index to that = array to track who may be doing the extra frees. -Alfred _______________________________________________ freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/l= istinfo/freebsd-hackers To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?D29CB80EBA4DEA4D91181928AAF51538438C0F09>