Skip site navigation (1)Skip section navigation (2)
Date:      Fri, 3 Jan 2014 02:38:18 +0000
From:      "Gumpula, Suresh" <Suresh.Gumpula@netapp.com>
To:        Alfred Perlstein <bright@mu.org>, Julian Elischer <julian@freebsd.org>, "freebsd-hackers@freebsd.org" <freebsd-hackers@freebsd.org>
Subject:   RE: Reference count race window
Message-ID:  <D29CB80EBA4DEA4D91181928AAF51538438C0F09@SACEXCMBX04-PRD.hq.netapp.com>
In-Reply-To: <52C61088.3080703@mu.org>
References:  <D29CB80EBA4DEA4D91181928AAF51538438C0D8B@SACEXCMBX04-PRD.hq.netapp.com> <52C5ED3E.4020805@mu.org> <52C5F8A3.9000902@freebsd.org> <D29CB80EBA4DEA4D91181928AAF51538438C0DF8@SACEXCMBX04-PRD.hq.netapp.com> <52C61088.3080703@mu.org>

next in thread | previous in thread | raw e-mail | index | archive | help
Hi Alfred,
      I agree that there could have been an extra/invalid  crfree() which d=
ecremented the count  and  looks valid  crhold(acquire) from socket code  p=
anic'ed in my case. As per your suggestion if we=20
 replace  the assert with if condition in release,  we will end up panicing=
  when the actual  crfree() happens.  But we may not be knowing who crfree'=
ed() in the first and invalid place.  Am I correct?   I will try your sugge=
stion.

Can you please bit more explain your array trick ?

Thanks
Suresh



-----Original Message-----
From: owner-freebsd-hackers@freebsd.org [mailto:owner-freebsd-hackers@freeb=
sd.org] On Behalf Of Alfred Perlstein
Sent: Thursday, January 02, 2014 8:21 PM
To: Gumpula, Suresh; Julian Elischer; freebsd-hackers@freebsd.org
Subject: Re: Reference count race window


On 1/2/14, 3:53 PM, Gumpula, Suresh wrote:
>>> Without changing the return-value semantics of refcount_acquire, we=20
>>> have introduced a panic if we detected a race as below.
>>> static __inline void
>>> refcount_acquire(volatile u_int *count) {
>>>           u_int old;
>>>
>>>           old =3D atomic_fetchadd_int(count, 1);
>>>           if (old =3D=3D 0) {
>>>             panic("refcount_acquire race condition detected!\n");
>>>           }
>>>>>> so what is the stacktrace of the panic?
> It's from the socket code calling crhold.   It's a non debug build( NO IN=
VARIANTS )
>
> #4  0xffffffff80331d34 in panic (fmt=3D0xffffffff805c1e60=20
> "refcount_acquire race condition detected!\n") at=20
> ../../../../sys/kern/kern_shutdown.c:1009
> #5  0xffffffff80326662 in refcount_acquire (count=3D<optimized out>) at=20
> ../../../../sys/sys/refcount.h:65
> #6  crhold (cr=3D<optimized out>) at=20
> ../../../../sys/kern/kern_prot.c:1814
> #7  0xffffffff803aa0d9 in socreate (dom=3D<optimized out>,=20
> aso=3D0xffffff80345c1b00, type=3D<optimized out>, proto=3D0,=20
> cred=3D0xffffff0017d7aa00, td=3D0xffffff000b294410) at=20
> ../../../../sys/kern/uipc_socket.c:441
> #8  0xffffffff803b2e5c in socket (td=3D0xffffff000b294410,=20
> uap=3D0xffffff80345c1be0) at ../../../../sys/kern/uipc_syscalls.c:201
> #9  0xffffffff80539ecb in syscall (frame=3D0xffffff80345c1c80) at=20
> ../../../../sys/amd64/amd64/trap.c:1260
>
If it's a non-debug build then how do you know that someone isn't incorrect=
ly lowering the refcount?

Please try some invariants or at least manually turn on the one KASSERT I m=
entioned.

Another trick would be to add a an array of char*+int for the last few plac=
es that decremented, you can use the returned refcount as an index to that =
array to track who may be doing the extra frees.

-Alfred

_______________________________________________
freebsd-hackers@freebsd.org mailing list http://lists.freebsd.org/mailman/l=
istinfo/freebsd-hackers
To unsubscribe, send any mail to "freebsd-hackers-unsubscribe@freebsd.org"



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?D29CB80EBA4DEA4D91181928AAF51538438C0F09>