Skip site navigation (1)Skip section navigation (2)
Date:      Tue, 28 May 2013 15:29:46 -0600
From:      Ian Lepore <ian@FreeBSD.org>
To:        attilio@FreeBSD.org
Cc:        FreeBSD Current <freebsd-current@FreeBSD.org>, Ryan Stone <rysto32@gmail.com>, arao@freebsd.og
Subject:   Re: Incorrect comparison of ticks in deadlkres
Message-ID:  <1369776586.1258.19.camel@revolution.hippie.lan>
In-Reply-To: <CAJ-FndARggoG_scOWxzPNhJQA3foc_dW7-wtcm9b4_AG3OsVqg@mail.gmail.com>
References:  <CAFMmRNyQCs-yOB7gm4TRq3xcMp50PEJc0YNQLAjMs3q8iE-ZUw@mail.gmail.com> <CAJ-FndARggoG_scOWxzPNhJQA3foc_dW7-wtcm9b4_AG3OsVqg@mail.gmail.com>

next in thread | previous in thread | raw e-mail | index | archive | help
On Sun, 2013-05-26 at 02:53 +0200, Attilio Rao wrote:
> On Sat, May 25, 2013 at 11:55 PM, Ryan Stone <rysto32@gmail.com> wrote:
> > Currently deadlkres performs the following comparison when trying to check
> > for threads that have been blocked on a mutex or sleeping on an sx lock:
> >
> > if (TD_ON_LOCK(td) && ticks < td->td_blktick) {
> >     /* check for deadlock...*/
> 
> Yes the check looks indeed inverted.
> 
> >
> >
> > The test against ticks is incorrect.  It results in deadlkres only
> > signaling a deadlock after ticks has rolled over; at 1000 hz this will take
> > up to 49 days.  From looking at the history of the code this test appears
> > to be a attempt to deal with ticks rollover.  However this is necessary;
> > later on the code calculates the amount of time that has passed with:
> > tticks = ticks - td->td_blktick;
> >
> > ticks was designed to exploit integer underflow in the case of rollover to
> > guarantee that subtraction produces correct results in all cases (other
> > than a double rollover, of course).  I am going to remove the two incorrect
> > tests unless somebody can point out a overflow/underflow case that I
> > haven't considered.
> 
> I'm not sure I follow what are you saying.
> Assume that when thread td goes to sleep, ticks is very close to the
> 32 bits limit. Then thread td goes to sleep and td->td_blktick is set
> to a value very close to 32 bits limits.
> After a while deadlkres thread kicks in and in the while "ticks"
> counter overflowed, rolling back to a very low value. How are you
> supposed to compute a valid value from this situation?
> I think that you need to still guard about overflow of ticks for such cases.
> 

ticks is defined as a signed integer but conceptually it is unsigned --
it increments from 0 to UINT_MAX (not INT_MAX) then rolls over.  If
td->td_blktick is captured while ticks = UINT_MAX and later ticks has
rolled over and counted back up to 15, then ticks - td->td_blktick gives
an elapsed time of 16, as it should be.  Whether exploiting this
property of signed overflow is elegant or ugly is in the eye of the
beholder. :)

If the intent of the "ticks < td->td_blktick" is to avoid the deadlock
check until "after enough time has passed," then I guess it should
probably be something more like "(ticks - td->blktick) > SOME_THRESHOLD"
so that it also uses the signed overflow trick.

-- Ian





Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?1369776586.1258.19.camel>