Date:      Mon, 24 Nov 2014 12:44:28 -0500
From:      "Ellis H. Wilson III" <ellisw@panasas.com>
To:        Benjamin Kaduk <kaduk@MIT.EDU>
Cc:        freebsd-current@freebsd.org
Subject:   Re: WITNESS observes 2 LORs on Boot of Release 10.1
Message-ID:  <54736E7C.80105@panasas.com>
In-Reply-To: <alpine.GSO.1.10.1411221548170.19231@multics.mit.edu>
References:  <546BA9D3.6070007@panasas.com> <alpine.GSO.1.10.1411181734520.19231@multics.mit.edu> <546BF3F5.8030109@panasas.com> <546FA1DD.2070109@panasas.com> <alpine.GSO.1.10.1411221548170.19231@multics.mit.edu>

On 11/22/2014 03:51 PM, Benjamin Kaduk wrote:
> On Fri, 21 Nov 2014, Ellis H. Wilson III wrote:
>
>> Before I start, and this is mainly geared to my responder Benjamin Kaduk,
>> based on your response, are you suggesting that the cnputc WITNESS panic you
>> expected to happen is now completely unavoidable in FreeBSD 10?  I.E., is this
>> a spinlock that WITNESS falls over each time but that is provably deadlock
>> free that the developers have decided cannot be BLESSED for some reason?
>
> https://lists.freebsd.org/pipermail/freebsd-current/2012-January/031316.html
> looks to be a better explanation than the previous link I sent ... in
> short, console output is hard.
>
>> I guess I just can't wrap my head around why we would ever move to a regime
>> where SKIPSPIN is the default for testing...  That just seems like an open
>> invitation for introducing spinlock regressions.
>
> I don't think anyone made the conscious decision to do that, it just
> happened by default as no one spent the time to fix the aforementioned
> issue.
>
>> Moving onto the LORs I'm seeing, a question I have as a newbie to WITNESS
>> debugging is how exactly to interpret the output if I see a stacktrace and
>> then a LOR output like the following:
>>
>> lock order reversal:
>>    1st 0xffffffff81633d88 entropy harvest mutex (entropy harvest mutex) @
>> /usr/src/sys/dev/random/random_harvestq.c:198
>>    2nd 0xffffffff813b6208 scrlock (scrlock) @
>> /usr/src/sys/dev/syscons/syscons.c:2682
>>
>> Does this mean WITNESS has already stored an ordering of #1 harvest_mtx then
>> #2 scp->scr_lock, and somewhere somebody tried to lock scp->scr_lock without
>> first getting harvest_mtx?  Or the reverse (WITNESS previously recorded
>> scrlock and then harvest and the lines it spit out were the offenders?)
>
> I believe it is the latter (the ordering being printed is the bad one
> which caused WITNESS to complain).

Thanks so much for the additional info, Ben.  This fleshes out the 
history of this issue for me significantly.  I have filed a bug for it at:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=195262

Xin Li was able to identify the ordering that caused the problem and 
proposed a possible patch to fix it.  I can confirm that I am now booting 
with WITNESS alone (i.e., without WITNESS_SKIPSPIN) and without a panic.

Thanks!

ellis