Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 9 Jun 2011 00:48:01 +0200
From:      Marius Strobl <marius@alchemy.franken.de>
To:        Peter Jeremy <peterjeremy@acm.org>
Cc:        freebsd-sparc64@freebsd.org
Subject:   Re: 'make -j16 universe' gives SIReset
Message-ID:  <20110608224801.GB35494@alchemy.franken.de>
In-Reply-To: <20110601231237.GA5267@server.vk2pj.dyndns.org>
References:  <20110526234728.GA69750@server.vk2pj.dyndns.org> <20110527120659.GA78000@alchemy.franken.de> <20110601231237.GA5267@server.vk2pj.dyndns.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, Jun 02, 2011 at 09:12:37AM +1000, Peter Jeremy wrote:
> On 2011-May-27 14:06:59 +0200, Marius Strobl <marius@alchemy.franken.de> wrote:
> >On Fri, May 27, 2011 at 09:47:28AM +1000, Peter Jeremy wrote:
> >> I tried a "make -j16 universe" using a recent 8-stable on a 16-CPU
> >> V890 and after about 11 minutes, I got the following.  This box
> 
> >It probably doesn't hurt to check the hardware with SunVTS though.
> 
> I ran a comprehensive SunVTS test for just under 24 hours without
> any errors and then checked that an unmodified kernel still SIR'd
> (which it did).
> 
> >ignores a certain amount of these). You could try whether the following
> >patch makes any difference to the SIR you're seeing:
> >http://people.freebsd.org/~marius/sparc64_intr_vector_stray.diff
> 
> Running -current, I didn't see any stray interrupts but I did get some
> "spin lock held too long" panics.  Unfortunately, I don't seem to get
> crashdumps - with DDB/KDB in the kernel, I get a hard hang - RSC
> break doesn't work and the only way out is a RSC reset.
> 
> spin lock 0xc07914d8 (sched lock) held by 0xfffff8a29749d980 (tid 100443) too long
> panic: spin lock held too long
> cpuid = 14
> KDB: stack backtrace:
> panic() at panic+0x1c8
> _mtx_lock_spin_failed() at _mtx_lock_spin_failed+0x50
> _mtx_lock_spin() at _mtx_lock_spin+0xac
> sched_add() at sched_add+0x9c
> sched_wakeup() at sched_wakeup+0x44
> setrunnable() at setrunnable+0xb8
> sleepq_resume_thread() at sleepq_resume_thread+0xfc
> sleepq_broadcast() at sleepq_broadcast+0xbc
> wakeup() at wakeup+0x18
> exit1() at exit1+0x10b0
> sys_exit() at sys_exit+0x10
> syscallenter() at syscallenter+0x2f4
> syscall() at syscall+0x30
> -- syscall (1, FreeBSD ELF64, sys_exit) %o7=0x128ad0 --
> userland() at 0x14b748
> user trace: trap %o7=0x128ad0
> pc 0x14b748, sp 0x7fdffffdb81
> 
> 
> spin lock 0xc07914d8 (sched lock) held by 0xfffff8a0168f3980 (tid 100312) too long
> spin lock 0xc07914d8 (sched lock) held by 0xfffff8a0168f3980 (tid 100312) too long
> spin lock 0xc07914d8 (sched lock) held by 0xfffff8a0168f3980 (tid 100312) too long
> panic: spin lock held too long
> spin lock 0xc07914d8 (sched lock) held by 0xfffff8a0168f3980 (tid 100312) too long
> spin lock 0xc07914d8 (sched lock) held by 0xfffff8a0168f3980 (tid 100312) too long
> cpuid = 3
> spin lock 0xc07914d8 (sched lock) held by 0xfffff8a0168f3980 (tid 100312) too long
> spin lock 0xc07914d8 (sched lock) held by 0xfffff8a0168f3980 (tid 100312) too long
> KDB: stack backtrace:
> spin lock 0xc07914d8 (sched lock) held by 0xfffff8a0168f3980 (tid 100312) too long
> spin lock 0xc07914d8 (sched lock) held by 0xfffff8a0168f3980 (tid 100312) too long
> pspin lock 0xc07914d8 (sched lock) held by 0xfffff8a0168f3980 (tid 100312) too long
> aspin lock 0xc07914d8 (sched lock) held by 0xfffff8a0168f3980 (tid 100312) too long
> nspin lock 0xc07914d8 (sched lock) held by 0xfffff8a0168f3980 (tid 100312) too long
> ispin lock 0xc07914d8 (sched lock) held by 0xfffff8a0168f3980 (tid 100312) too long
> cspin lock 0xc07914d8 (sched lock) held by 0xfffff8a0168f3980 (tid 100312) too long
> (spin lock 0xc07914d8 (sched lock) held by 0xfffff8a0168f3980 (tid 100312) too long
> ) at panic+0x1c8
> _mtx_lock_spin_failed() at _mtx_lock_spin_failed+0x50
> _thread_lock_flags() at _thread_lock_flags+0xf8
> statclock() at statclock+0x144
> handleevents() at handleevents+0x120
> timercb() at timercb+0x4c0
> tick_intr() at tick_intr+0x1fc
> -- interrupt level=0xe pil=0 %o7=0x5faa28 --
> userland() at 0x5faa40
> user trace: trap %o7=0x5faa28
> pc 0x5faa40, sp 0x7fdffffb5f1
> pc 0x5feab0, sp 0x7fdffffbf01
> pc 0x600310, sp 0x7fdffffc011
> pc 0x600628, sp 0x7fdffffc141
> pc 0x6035f4, sp 0x7fdffffc251
> pc 0x603b1c, sp 0x7fdffffc361
> pc 0x605b14, sp 0x7fdffffc471
> pc 0x5e6aa8, sp 0x7fdffffc551
> pc 0x19c594, sp 0x7fdffffc611
> pc 0x5a4aa0, sp 0x7fdffffc6d1
> pc 0x454e08, sp 0x7fdffffc791
> pc 0x4a65c0, sp 0x7fdffffc8c1
> pc 0x4a0558, sp 0x7fdffffc981
> pc 0x4a06c8, sp 0x7fdffffca41
> pc 0x1924e0, sp 0x7fdffffcb01
> pc 0x192684, sp 0x7fdffffcbc1
> pc 0x1926a4, sp 0x7fdffffcc81
> pc 0x1926a4, sp 0x7fdffffcd41
> pc 0x54c710, sp 0x7fdffffce01
> pc 0x144080, sp 0x7fdffffcec1
> pc 0x17d698, sp 0x7fdffffcf81
> pc 0x17e280, sp 0x7fdffffd041
> pc 0x148710, sp 0x7fdffffd101
> pc 0x1b9264, sp 0x7fdffffd1d1
> pc 0x1001d0, sp 0x7fdffffd291
> pc 0, sp 0x7fdffffd351
> done
> KDB: enter: panic

This might be due to the excessive use of sched_lock by SCHED_4BSD
and the MD code, f.e. more CPUs means less TLB contexts per CPU which
in turn means more flushes that are protect by sched_lock. It would
be great if the machine wouldn't lock up so you could check what
exactly is holding the mutex so long.
I think meanwhile I had a sound idea how to achieve the necessary level
of protection in the MD code using just atomic operations instead of
sched_lock, which further down would also allow the use of SCHED_ULE.
I've no idea when I'll have time to implement that though; here semester
finals are just 4 weeks away.

> 
> On -stable, I didn't see any stray interrupts but had the system hang
> several times, needing a RSC reset to recover.
> 
> >and once with:
> >	val &= ~DCR_SI;
> 
> I tried adding this and the system survived a "make -j30 universe" on
> -stable (BTW "make universe" seems to have issues cross-building x86
> derivatives).  I'm now trying that on -current.  I'm not sure what the
> implications of the above change are.
> 

What was the outcome of these tests?
When adding support for USIV+ I noticed that the dTLB parity error
reporting newly introduced with these CPUs triggered false positives
under the conditions described in cheetah_init(). By experimentation
I discovered that besides disabling that feature enabling single
issue instruction dispatch (that's what the above does) also avoids
that problem. The other mode is superscalar instruction dispatch,
which means that the CPU may issue new instructions before the
previous ones have completed, which obviously results in better
performance. Given that just disabling dTLB parity error reporting
completely avoids the problem without further side-effects at least
for USIV+ my conclusion was that the instruction dispatcher has some
bug which under certain conditions causes it to issue instructions
before those these depend on have finished. The other possibility
would be that FreeBSD misses some barriers etc but actually we rather
strictly obey these rules and if there would be an issue with these
that also should lead to problems in other cases (including when
disabling dTLB parity error reporting and leaving superscalar
dispatch enabled). Superscalar dispatch actually also dates back to
the USIII but the documented default for all US CPUs is single issue
dispatch and I currently can't remember whether the firmware of
USIII{,+,++,i} machines also enables the former.
Given that USIV+ are derived from USIV, which means they might have
inherited that bug, and the code paths leading to the false positives
are similar to those that can lead to a SIR if something unexpected
and non-correctable happens it seemed interesting to test whether
disabling superscalar dispatch also makes a difference here.

Nevertheless it also would be interesting to know if you end up
with a corrupt kernel stack with DDB, KDB and r222840 in place,
especially in case disabling superscalar dispatch doesn't solve
the problem.

Marius




Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20110608224801.GB35494>