Date:      Tue, 7 Jul 2009 03:18:10 +0200
From:      Attilio Rao <attilio@freebsd.org>
To:        Dan Naumov <dan.naumov@gmail.com>
Cc:        FreeBSD-STABLE Mailing List <freebsd-stable@freebsd.org>
Subject:   Re: 7.2-release/amd64: panic, spin lock held too long
Message-ID:  <3bbf2fe10907061818v245abd0cgc3ca5073cb93aea4@mail.gmail.com>
In-Reply-To: <cf9b1ee00907061812r3da70018i1c8d8d12bb038a80@mail.gmail.com>
References:  <cf9b1ee00907061812r3da70018i1c8d8d12bb038a80@mail.gmail.com>

2009/7/7 Dan Naumov <dan.naumov@gmail.com>:
> I just got a panic followed by a reboot a few seconds after running
> "portsnap update", /var/log/messages shows the following:
>
> Jul  7 03:49:38 atom syslogd: kernel boot file is /boot/kernel/kernel
> Jul  7 03:49:38 atom kernel: spin lock 0xffffffff80b3edc0 (sched lock
> 1) held by 0xffffff00017d8370 (tid 100054) too long
> Jul  7 03:49:38 atom kernel: panic: spin lock held too long

That's a known bug, and it affects -CURRENT as well.
The cpustop IPI is delivered through an NMI, which means it can
interrupt a CPU at any moment, even while that CPU holds a spin lock,
violating the well-known FreeBSD rule that a CPU is not interrupted
while it holds a spin lock.
That means a CPU can stop itself while its thread holds the sched lock
spin lock and never release it (short of highly hackish approaches,
there is no way to fix that).
In the meantime hardclock() wants to schedule something else to run
and gets stuck on the thread lock.
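
To make that sequence concrete, here is a toy, single-threaded model in
Python. It is only a sketch, not kernel code: SpinLock, SPIN_LIMIT and
hardclock_on_cpu0() are hypothetical names made up for illustration, and
the spin limit merely stands in for whatever threshold the kernel uses
before it reports "spin lock held too long".

```python
# Toy model of the deadlock described above. All names are hypothetical
# and do not correspond to FreeBSD kernel APIs.
SPIN_LIMIT = 1000  # stand-in for the kernel's "held too long" threshold


class SpinLock:
    def __init__(self, name):
        self.name = name
        self.owner = None  # CPU id currently holding the lock, or None

    def try_acquire(self, cpu):
        """Take the lock if it is free; return True on success."""
        if self.owner is None:
            self.owner = cpu
            return True
        return False

    def release(self, cpu):
        assert self.owner == cpu, "only the owner may release"
        self.owner = None


def hardclock_on_cpu0(nmi_stops_cpu1):
    """Model CPU 0 trying to schedule while CPU 1 may have been stopped."""
    sched_lock = SpinLock("sched lock 1")

    # CPU 1 enters a short critical section under the sched lock.
    sched_lock.try_acquire(1)
    if not nmi_stops_cpu1:
        # Normal case: the critical section finishes and the lock is freed.
        sched_lock.release(1)
    # Otherwise the stop NMI parks CPU 1 right here, so the release
    # above never runs and the lock stays owned by CPU 1.

    # CPU 0's hardclock() now spins on the same lock, with a bounded wait.
    for _ in range(SPIN_LIMIT):
        if sched_lock.try_acquire(0):
            sched_lock.release(0)
            return "scheduled"
    return "panic: spin lock held too long"
```

Calling hardclock_on_cpu0(False) models the normal path, while
hardclock_on_cpu0(True) models CPU 1 being stopped with the lock held,
which is the case that ends in the panic string.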

The ideal fix would involve not using an NMI to deliver the cpustop,
while having a cheap way (one that does not make the common path too
expensive) to tell hardclock() to avoid scheduling while a cpustop is
in flight.
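
As a rough illustration of the second half of that idea only, the
following Python sketch models a flag that hardclock() checks before
touching the sched lock. Everything here is an assumption made for the
example: Machine, stop_in_flight, request_cpustop() and this hardclock()
are hypothetical names, not existing kernel interfaces, and the model
ignores the memory-ordering and race issues a real implementation would
have to deal with.

```python
# Toy model of "tell hardclock() not to schedule while a cpustop is in
# flight". Every name here is hypothetical, not a FreeBSD kernel API.
class Machine:
    def __init__(self):
        self.sched_lock_owner = None  # CPU id holding the sched lock
        self.stop_in_flight = False   # set before the stop request is sent


def request_cpustop(machine):
    """Mark the stop as in flight before any CPU is actually stopped."""
    machine.stop_in_flight = True


def hardclock(machine, cpu, spin_limit=1000):
    # Cheap check on the common path: a single flag read.
    if machine.stop_in_flight:
        return "skipped"

    # No stop pending: take the sched lock with a bounded spin.
    for _ in range(spin_limit):
        if machine.sched_lock_owner is None:
            machine.sched_lock_owner = cpu   # acquire
            machine.sched_lock_owner = None  # scheduling done, release
            return "scheduled"
    return "panic: spin lock held too long"
```

With the flag set, hardclock() returns "skipped" even if a stopped CPU
still owns the lock; without the flag, the same situation still runs
into the bounded spin and the panic string.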

Thanks,
Attilio


-- 
Peace can only be achieved by understanding - A. Einstein


