Date:      Fri, 05 Sep 2014 19:33:17 -0600
From:      Ian Lepore <ian@FreeBSD.org>
To:        John-Mark Gurney <jmg@funkthat.com>
Cc:        "freebsd-arm@freebsd.org" <freebsd-arm@freebsd.org>, ticso@cicely.de
Subject:   Re: Cubieboard: Spurious interrupt detected
Message-ID:  <1409967197.1150.339.camel@revolution.hippie.lan>
In-Reply-To: <20140906011526.GT82175@funkthat.com>
References:  <2279481.3MX4OEDuCl@quad> <20140905215702.GL3196@cicely7.cicely.de> <1409958716.1150.321.camel@revolution.hippie.lan> <CAJ-Vmo=EJVFqNnMo_dzevGvFWLSR6LVfYbYmOot1bLZbCvVMTQ@mail.gmail.com> <20140906011526.GT82175@funkthat.com>

On Fri, 2014-09-05 at 18:15 -0700, John-Mark Gurney wrote:
> Adrian Chadd wrote this message on Fri, Sep 05, 2014 at 17:44 -0700:
> > On 5 September 2014 16:11, Ian Lepore <ian@freebsd.org> wrote:
> > > On Fri, 2014-09-05 at 23:57 +0200, Bernd Walter wrote:
> > >> On Sat, Sep 06, 2014 at 01:43:23AM +0400, Maxim V FIlimonov wrote:
> > >> > And another problem: every now and then the kernel says something like that:
> > >> > Sep  5 19:22:37  kernel: Spurious interrupt detected
> > >> > Sep  5 19:22:37  kernel: Spurious interrupt detected
> > >> > Sep  5 19:23:46  last message repeated 10 times
> > >> >
> > >> > I've heard that FreeBSD happens to do that on ARM devices. What could be the
> > >> > problem here?
> > >>
> > >> Means something generates interrupts, which are not handled by a driver.
> > >> Could be the cause for your load problem too.
> > >>
> > >
> > > No, that would be stray interrupts.  Spurious interrupts happen when an
> > > interrupt is asserted, but by the time the processor asks the interrupt
> > > controller for the current active interrupt, it is no longer active.
> > >
> > > One way it can happen is when an interrupt handler writes to a device to
> > > clear a pending interrupt and that write takes a long time to complete
> > > because the device is on a slow bus, and the interrupt controller is on
> > > a faster bus.  The EOI to the controller outraces the device write that
> > > would clear the pending interrupt condition, so the processor is
> > > re-interrupted, but by the time it asks for the next active interrupt the
> > > device write has finally completed and the interrupt is no longer
> > > pending.
> > >
> > > That sequence used to happen a lot, and it was "fixed" by adding an
> > > l2cache sync (basically a "drain write buffer") just before an EOI.  You
> > > sometimes still see an occasional spurious interrupt, but it shouldn't
> > > be happening multiple times per second as seen in the logging above.
> > 
> > Hm, interesting. I remember your discussion about it on IRC. The
> > atheros code ends up working around this in the driver by doing a read
> > from the ISR after writing out bits to clear things, so the clear is
> > flushed out.
> > 
> > I wonder if we should be asking all device drivers to be doing their
> > own ISR flushing before returning from their interrupt handlers.
> 
> This is required on PCI (that you do a read to clear the posted/pending
> write)...    So, IMO, yes, all device drivers should do the proper
> clearing of their writes to the ISR...
> 

But a driver can't assume that a read is sufficient on all architectures
it may run on.  bus_space_barrier() is the right way.  Also, it's not
just that a barrier is needed before exiting an isr... if the isr uses
locking to synchronize with hardware access by the non-isr part of the
driver, then the bus space barriers are needed in conjunction with the
locking, so that, for example, the isr's usage of the hardware is truly
complete before a lock is released.  
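
[Illustration, not part of the original message.] The race described
above, and the effect of a barrier placed before the EOI, can be sketched
as a small userland model. This is not kernel code: every name below
(model_dev, model_write_status, model_barrier, model_isr,
spurious_after_isr) is hypothetical, and model_barrier() only stands in
for what a real driver would do with
bus_space_barrier(tag, handle, offset, length, BUS_SPACE_BARRIER_WRITE).
The model assumes a single posted write and a level-triggered interrupt
line that follows the device status register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a device sitting behind a slow, write-posting bus. */
struct model_dev {
	uint32_t status;       /* interrupt status register; nonzero = line asserted */
	bool     write_posted; /* a write is still sitting in the write buffer */
	uint32_t posted_value; /* value that lands when the buffer drains */
};

/* Posted write: returns at once, the device only sees the value later. */
static void
model_write_status(struct model_dev *d, uint32_t v)
{
	d->posted_value = v;
	d->write_posted = true;
}

/* Stand-in for bus_space_barrier(): force the posted write to complete. */
static void
model_barrier(struct model_dev *d)
{
	if (d->write_posted) {
		d->status = d->posted_value;
		d->write_posted = false;
	}
}

/* What the interrupt controller sees on the device's interrupt line. */
static bool
model_irq_asserted(const struct model_dev *d)
{
	return (d->status != 0);
}

/*
 * Interrupt handler: clear the pending condition, optionally drain the
 * write, then EOI.  Returns true if the line is still asserted at EOI
 * time, i.e. the controller would immediately re-interrupt the CPU.
 */
static bool
model_isr(struct model_dev *d, bool use_barrier)
{
	assert(d != NULL);
	model_write_status(d, 0);      /* ack the device (posted) */
	if (use_barrier)
		model_barrier(d);      /* make sure the ack has landed */
	return (model_irq_asserted(d)); /* sampled by the controller at EOI */
}

/*
 * Run one interrupt through the model and report whether a spurious
 * interrupt results: the CPU is re-interrupted, but by the time it asks
 * for the active interrupt the slow write has completed.
 */
static bool
spurious_after_isr(bool use_barrier)
{
	struct model_dev d = { .status = 1, .write_posted = false,
	    .posted_value = 0 };
	bool reinterrupted = model_isr(&d, use_barrier);

	model_barrier(&d);             /* the slow write finally completes */
	return (reinterrupted && !model_irq_asserted(&d));
}
```

In this model spurious_after_isr(false) reports a spurious interrupt and
spurious_after_isr(true) does not. The same ordering argument applies to
the locking case: the barrier would go before the lock is released, so
the hardware access is complete before another context can take the lock.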

Scattered amongst 10 of the roughly 240 drivers in sys/dev there are 42
calls to bus_space_barrier().  Getting all the drivers fixed will be a
big job.  That's why I was thinking along the lines of an
architecture-wide workaround with potentially a way to mark a driver as
not needing the workaround once we get the fixing underway.

-- Ian