Date:      Fri, 5 Sep 2014 23:45:57 -0700
From:      Adrian Chadd <adrian@freebsd.org>
To:        Ian Lepore <ian@freebsd.org>, Adrian Chadd <adrian@freebsd.org>,  "freebsd-arm@freebsd.org" <freebsd-arm@freebsd.org>, ticso@cicely.de
Subject:   Re: Cubieboard: Spurious interrupt detected
Message-ID:  <CAJ-VmonPttv58SGziDda--GooyLJdCcsGXCzP-UyGkO5oO2i=Q@mail.gmail.com>
In-Reply-To: <20140906045403.GU82175@funkthat.com>
References:  <2279481.3MX4OEDuCl@quad> <20140905215702.GL3196@cicely7.cicely.de> <1409958716.1150.321.camel@revolution.hippie.lan> <CAJ-Vmo=EJVFqNnMo_dzevGvFWLSR6LVfYbYmOot1bLZbCvVMTQ@mail.gmail.com> <20140906011526.GT82175@funkthat.com> <1409967197.1150.339.camel@revolution.hippie.lan> <20140906045403.GU82175@funkthat.com>

On 5 September 2014 21:54, John-Mark Gurney <jmg@funkthat.com> wrote:
> Ian Lepore wrote this message on Fri, Sep 05, 2014 at 19:33 -0600:
>> On Fri, 2014-09-05 at 18:15 -0700, John-Mark Gurney wrote:
>> > Adrian Chadd wrote this message on Fri, Sep 05, 2014 at 17:44 -0700:
>> > > On 5 September 2014 16:11, Ian Lepore <ian@freebsd.org> wrote:
>> > > > On Fri, 2014-09-05 at 23:57 +0200, Bernd Walter wrote:
>> > > >> On Sat, Sep 06, 2014 at 01:43:23AM +0400, Maxim V FIlimonov wrote:
>> > > >> > And another problem: every now and then the kernel says something like that:
>> > > >> > Sep  5 19:22:37  kernel: Spurious interrupt detected
>> > > >> > Sep  5 19:22:37  kernel: Spurious interrupt detected
>> > > >> > Sep  5 19:23:46  last message repeated 10 times
>> > > >> >
>> > > >> > I've heard that FreeBSD happens to do that on ARM devices. What could be the
>> > > >> > problem here?
>> > > >>
>> > > >> Means something generates interrupts, which are not handled by a driver.
>> > > >> Could be the cause for your load problem too.
>> > > >>
>> > > >
>> > > > No, that would be stray interrupts.  Spurious interrupts happen when an
>> > > > interrupt is asserted, but by the time the processor asks the interrupt
>> > > > controller for the current active interrupt, it is no longer active.
>> > > >
>> > > > One way it can happen is when an interrupt handler writes to a device to
>> > > > clear a pending interrupt and that write takes a long time to complete
>> > > > because the device is on a slow bus, and the interrupt controller is on
>> > > > a faster bus.  The EOI to the controller outraces the device write that
>> > > > would clear the pending interrupt condition, so the processor is
>> > > > re-interrupted, but by the time it asks for the next active interrupt the
>> > > > device write has finally completed and the interrupt is no longer
>> > > > pending.
>> > > >
>> > > > That sequence used to happen a lot, and it was "fixed" by adding an
>> > > > l2cache sync (basically a "drain write buffer") just before an EOI.  You
>> > > > sometimes still see an occasional spurious interrupt, but it shouldn't
>> > > > be happening multiple times per second as seen in the logging above.
>> > >
>> > > Hm, interesting. I remember your discussion about it on IRC. The
>> > > atheros code ends up working around this in the driver by doing a read
>> > > from the ISR after writing out bits to clear things, so the clear is
>> > > flushed out.
>> > >
>> > > I wonder if we should be asking all device drivers to be doing their
>> > > own ISR flushing before returning from their interrupt handlers.
>> >
>> > This is required on PCI (that you do a read to clear the posted/pending
>> > write)...    So, IMO, yes, all device drivers should do the proper
>> > clearing of their writes to the ISR...
>> >
>>
>> But a driver can't assume that a read is sufficient on all architectures
>> it may run on.  bus_space_barrier() is the right way.  Also, it's not
>
> Except that I don't think even on PCI a bus_space_barrier is sufficient...

It isn't.

The device itself may have FIFOs and internal busses that also need to
be flushed.

> I was just looking at i386's implementation of bus_space_barrier and
> it just does a stack access...  This won't be sufficient to clear any
> PCI bridges that may have the write still pending...

Right. The current memory barrier semantics don't guarantee that bus
and device FIFOs have actually been flushed.

So I don't think relying on the existing bus space barrier semantics
is 'right'. For interrupts, it's highly likely that we do actually need
device drivers to read back from their interrupt register, to ensure
the clearing write has reached the device before the handler returns.
That's better than causing entire L2 cache flushes.
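
To make the read-back idea concrete, here is a rough sketch. It is a
toy model in plain C, not FreeBSD driver code: struct fake_dev and all
the fake_*/handler_* names are made up for illustration. It assumes
PCI-like ordering, where a read from the device cannot pass earlier
posted writes, so the read forces the clearing write to land first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toy model (hypothetical, not FreeBSD code): register writes are
 * "posted", i.e. queued somewhere between CPU and device, and only
 * take effect once the queue is drained.
 */
#define FAKE_POSTED_MAX 8

struct fake_dev {
    uint32_t isr;                     /* pending interrupt bits in the device */
    uint32_t posted[FAKE_POSTED_MAX]; /* queued write-1-to-clear values */
    size_t   nposted;                 /* number of queued writes */
};

/* Apply all queued writes to the device; models the writes finally landing. */
static void fake_drain(struct fake_dev *d)
{
    for (size_t i = 0; i < d->nposted; i++)
        d->isr &= ~d->posted[i];
    d->nposted = 0;
}

/* Posted write: returns at once, device state is unchanged for now. */
static void fake_write_isr(struct fake_dev *d, uint32_t bits)
{
    assert(d->nposted < FAKE_POSTED_MAX);
    d->posted[d->nposted++] = bits;
}

/* Read: assumed (as on PCI) to push out earlier posted writes first. */
static uint32_t fake_read_isr(struct fake_dev *d)
{
    fake_drain(d);
    return d->isr;
}

/* What the interrupt controller would see on the device's line. */
static int fake_irq_asserted(const struct fake_dev *d)
{
    return d->isr != 0;
}

/* Clears the interrupt but returns while the write may still be in flight. */
static void handler_no_readback(struct fake_dev *d)
{
    uint32_t status = fake_read_isr(d);
    fake_write_isr(d, status);
}

/* Same, plus a read-back so the clear has landed before we return. */
static void handler_with_readback(struct fake_dev *d)
{
    uint32_t status = fake_read_isr(d);
    fake_write_isr(d, status);
    (void)fake_read_isr(d);
}
```

In this model, handler_no_readback() returns with the line still
asserted, which is the window where the EOI can outrace the clear and
produce a spurious interrupt. handler_with_readback() closes that
window. Whether a single read is enough on a given bus or device is
exactly the open question in this thread.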

The question is: can we expose this somehow as a generic device method,
so the higher bus layers can actually do something with it, or should
we just leave it to each device driver to get right?
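
Purely as a strawman for the first option, something along these lines.
Nothing here exists in FreeBSD: struct hyp_dev, the flush_posted
callback and hyp_bus_pre_eoi() are all invented names, and the real
thing would presumably be a newbus/kobj method rather than a raw
function pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-device state; none of these names exist in FreeBSD. */
struct hyp_dev;
typedef void (*hyp_flush_fn)(struct hyp_dev *);

struct hyp_dev {
    hyp_flush_fn flush_posted; /* NULL if the driver has nothing to flush */
    int          flushed;      /* toy counter so the effect is observable */
};

/*
 * Example driver callback. A real one would read back its interrupt
 * status register; here it only counts how often it was called.
 */
static void hyp_mydrv_flush(struct hyp_dev *d)
{
    d->flushed++;
}

/*
 * Hypothetical bus-layer hook, meant to run after the driver's handler
 * and before the EOI. Returns 1 if a flush callback was invoked, 0 if
 * the driver did not provide one.
 */
static int hyp_bus_pre_eoi(struct hyp_dev *d)
{
    assert(d != NULL);
    if (d->flush_posted == NULL)
        return 0;
    d->flush_posted(d);
    return 1;
}
```

The point of doing it this way would be that the interrupt code could
call the hook itself before the EOI, instead of trusting every handler
to remember the read-back. The cost is one more method that every
driver has to implement correctly anyway.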

(Also - do any of the FreeBSD device driver books or the handbook mention this?)



-a


