From: John-Mark Gurney
To: Ian Lepore
Cc: "freebsd-arm@freebsd.org", ticso@cicely.de
Date: Fri, 5 Sep 2014 21:54:03 -0700
Subject: Re: Cubieboard: Spurious interrupt detected
Message-ID: <20140906045403.GU82175@funkthat.com>
In-Reply-To: <1409967197.1150.339.camel@revolution.hippie.lan>

Ian Lepore wrote this message on Fri, Sep 05, 2014 at 19:33 -0600:
> On Fri, 2014-09-05 at 18:15 -0700, John-Mark Gurney wrote:
> > Adrian Chadd wrote this message on Fri, Sep 05, 2014 at 17:44 -0700:
> > > On 5 September 2014 16:11, Ian Lepore wrote:
> > > > On Fri, 2014-09-05 at 23:57 +0200, Bernd Walter wrote:
> > > >> On Sat, Sep 06, 2014 at 01:43:23AM +0400, Maxim V FIlimonov wrote:
> > > >> > And another problem: every now and then the kernel says something like that:
> > > >> > Sep 5 19:22:37 kernel: Spurious interrupt detected
> > > >> > Sep 5 19:22:37 kernel: Spurious interrupt detected
> > > >> > Sep 5 19:23:46 last message repeated 10 times
> > > >> >
> > > >> > I've heard that FreeBSD happens to do that on ARM devices. What could be the
> > > >> > problem here?
> > > >>
> > > >> Means something generates interrupts, which are not handled by a driver.
> > > >> Could be the cause for your load problem too.
> > > >>
> > > >
> > > > No, that would be stray interrupts. Spurious interrupts happen when an
> > > > interrupt is asserted, but by time the processor asks the interrupt
> > > > controller for the current active interrupt, it is no longer active.
> > > >
> > > > One way it can happen is when an interrupt handler writes to a device to
> > > > clear a pending interrupt and that write takes a long time to complete
> > > > because the device is on a slow bus, and the interrupt controller is on
> > > > a faster bus. The EOI to the controller outraces the device write that
> > > > would clear the pending interrupt condition, so the processor is
> > > > re-interrupted, but by time it asks for the next active interrupt the
> > > > device write has finally completed and the interrupt is no longer
> > > > pending.
> > > >
> > > > That sequence used to happen a lot, and it was "fixed" by adding an
> > > > l2cache sync (basically a "drain write buffer") just before an EOI. You
> > > > sometimes still see an occasional spurious interrupt, but it shouldn't
> > > > be happening multiple times per second as seen in the logging above.
> > >
> > > Hm, interesting. I remember your discussion about it on IRC. The
> > > atheros code ends up working around this in the driver by doing a read
> > > from the ISR after writing out bits to clear things, so the clear is
> > > flushed out.
> > >
> > > I wonder if we should be asking all device drivers to be doing their
> > > own ISR flushing before returning from their interrupt handlers.
> >
> > This is required on PCI (that you do a read to clear the posted/pending
> > write)... So, IMO, yes, all device drivers should do the proper
> > clearing of their writes to the ISR...
> >
>
> But a driver can't assume that a read is sufficient on all architectures
> it may run on. bus_space_barrier() is the right way. Also, it's not

Except that I don't think even on PCI a bus_space_barrier is sufficient...
I was just looking at i386's implementation of bus_space_barrier and it
just does a stack access... This won't be sufficient to clear any PCI
bridges that may have the write still pending...
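
To make the race described in the quoted text concrete, here is a small
userland model of it. This is only a sketch under stated assumptions, not
FreeBSD code: every name in it (sim_dev, sim_pic, sim_write_clear and so on)
is hypothetical, and the "posted write" is simulated with a flag rather than
real bus hardware. It shows the EOI outrunning the write that clears the
device's status register, and how a read-back of that register forces the
write to land first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device model: nonzero status means the IRQ line is asserted. */
struct sim_dev {
	uint32_t status;       /* interrupt status register */
	bool     write_posted; /* a "clear" write is still buffered in a bridge */
	uint32_t posted_mask;  /* status bits the buffered write will clear */
};

/* Hypothetical interrupt controller model. */
struct sim_pic {
	bool irq_latched;      /* controller will interrupt the CPU again */
};

/* Posted write: returns at once, the device has not seen it yet. */
static void
sim_write_clear(struct sim_dev *d, uint32_t mask)
{
	assert(mask != 0);
	d->write_posted = true;
	d->posted_mask |= mask;
}

/* The slow bus finally delivers any buffered write to the device. */
static void
sim_drain(struct sim_dev *d)
{
	if (d->write_posted) {
		d->status &= ~d->posted_mask;
		d->posted_mask = 0;
		d->write_posted = false;
	}
}

/* A read may not pass earlier posted writes, so it drains them first. */
static uint32_t
sim_read_status(struct sim_dev *d)
{
	sim_drain(d);
	return (d->status);
}

/* EOI: the controller re-samples the line and re-latches if still asserted. */
static void
sim_eoi(struct sim_pic *p, const struct sim_dev *d)
{
	p->irq_latched = (d->status != 0);
}

/*
 * Run one interrupt through the model.  Returns true if a spurious
 * interrupt follows: the controller re-interrupts the CPU, but by the
 * time the CPU looks, nothing is pending any more.
 */
static bool
sim_handle_irq(bool flush_with_readback)
{
	struct sim_dev dev = { .status = 0x1 };
	struct sim_pic pic = { .irq_latched = true };

	sim_write_clear(&dev, 0x1);          /* ISR acks the device */
	if (flush_with_readback)
		(void)sim_read_status(&dev); /* force the write to land */
	sim_eoi(&pic, &dev);                 /* may outrun the posted write */
	sim_drain(&dev);                     /* late arrival; no-op if flushed */

	return (pic.irq_latched && dev.status == 0);
}
```

In this model sim_handle_irq(false) reports a spurious interrupt and
sim_handle_irq(true) does not. In a real driver the read-back would be
something like a bus_space_read_4() of the same status register; whether
that, a bus_space_barrier(), or both are needed on a given bus is exactly
the open question in this thread.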

There's also the issue that if __GNUCLIKE_ASM is not defined, the code
will compile w/o ANY barrier, not even a compiler_membar... We should
probably add a #else #error please add your compiler's equivalent...

> just that a barrier is needed before exiting an isr... if the isr uses
> locking to synchronize with hardware access by the non-isr part of the
> driver, then the bus space barriers are needed in conjunction with the
> locking, so that, for example, the isr's usage of the hardware is truly
> complete before a lock is released.
>
> Scattered amongst 10 of the roughly 240 drivers in sys/dev there are 42
> calls to bus_space_barrier(). Getting all the drivers fixed will be a
> big job. That's why I was thinking along the lines of an
> architecture-wide workaround with potentially a way to mark a driver as
> not needing the workaround once we get the fixing underway.

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
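
The "#else #error" idea from the message could look roughly like the
following. This is a minimal sketch, not the actual i386 bus.h code: the
names example_compiler_membar and example_write_then_barrier are
hypothetical, and the sketch tests __GNUC__ / __clang__ as a stand-in for
the __GNUCLIKE_ASM macro mentioned in the message, so that it builds
outside the FreeBSD tree. The point is only that an unknown compiler gets
a build failure instead of a barrier that silently compiles to nothing.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__GNUC__) || defined(__clang__)
/* Compiler-level barrier: stops the compiler from reordering memory accesses. */
static inline void
example_compiler_membar(void)
{
	__asm__ __volatile__("" : : : "memory");
}
#else
/* Fail the build rather than compile a barrier with an empty body. */
#error "example_compiler_membar: please add your compiler's equivalent"
#endif

/* Hypothetical helper: store to a register, then fence the compiler. */
static inline void
example_write_then_barrier(volatile uint32_t *reg, uint32_t val)
{
	*reg = val;
	example_compiler_membar();
}
```

Note that this only constrains the compiler. It does nothing about writes
still pending in a PCI bridge, which is the separate concern raised earlier
in the message.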