Date:      Mon, 23 Jan 2006 20:25:11 -0600
From:      Craig Boston <craig@tobuj.gank.org>
To:        John Baldwin <jhb@freebsd.org>
Cc:        freebsd-hackers@freebsd.org, Scott Long <scottl@samsco.org>
Subject:   Re: Weird PCI interrupt delivery problem (resolution, sort of)
Message-ID:  <20060124022511.GA99552@nowhere>
In-Reply-To: <200601201542.23464.jhb@freebsd.org>
References:  <20060120014307.GA3118@nowhere> <43D07273.6030804@samsco.org> <20060120152731.GA5660@nowhere> <200601201542.23464.jhb@freebsd.org>

On Fri, Jan 20, 2006 at 03:42:21PM -0500, John Baldwin wrote:
> On Thu, Jan 19, 2006 at 10:17:39PM -0700, Scott Long wrote:
> > This points to a bus coherency problem.  I wonder if your BIOS is
> > incorrectly setting the memory region of the apics as cachable.  You'll
> > want to bug Baldwin about this.
> 
> Hmm, well, you can actually try the PAT patch if you are feeling brave as it 
> maps all devices (including APICs) as uncacheable.

Tried the updated PAT patch (with s/pmap_unmapbios/pmap_unmap_bios/ to
get ACPI to compile).  Unfortunately if it is a caching problem, PAT
isn't able to fix it.  Same result as stock kernel -- interrupts stop
arriving after a dozen or so.  AFAICT the local APIC is the only
memory-mapped I/O region that seems to be problematic.

Instead of writing the value twice, I also tried inserting an
__asm("nop") before the write with no effect.  Also, a single write to
an unrelated area doesn't help:

+static volatile int dummyeoi;
+
 lapic_eoi(void)
 {

+	dummyeoi = 1;
 	lapic->eoi = 0;
+	dummyeoi = 2;
 }

I'm _reasonably_ certain that marking dummyeoi volatile and leaving it
uninitialized will prevent gcc from optimizing that out.  Forcing R/W
cycles (++dummyeoi) before and after doesn't work either.
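
For reference, a minimal user-space sketch of the volatile reasoning
above.  It is not the kernel code: fake_eoi_reg and fake_lapic_eoi are
hypothetical stand-ins for lapic->eoi and lapic_eoi(), and an ordinary
variable takes the place of the memory-mapped register.  The assumption
being illustrated is the C rule that every access to a volatile object
is an observable side effect, so the compiler may not drop either
dummyeoi store or merge the two into one.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the memory-mapped EOI register.  On real
 * hardware this would be a fixed MMIO address, not a plain variable.
 */
static volatile uint32_t fake_eoi_reg = 0xffffffffu;

/* Plays the same role as dummyeoi in the diff above. */
static volatile int dummyeoi;

void
fake_lapic_eoi(void)
{

	dummyeoi = 1;		/* volatile store: the compiler must emit it */
	fake_eoi_reg = 0;	/* the store under test */
	dummyeoi = 2;		/* volatile store: not merged with the first */
}
```

Building this with optimization and reading the generated assembly
(e.g. cc -O2 -S) is one way to check that all three stores survive.
Note that volatile only constrains the compiler.  It says nothing about
how the CPU or the bus orders or delays the writes, so it cannot rule
out a timing problem by itself.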

A DELAY(1) before the lapic->eoi write does the trick, but DELAY does
lots of complicated things so I don't know how useful of a data point
that is.

I'm probably missing something, but if bad cache behavior was causing
writes to the lapic EOI register to not always take effect, wouldn't the
_next_ irq (even if it's a different line) cause the one that's
currently pending to be acknowledged?

Craig
