Skip site navigation (1)Skip section navigation (2)
Date:      Sat, 6 Sep 2014 23:15:09 -0700
From:      John-Mark Gurney <jmg@funkthat.com>
To:        Warner Losh <imp@bsdimp.com>
Cc:        "freebsd-arm@freebsd.org" <freebsd-arm@freebsd.org>, ticso@cicely.de, Ian Lepore <ian@freebsd.org>
Subject:   Re: Cubieboard: Spurious interrupt detected
Message-ID:  <20140907061508.GY82175@funkthat.com>
In-Reply-To: <AEEF4E90-27E9-43C7-8437-C593BE886846@bsdimp.com>
References:  <2279481.3MX4OEDuCl@quad> <20140905215702.GL3196@cicely7.cicely.de> <1409958716.1150.321.camel@revolution.hippie.lan> <CAJ-Vmo=EJVFqNnMo_dzevGvFWLSR6LVfYbYmOot1bLZbCvVMTQ@mail.gmail.com> <20140906011526.GT82175@funkthat.com> <1409967197.1150.339.camel@revolution.hippie.lan> <20140906045403.GU82175@funkthat.com> <CAJ-VmonPttv58SGziDda--GooyLJdCcsGXCzP-UyGkO5oO2i=Q@mail.gmail.com> <1410015268.1150.351.camel@revolution.hippie.lan> <AEEF4E90-27E9-43C7-8437-C593BE886846@bsdimp.com>

next in thread | previous in thread | raw e-mail | index | archive | help
Warner Losh wrote this message on Sat, Sep 06, 2014 at 14:20 -0600:
> 
> On Sep 6, 2014, at 8:54 AM, Ian Lepore <ian@FreeBSD.org> wrote:
> 
> > On Fri, 2014-09-05 at 23:45 -0700, Adrian Chadd wrote:
> >> The device itself may have FIFOs and internal busses that also need to
> >> be flushed.
> >> 
> > 
> > The question isn't whether or not it's sufficient, because it's
> > necessary.  The device driver knows what its hardware requirements are
> > and should meet them.  It doesn't not know what its parent bus
> > requirements are, and that's why it must call bus_space_barrier() to
> > handle architecture needs above the level of the device itself.
> 
> Yea, all that bus_space_barrier() does is say ?We?ve made sure that
> the CPU and all other bridges between the CPU and the device have
> any buffered writes pushed to the device.? If the device has additional FIFOs
> and other things, that?s 100% on the device writer.

Except that doesn't happen right now...  bus_space_barrier on amd64
does a locked stack access, but that won't cause any write buffers
on bridges to flush...

> >>> I was just looking at i386's implementation of bus_space_barrier and
> >>> it just does a stack access...  This won't be sufficient to clear any
> >>> PCI bridges that may have the write still pending...
> >> 
> >> Right. The memory barrier semantics right now don't at all guarantee
> >> that bus and device FIFOs have actually been flushed.
> >> 
> > The fact that some architectures don't implement bus_space_barrier() in
> > a way that's useful for that architecture is just a bug.  It doesn't
> > change the fact that bus_space_barrier() is currently our only defined
> > MI interface to barriers in device space.

I agree...  Now if you convince people on the correct way to fix
bus_space_barrier to do the correct thing, then I don't have a problem
having drivers use bus_space_barrier...  My only issue is that as things
are now, adding bus_space_barrier will not fix PCI devices...

> Agreed. But PCI defines that reads flush out all prior writes.
> 
> >> So I don't think doing it using the existing bus space barrier
> >> semantics is 'right'. For interrupts, it's highly likely that we do
> >> actually need device drivers to read from their interrupt register to
> >> ensure the update has been posted before returning. That's better than
> >> causing entire L2 cache flushes.
> >> 
> > 
> > Where did you see code that does an "entire L2 cache flush"?  You
> > didn't, you just saw a function name and made assumptions about what it
> > does.  The fact is, it does what is necessary for the architecture.  It
> > also happens to do what a write-then-read does on armv6, but that's
> > exactly the sort of assumption that should NOT be written into MI
> > code.  
> 
> Yea, a bus barrier just means a temporal ordering. The exact strength
> of that guarantee is a bit fuzzy, but generally it means we know things
> are done. L2 is usually not an issue, because we have the devices mapped
> uncached.
> 
> As for reading the ISR, that is device dependent. When using MSI things are
> different because the status is pushed to the host and you get the info from reading
> the host memory. Ideally, you?d want to write to ack them without having to do
> a read over PCIe, but even that?s not always required (and on such devices
> they would correct without bus barriers). Newer devices have been designed
> to avoid round-trips over the PCIe bus and having semantics that matter.
> 
> >> Question is - can we expose this somehow as a generic device method,
> >> so the higher bus layers can actually do something with it, or should
> >> we just leave it to device drivers to correctly do?
> >> 
> > 
> > In what way is that not EXACTLY whas bus_space_barrier() is defined to
> > do?
> > 
> > I've got to say, I don't understand all this pushback.  We have one
> > function defined already that's intended to be the machine-independant
> > way to ensure that any previous access to hardware using bus_space reads
> > and writes has completed, so why all this argument that it is not the
> > function to use?

The argument isn't that it's not the function to use, the argument is
that the function is broken, or doesn't do what it says in the docs...

Actually, the docs don't guarantee that the writes will be completed
after the call, the docs say:
     All of the specified type(s) of operation which are done to the region
     before the barrier operation are guaranteed to complete before any of the
     specified type(s) of operation done after the barrier.

So that means a following read will return correct results... so, per
bus_space_barrier docs, all ISRs should probably be:

	write to clear ISR
	bus_space_barrier
	read to clear ISR so that the write w/ bus_space_barrier is
	  known to complete

So, sounds like we still need the read even w/ the bus_space_barrier.

Now it could be our docs are wrong, but in that case, they should be
fixed...

> I agree with Ian here. If we?ve messed up, we need to fix that. But for the most
> part, devices that are in embedded land actually do the right thing (more often
> than not). If we?re doing something wrong on coherent architectures that accidentally
> works, we should fix that.
> 
> I may disagree with Ian about the need for some of the flushing based on the notion
> that we should fix the drivers. I feel it just makes the problem persist. Things should
> be more visibly broken. Ian thinks things should work, and I have a hard time arguing
> with that position even if I disagree :).

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20140907061508.GY82175>