From: Ian Lepore
To: Bruce Evans
Cc: Michal Meloun, src-committers@freebsd.org, Marius Strobl, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject: Re: svn commit: r295557 - head/sys/dev/uart
Date: Mon, 15 Feb 2016 17:33:22 -0700
Message-ID: <1455582802.12873.49.camel@freebsd.org>
In-Reply-To: <20160216103914.F1693@besplex.bde.org>

On Tue, 2016-02-16 at 11:01 +1100, Bruce Evans wrote:
> On Mon, 15 Feb 2016, Ian Lepore wrote:
>
> > On Tue, 2016-02-16 at 09:28 +1100, Bruce Evans wrote:
> >> On Mon, 15 Feb 2016, Michal Meloun wrote:
> >>
> >>> [...]
> >>> Please note that ARM architecture does not have vectored interrupts,
> >>> CPU must read actual interrupt source from external interrupt
> >>> controller (GIC) register. This register contains a predefined value
> >>> if none of interrupts are active.
> >>>
> >>> 1 - CPU1: enters ns8250_bus_transmit() and sets IER_ETXRDY.
> >>> 2 - HW: UART interrupt is asserted, processed by GIC and signaled
> >>>     to CPU2.
> >>> 3 - CPU2: enters interrupt service.
> >>
> >> It is blocked by uart_lock(), right?
> >>
> >>> 4 - CPU1: writes character into REG_DATA register.
> >>> 5 - HW: UART clears its interrupt request
> >>> 6 - CPU2: reads interrupt source register. No active interrupt is
> >>>     found, spurious interrupt is signaled, and CPU leaves interrupted
> >>>     state.
> >>> 7 - CPU1: executes uart_barrier(). This function is not empty on ARM,
> >>>     and can be slow in some cases.
> >>
> >> It is not empty even on x86, although it probably should be.
> >>
> >> BTW, if arm needs the barrier, then how does it work with
> >> bus_space_barrier() referenced in just 25 files in all of /sys/dev?
> >
> > With a hack, of course.  In the arm interrupt-controller drivers we
> > always call bus_space_barrier() right before doing an EOI.  It's not a
> > 100% solution, but in practice it seems to work pretty well.
>
> I thought about the x86 behaviour a bit more and now see that it does
> need barriers but not the ones given by bus_space_barrier().  All (?)
> interrupt handlers use mutexes (if not driver ones, then higher-level
> ones).  These might give stronger or different ordering than given by
> bus_space_barrier().  On x86, they use the same memory bus lock as
> the bus_space_barrier().  This is needed to give ordering across
> CPUs.  But for accessing a single device, you only need program order
> for a single CPU.  This is automatic on x86 provided a mutex is used
> to prevent other CPUs accessing the same device.  And if you don't use
> a mutex, then bus_space_barrier() cannot give the necessary ordering
> since it cannot prevent other CPUs interfering.
>
> So how does bus_space_barrier() before EOI make much difference?  It
> doesn't affect the order for a bunch of accesses on a single CPU.
> It must do more than a mutex to do something good across CPUs.
> Arguably, it is a bug in mutexes if they don't give synchronization
> for device memory.
>
> > ...
> > The hack code does a drain-write-buffer which doesn't g'tee that the
> > slow peripheral write has made it all the way to the device, but it
> > does at least g'tee that the write to the bus the peripheral is on has
> > been posted and ack'd by any bus<->bus bridge, and that seems to be
> > good enough in practice.  (If there were multiple bridged busses
> > downstream it probably wouldn't be, but so far things aren't that
> > complicated inside the socs we support.)
>
> Hmm, so there is some automatic strong ordering but mutexes don't
> work for device memory?
>

I guess you keep mentioning mutexes because on x86 their implementation
uses some of the same instructions that are involved in bus_space
barriers on x86?  Otherwise I can't see what they have to do with
anything related to the spurious interrupts that happen on arm.  (You
also mentioned multiple CPUs, which is not a requirement for this
trouble on arm; it'll happen with a single core.)

The piece of info you're missing might be the fact that memory-mapped
device registers on arm are mapped with the Device attribute which
gives stronger ordering than Normal memory.  In particular, writes are
in order and not combined, but they are buffered.  In some designs
there are multiple buffers, so there can be multiple writes that
haven't reached the hardware yet.  A read from the same region will
stall until all writes to that region are done, and there is also an
instruction that specifically forces out the buffers and stalls until
they're empty.

Without doing the drain-write-buffer (or a device read) after each
write, the only g'tee you'd get is that each device sees the writes
directed at it in the order they were issued.  With devices A and B,
you could write a sequence of A1 B1 A2 B2 A3 B3 and they could arrive
at the devices as A1 A2 B1 B2 A3 B3, or any other permutation, as long
as device A sees 123 and device B sees 123.

So on arm the need for barriers arises primarily when two different
devices interact with each other in some way and it matters that a
series of interleaved writes reaches the devices in the same relative
order they were issued by the cpu.  That condition mostly comes up only
in terms of the PIC interacting with basically every other device.

I expect trouble to show up any time now as we start implementing DMA
drivers in socs that have generic DMA engines that are only loosely
coupled to the devices they're moving data for.  That just seems like
another place where a single driver is coordinating the actions of two
different pieces of hardware that may be on different busses, and it's
ripe for the lack of barriers to cause rare or intermittent failures.

-- Ian
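
The A/B ordering example above can be illustrated with a toy model in
plain C.  This is only a sketch under stated assumptions, not kernel or
driver code: struct bus_model, post_write(), deliver_one() and
drain_all() are hypothetical names made up for the illustration, there
is one write buffer per device, and drain_all() is a simplification in
which every buffered write reaches its device (the real
drain-write-buffer only gets writes as far as being posted and ack'd on
the peripheral's bus).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NDEV 2   /* hypothetical devices: 0 is 'A', 1 is 'B' */
#define QMAX 16  /* capacity of each modelled write buffer */

/* One FIFO write buffer per device: writes stay in order per device. */
struct write_buffer {
	int pending[QMAX];
	size_t head, tail;
};

struct bus_model {
	struct write_buffer wb[NDEV];
	char log[64];   /* arrival order at the devices, e.g. "A1B1" */
	size_t loglen;
};

static void
bus_init(struct bus_model *b)
{
	memset(b, 0, sizeof(*b));
}

/* The CPU issues a write: it is buffered, not yet seen by the device. */
static void
post_write(struct bus_model *b, int dev, int val)
{
	assert(dev >= 0 && dev < NDEV);
	assert(val >= 0 && val <= 9);
	assert(b->wb[dev].tail < QMAX);
	b->wb[dev].pending[b->wb[dev].tail++] = val;
}

/* The oldest buffered write for dev arrives; returns 0 if none pending. */
static int
deliver_one(struct bus_model *b, int dev)
{
	struct write_buffer *w = &b->wb[dev];

	if (w->head == w->tail)
		return (0);
	assert(b->loglen + 2 < sizeof(b->log));
	b->log[b->loglen++] = (char)('A' + dev);
	b->log[b->loglen++] = (char)('0' + w->pending[w->head++]);
	b->log[b->loglen] = '\0';
	return (1);
}

/* Simplified drain-write-buffer: stall until every buffer is empty. */
static void
drain_all(struct bus_model *b)
{
	for (int dev = 0; dev < NDEV; dev++)
		while (deliver_one(b, dev))
			;
}

/* Issue A1 B1 A2 B2 A3 B3 with no barriers; A's buffer drains faster. */
static const char *
run_unordered(struct bus_model *b)
{
	bus_init(b);
	for (int i = 1; i <= 3; i++) {
		post_write(b, 0, i);
		post_write(b, 1, i);
	}
	deliver_one(b, 0);
	deliver_one(b, 0);
	deliver_one(b, 1);
	deliver_one(b, 1);
	drain_all(b);
	return (b->log);
}

/* Same sequence, but drain after every write to force issue order. */
static const char *
run_with_drain(struct bus_model *b)
{
	bus_init(b);
	for (int i = 1; i <= 3; i++) {
		post_write(b, 0, i);
		drain_all(b);
		post_write(b, 1, i);
		drain_all(b);
	}
	return (b->log);
}
```

Calling run_unordered() on a struct bus_model yields the arrival order
A1 A2 B1 B2 A3 B3 from the example, where each device still sees 1 2 3
but the interleaving differs from the issue order.  run_with_drain()
keeps the arrival order equal to the issue order, at the cost of a
stall after every write.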