From owner-freebsd-arch@freebsd.org Sun Jul 31 18:43:26 2016 Return-Path: Delivered-To: freebsd-arch@mailman.ysv.freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:1900:2254:206a::19:1]) by mailman.ysv.freebsd.org (Postfix) with ESMTP id CF766BAA253; Sun, 31 Jul 2016 18:43:26 +0000 (UTC) (envelope-from nwhitehorn@freebsd.org) Received: from c.mail.sonic.net (c.mail.sonic.net [64.142.111.80]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id B0C6E173F; Sun, 31 Jul 2016 18:43:26 +0000 (UTC) (envelope-from nwhitehorn@freebsd.org) Received: from zeppelin.tachypleus.net (75-101-50-44.static.sonic.net [75.101.50.44]) (authenticated bits=0) by c.mail.sonic.net (8.15.1/8.15.1) with ESMTPSA id u6VIhN9G004097 (version=TLSv1.2 cipher=DHE-RSA-AES128-SHA bits=128 verify=NOT); Sun, 31 Jul 2016 11:43:23 -0700 Subject: Re: INTRNG (Was: svn commit: r301453....) To: Michal Meloun References: <201606051620.u55GKD5S066398@repo.freebsd.org> <57950005.6070403@FreeBSD.org> <57961549.4020105@FreeBSD.org> <57976867.6080705@FreeBSD.org> <5798E104.5020104@FreeBSD.org> <579A25BB.8070206@FreeBSD.org> <30790e40-58b4-3371-c0f0-b7545571f389@freebsd.org> <579AFFC5.1040005@FreeBSD.org> <579CD355.1050203@FreeBSD.org> <460fa0b3-ddb7-6247-2412-3d75a589d5e7@freebsd.org> <579CF7C8.1040302@FreeBSD.org> <24107713-6d50-c21d-ccf1-7dbdb36cc484@freebsd.org> <579E1BE2.7020500@FreeBSD.org> Cc: "freebsd-arm@freebsd.org" , Svatopluk Kraus , "freebsd-arch@freebsd.org" From: Nathan Whitehorn Message-ID: <7f053bb8-ab03-e46c-1c72-d757348e4e54@freebsd.org> Date: Sun, 31 Jul 2016 11:43:23 -0700 User-Agent: Mozilla/5.0 (X11; FreeBSD amd64; rv:45.0) Gecko/20100101 Thunderbird/45.1.0 MIME-Version: 1.0 In-Reply-To: <579E1BE2.7020500@FreeBSD.org> Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit X-Sonic-CAuth: UmFuZG9tSVZvVQZDE005RznR1LrYjBmC0ls8/5+kiod/s4Zcd7U+W9XBY02KiwpryW8xl2M4AQJAJYZCOuOJsr3Nh3yxbD936UULR3sDrWw= X-Sonic-ID: C;fGfMqE5X5hG4haDx2xNB0g== M;uHosqU5X5hG4haDx2xNB0g== X-Spam-Flag: No X-Sonic-Spam-Details: 0.0/5.0 by cerberusd X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.22 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 31 Jul 2016 18:43:26 -0000 On 07/31/16 08:40, Michal Meloun wrote: > Dne 31.07.2016 v 8:11 Nathan Whitehorn napsal(a): > [snip] >>> The current API is certainly a bit of a hack and I would be pleased to >>>> see it go; but the replacement needs to be more functional and more >>>> robust. As far as I can tell, I literally *cannot* move PowerPC over >>>> to this architecture. >>> Yes, at this time, I agree. If you can accept enhancement of >>> (owf_)bus_map_intr() then we can move to next, significantly less >>> hackish, state. >> OK, sure. I wrote some code this afternoon (attached) to sketch this. >> It does not compile or run or anything useful, but it should >> illustrate the important parts of what I think is an API that should >> work for everyone. I'm happy to finish it if it looks reasonable -- I >> may in any case just to replace PowerPC's increasingly teetering >> intr_machdep.c. >> >> The OF part is essentially unchanged from how it currently works: >> there's a method that you pass the interrupt specifier and interrupt >> parent to, and it spits back an IRQ that means that combination of >> things in perpetuity. OFW_BUS_MAP_INTR() in nexus.c, with its current >> API, can be implemented in terms of it in one line. Whenever the >> relevant PIC attaches, it is told to use the given IRQ to refer to >> that opaque bytestring. >> >> I've added a set of equivalent functions for ACPI that take the >> contents of an ACPI interrupt specifier and do the same things. It >> tries to have the IRQ be human-meaningful, if possible, by matching >> the ACPI interrupt number. Having separate functions seemed a little >> cleaner than exposing the enums and unions as part of the KPI, but I'm >> not wedded to it -- this is just an example. >> >> PICs register (and deregister) themselves with either an OF phandle or >> an ACPI resource string or (god help us) both as needed. The PICs have >> corresponding methods for interpreting various kinds of interrupt >> specifiers. >> >> Whether it allows bus_alloc_resource(), bus_setup_intr(), etc. to >> succeed before the PICs are attached is gated on an #ifdef. That can >> probably be off by default on ARM. A PowerPC version of this code >> would have to keep it on until various bus drivers are fixed. >> >> Here's the general flow: >> - Parent bus enumerates children, maps IRQs as needed, adds to >> resource list. The struct resources involved aren't special (just a >> single number), so PCIB_ROUTE_INTERRUPT(), MSIs, etc. are trivial to >> implement (and already are implemented, in general, for OF/FDT drivers). >> - bus_alloc_resource() does nothing special >> - bus_setup_intr() calls into the PIC and does what is needed, as in >> r301453 (to within that #ifdef) >> >> This should keep all the best pieces of everything: >> - ACPI and OF are supported together, and easy to extend for other >> types of mapping technologies without breaking either KBI or KPI (just >> add new functions) >> - No APIs change for existing Open Firmware code >> - The support code can live entirely in machine-dependent directories, >> with no code changes at all required in kern/ or dev/ofw/ >> - bus_setup_intr() and friends behave as in r301453 and succeed or >> fail for real >> - PCI code is not an issue >> - No disconnected interrupt state maintained in mapping table (unless >> the transitional #ifdef is on) and the mapping table content is only >> larger than r301453 by having a copy of the original interrupt specifier. >> >> If and when we manage to fix the kernel APIs to allow non-scalar >> interrupt specifiers, this should also be easy to gradually >> transmogrify to support that case since there exists a 1:1 mapping of >> scalar IRQs to rich data and the control flow would be identical: >> interrupt specifiers are read at enumeration time and a resulting >> handle -- struct resource or scalar IRQ -- used afterward to refer to it. >> > Nice, nice, i think that we are very close at this point. > The problem with the above workflow is that maps IRQ function is called > at parent bus enumeration time, generally at BUS_PASS_BUS pass. And at > this time, PICs are not exist. > Also, 'static struct intr_irqsrc *irq_sources[NIRQ];' is not a mapping > table, it doesn't provide this functionality. > But yes, i understand that mapping table is important and necessary for you. > > So, if i can extend our concept a little bit, then: > We can add new table (map_data_tbl for example) that holds a copy of > the original interrupt specifier, index to irq_sources table and > probably some flags. > And we can modify the above workflow to: > - Parent bus enumerates children, allocates entries from map_data_tbl, > adds to resource list. The struct resources involved aren't special > (just a single number), so PCIB_ROUTE_INTERRUPT(), MSIs, etc. are > trivial to implement (and already are implemented, in general, for > OF/FDT drivers). Index to map_data_tbl is used as resource ID. > - bus_alloc_resource() sets 'ALLOCATED' in map_data_tbl flags field, > *maps IRQs* and stores result in 'index' field. > - bus_setup_intr() sets 'SETUP_DONE' in map_data_tbl flags field, calls > into the PIC and does what is needed, as in r301453. (to within that > #ifdef) > > And, in PPC case, newly attached PIC scans whole map_data_tbl table, > finds his entries and makes what needs. > > Does that make sense? > I hope that postponing of map IRQ call is not a problem for PPC, > everything else looks easy. > Moreover, this additional level of indirection also solves all my > 'hypothetical corner cases' with N:1 mappings (between original > interrupt specifier and real interrupt number). Yes, I think we are converging. My qualm about bus_alloc_resource() is twofold: 1. Traditionally, with interrupts, bus_alloc_resource() has no side effects and I'm not sure it propagates cleanly down the tree in all cases. 2. T. , x n nn b n /here are a few bus APIs (bus_config_intr() comes to mind) that use raw IRQ IDs and, so far as I know, can be, and sometimes are called before bus_alloc_resource(). If the PIC doesn't know about the IRQ yet when bus_config_intr() (etc.) is called, things will just break. So you would need to make sure that any bus method handling a resource ID causes it to be mapped on the PIC at that time. It's OK if that happens in bus_alloc_resource() so long as bus_config_intr(), bus_setup_intr(), etc. cause that to happen if it hasn't happened yet -- I don't care when these calls are made to the PIC driver so long as *what* calls will be made is set at enumeration time. I am also totally fine with having, on ARM, bus_config_intr(), bus_setup_intr() etc. fail if the PIC hasn't attached yet (On PowerPC, we can't do that yet, but after this conversation, I regard that as a bug and would fix that later), as well as delaying setup on the PIC to the first time any bus driver actually tries to *use* the interrupt (alloc_resource(), setup_intr(), config_intr(), whatever) rather than when the interrupt is originally allocated. ------------ The following is a large parenthesis ------------------- One other possible route here that would also work well would be to make ofwbus.c MD (it's a trivial piece of code anyway, so we don't gain a lot by sharing it) and implement ofw_bus_map_intr() locally as an ofwbus bus method. Then you could have the mapping table stored in ofwbus's softc -- the API was designed for this initially. You would need MD extensions for doing PIC registration there (which is fine), but that segregates all the OFW-specific information into OFW-specific code and would let the bus methods and the OFW interrupt mapping table interact cleanly in the same place. This still preserves the pre-r301453 API, makes PowerPC work, and maybe address's skra@'s concern about extensibility and letting core interrupt code know about FDT (or ACPI). I'd be happy to mock this up as well if you think it's a good approach. So you would have something in arm/arm/ofwbus.c like (pseudo-code, missing unlocks, etc.): struct ofwbus_softc { (list of PICs) (list of IRQ mapping from # to specifier, iparent, and device_t) } static int ofw_bus_map_if_unmapped(struct ofwbus_softc *sc, int irq) { struct ofw_bus_irq_mapping *i; struct ofw_bus_pic *j; int err = 0; mtx_lock(sc->ofw_bus_irqmap_lock); LIST_FOREACH(i, sc->irq_mappings) { if (i->irq == irq) break; } if (i == NULL) return (ENXIO); /* PANIC? */ if (i->dev != NULL) return (0); /* Mapped already */ LIST_FOREACH(j, sc->ofw_bus_registered_pics) { if (j->phandle == i->phandle) { arm_intr_set_pic(irq, j->dev); err = PIC_SET_OFW_ISPEC(j->dev, i->ispec, i->icells); return (err); } } return (ENXIO); /* No matching PIC. Panic? */ } static int ofw_bus_setup_intr(device_t dev, device_t child, int irq, blah...) { int err; /* This section at the beginning of anything allocating/using/modifying interrupts */ err = ofw_bus_map_if_unmapped(sc, irq); if (err != 0) /* Explode if the PIC isn't online yet return (err); bus_setup_intr(dev, child, irq, blah...); /* onward to nexus */ } This has the following features: - Existing OFW API and semantics unchanged - As such, PowerPC, PCI, etc. work fine with no changes - Details encapsulated in MD code, so individual platforms can implement this however they like - arm/arm/intr.c (or whatever) only needs a method to allocate a fresh interrupt, with no state, and anoter to set the device_t for an interrupt sometime later. - The internal table in the platform interrupt code has no knowledge of any mappings whatsoever except having the appropriate device_t for the PIC stored with the interrupt vector. - Device tree bits handled purely in device tree code - No action need be taken on any mapping until the interrupt is actually allocated/set up, like r301453 - Easy to add more mapping mechanisms (e.g. ACPI) by having similar enumeration-mechanism-specific code in the root bus for that mapping mechanism. -------------- End parenthesis ------------------------------- > `/ > >> Some more comments below. >> >>>> The ideal API would be one in which the resource enumeration code >>>> could assign, instead of numbers, some structured information that >>>> contains the full firmware specifier (string parent + interrupt pin >>>> for ACPI, interrupt parent phandle + specifier for device tree, bare >>>> numbers for simple systems). >>> Yes, exactly. I am happy that at least we agree on something. The only >>> difference is that we want triplet >> specifier> or >> Which makes perfect sense and is a good idea. >> >>>> That seemed, and seems, too invasive. I think we agree on this and >>>> have chosen to approximate that API in two different ways. >>> I still think that we found way how we can realize the above idea in >>> non-invasive way. >>> From my point of view the solution is: >>> Put/attach the above triplet to RLE and then copy it (within >>> bus_resource_attach() to struct resource. >>> But yes, I understand that I am not able to explain clearly enough. >> I think we were just talking past eachother somehow. >> >> Anyway, my concern about putting this in struct resource * at >> bus_alloc_resource() time is basically that there are a whole bunch of >> cases in which (a) it's hard to reverse engineer which interrupt from >> the device tree (or whatever) was meant to correspond to a particular >> arbitrary IRQ in an allocation request unless you keep logs and (b) >> there are a bunch of cases, mostly involving PCI, where interrupt >> resource allocation doesn't proceed through struct resource * at all, >> so you pretty much have to keep logs. Once you start keeping logs of >> which IRQ corresponds to which specifier at the device level, you >> might as well just do it once in a centralized place. > Oki, i agree. o Great, I'm glad we're on the same page. >>> gd ` >>>> When we designed the interrupt mapping code, we evaluated a large >>>> number of possible approaches. Something very much like this was one >>>> of them. It was rejected as being too fragile (it requires resource >>>> allocation and enumeration to be coupled) and to have too many corner >>>> cases (the PCI MSI and LSI APIs, for instance). The interrupt mapping >>>> stuff seemed at the time, and so far still seems, like the least-bad >>>> approximation to the ideal case: it is essentially that ideal API, in >>>> that it happens at bus enumeration and involves just assigning the >>>> firmware specifiers, but with lookup keys to that information stuffed >>>> into the 32-bit numbers that the rest of the system uses. >>> The raw PCI (or MSIX) case is relatively simple. The PCI domain uses >>> raw numbers and it's host-to-pci bridge job to translate raw numbers >>> from PCI domain to (for example) OFW based resource. >>> At this point, I'm still not sure about MSI... >> Right, you can keep a big table in the PCI driver that maps whatever >> IRQs you are handing out to some richer interrupt specifiers and >> consult that when you get a bus_alloc_resource() request. Which is >> basically the approach I'm advocating, except that the table is >> centralized, which reduces code duplication and fixes a number of >> weird corner cases where children assign extra interrupts to >> themselves, etc. and the bus parent has no ability to do something >> sensible. >> >> Please take a look at the attached interface and see if it (or >> something like it) meets your needs. It meets all of mine, at least, >> and I think fixes all the things you were concerned about. > Can you, please, give me also example for GPIO (these are identified by > pair). > Or, in other words, GPIO needs his own sets of functions? You have more experience with GPIOs than I do, but I could imagine a couple different ways of doing this for GPIO interrupts not in standard interrupts properties: 1. You have an extra interrupt class (like r301453) called "GPIO" or something that either takes a OFW/ACPI handle or a device_t and a GPIO specifier. 2. You have some global translator function that makes an OFW/ACPI interrupt specifier out of a GPIO property. I don't like this one. 3. You add a method to the GPIO controller (GPIO_GET_INTERRUPT_FOR_PIN or something) that does something driver specific (e.g. checks if you can have such an interrupt, then fabricates an OF- or ACPI-compatible specifier and maps it) to return a usable IRQ or an error. This also abstracts out how you want to describe the GPIO interrupt domain. Aside from not liking #2, I don't have strong opinions. #1 and #3 aren't mutually exclusive (#3 could be implemented in terms of #1), and #3 seems a little cleaner, so I would have a weak preference for #3 as the canonical mechanism. But anything is OK, really. > And thanks for your effort and patience, Of course -- apologies for being somewhat strident. Thanks for taking the time to discuss this! -Nathan > Michal > >