From owner-svn-src-stable@FreeBSD.ORG Sun Mar 1 04:22:12 2015 Return-Path: Delivered-To: svn-src-stable@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [8.8.178.115]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by hub.freebsd.org (Postfix) with ESMTPS id 8A98E77; Sun, 1 Mar 2015 04:22:12 +0000 (UTC) Received: from svn.freebsd.org (svn.freebsd.org [IPv6:2001:1900:2254:2068::e6a:0]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by mx1.freebsd.org (Postfix) with ESMTPS id 74036B55; Sun, 1 Mar 2015 04:22:12 +0000 (UTC) Received: from svn.freebsd.org ([127.0.1.70]) by svn.freebsd.org (8.14.9/8.14.9) with ESMTP id t214MCJ8099890; Sun, 1 Mar 2015 04:22:12 GMT (envelope-from rstone@FreeBSD.org) Received: (from rstone@localhost) by svn.freebsd.org (8.14.9/8.14.9/Submit) id t214M7Ib099859; Sun, 1 Mar 2015 04:22:07 GMT (envelope-from rstone@FreeBSD.org) Message-Id: <201503010422.t214M7Ib099859@svn.freebsd.org> X-Authentication-Warning: svn.freebsd.org: rstone set sender to rstone@FreeBSD.org using -f From: Ryan Stone Date: Sun, 1 Mar 2015 04:22:07 +0000 (UTC) To: src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-stable@freebsd.org, svn-src-stable-10@freebsd.org Subject: svn commit: r279470 - in stable/10: sys/amd64/vmm/amd sys/amd64/vmm/intel sys/amd64/vmm/io sys/conf sys/dev/pci sys/x86/iommu usr.sbin/pciconf X-SVN-Group: stable-10 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-BeenThere: svn-src-stable@freebsd.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: SVN commit messages for all the -stable branches of the src tree List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 01 Mar 2015 04:22:12 -0000 Author: rstone Date: Sun Mar 1 04:22:06 2015 New Revision: 279470 URL: https://svnweb.freebsd.org/changeset/base/279470 Log: MFC r264007,r264008,r264009,r264011,r264012,r264013 MFC support for PCI Alternate RID Interpretation. ARI is an optional PCIe feature that allows PCI devices to present up to 256 functions on a bus. This is effectively a prerequisite for PCI SR-IOV support. r264007: Add a method to get the PCI RID for a device. Reviewed by: kib MFC after: 2 months Sponsored by: Sandvine Inc. r264008: Re-implement the DMAR I/O MMU code in terms of PCI RIDs Under the hood the VT-d spec is really implemented in terms of PCI RIDs instead of bus/slot/function, even though the spec makes pains to convert back to bus/slot/function in examples. However working with bus/slot/function is not correct when PCI ARI is in use, so convert to using RIDs in most cases. bus/slot/function will only be used when reporting errors to a user. Reviewed by: kib MFC after: 2 months Sponsored by: Sandvine Inc. r264009: Re-write bhyve's I/O MMU handling in terms of PCI RID. Reviewed by: neel MFC after: 2 months Sponsored by: Sandvine Inc. r264011: Add support for PCIe ARI PCIe Alternate RID Interpretation (ARI) is an optional feature that allows devices to have up to 256 different functions. It is implemented by always setting the PCI slot number to 0 and re-purposing the 5 bits used to encode the slot number to instead contain the function number. Combined with the original 3 bits allocated for the function number, this allows for 256 functions. This is enabled by default, but it's expected to be a no-op on currently supported hardware. It's a prerequisite for supporting PCI SR-IOV, and I want the ARI support to go in early to help shake out any bugs in it. ARI can be disabled by setting the tunable hw.pci.enable_ari=0. Reviewed by: kib MFC after: 2 months Sponsored by: Sandvine Inc. r264012: Print status of ARI capability in pciconf -c Teach pciconf how to print out the status (enabled/disabled) of the ARI capability on PCI Root Complexes and Downstream Ports. MFC after: 2 months Sponsored by: Sandvine Inc. r264013: Add missing copyright date. MFC after: 2 months Added: stable/10/sys/dev/pci/pcib_support.c - copied, changed from r264007, head/sys/dev/pci/pcib_support.c Modified: stable/10/sys/amd64/vmm/amd/amdv.c stable/10/sys/amd64/vmm/intel/vtd.c stable/10/sys/amd64/vmm/io/iommu.c stable/10/sys/amd64/vmm/io/iommu.h stable/10/sys/amd64/vmm/io/ppt.c stable/10/sys/conf/files stable/10/sys/dev/pci/pci.c stable/10/sys/dev/pci/pci_if.m stable/10/sys/dev/pci/pci_pci.c stable/10/sys/dev/pci/pcib_if.m stable/10/sys/dev/pci/pcib_private.h stable/10/sys/dev/pci/pcireg.h stable/10/sys/dev/pci/pcivar.h stable/10/sys/x86/iommu/busdma_dmar.c stable/10/sys/x86/iommu/intel_ctx.c stable/10/sys/x86/iommu/intel_dmar.h stable/10/sys/x86/iommu/intel_drv.c stable/10/sys/x86/iommu/intel_fault.c stable/10/sys/x86/iommu/intel_utils.c stable/10/usr.sbin/pciconf/cap.c Directory Properties: stable/10/ (props changed) Modified: stable/10/sys/amd64/vmm/amd/amdv.c ============================================================================== --- stable/10/sys/amd64/vmm/amd/amdv.c Sun Mar 1 02:45:46 2015 (r279469) +++ stable/10/sys/amd64/vmm/amd/amdv.c Sun Mar 1 04:22:06 2015 (r279470) @@ -99,14 +99,14 @@ amd_iommu_remove_mapping(void *domain, v } static void -amd_iommu_add_device(void *domain, int bus, int slot, int func) +amd_iommu_add_device(void *domain, uint16_t rid) { printf("amd_iommu_add_device: not implemented\n"); } static void -amd_iommu_remove_device(void *domain, int bus, int slot, int func) +amd_iommu_remove_device(void *domain, uint16_t rid) { printf("amd_iommu_remove_device: not implemented\n"); Modified: stable/10/sys/amd64/vmm/intel/vtd.c ============================================================================== --- stable/10/sys/amd64/vmm/intel/vtd.c Sun Mar 1 02:45:46 2015 (r279469) +++ stable/10/sys/amd64/vmm/intel/vtd.c Sun Mar 1 04:22:06 2015 (r279470) @@ -99,6 +99,8 @@ struct vtdmap { #define VTD_PTE_SUPERPAGE (1UL << 7) #define VTD_PTE_ADDR_M (0x000FFFFFFFFFF000UL) +#define VTD_RID2IDX(rid) (((rid) & 0xff) * 2) + struct domain { uint64_t *ptp; /* first level page table page */ int pt_levels; /* number of page table levels */ @@ -360,27 +362,24 @@ vtd_disable(void) } static void -vtd_add_device(void *arg, int bus, int slot, int func) +vtd_add_device(void *arg, uint16_t rid) { int idx; uint64_t *ctxp; struct domain *dom = arg; vm_paddr_t pt_paddr; struct vtdmap *vtdmap; - - if (bus < 0 || bus > PCI_BUSMAX || - slot < 0 || slot > PCI_SLOTMAX || - func < 0 || func > PCI_FUNCMAX) - panic("vtd_add_device: invalid bsf %d/%d/%d", bus, slot, func); + uint8_t bus; vtdmap = vtdmaps[0]; + bus = PCI_RID2BUS(rid); ctxp = ctx_tables[bus]; pt_paddr = vtophys(dom->ptp); - idx = (slot << 3 | func) * 2; + idx = VTD_RID2IDX(rid); if (ctxp[idx] & VTD_CTX_PRESENT) { - panic("vtd_add_device: device %d/%d/%d is already owned by " - "domain %d", bus, slot, func, + panic("vtd_add_device: device %x is already owned by " + "domain %d", rid, (uint16_t)(ctxp[idx + 1] >> 8)); } @@ -404,19 +403,16 @@ vtd_add_device(void *arg, int bus, int s } static void -vtd_remove_device(void *arg, int bus, int slot, int func) +vtd_remove_device(void *arg, uint16_t rid) { int i, idx; uint64_t *ctxp; struct vtdmap *vtdmap; + uint8_t bus; - if (bus < 0 || bus > PCI_BUSMAX || - slot < 0 || slot > PCI_SLOTMAX || - func < 0 || func > PCI_FUNCMAX) - panic("vtd_add_device: invalid bsf %d/%d/%d", bus, slot, func); - + bus = PCI_RID2BUS(rid); ctxp = ctx_tables[bus]; - idx = (slot << 3 | func) * 2; + idx = VTD_RID2IDX(rid); /* * Order is important. The 'present' bit is must be cleared first. Modified: stable/10/sys/amd64/vmm/io/iommu.c ============================================================================== --- stable/10/sys/amd64/vmm/io/iommu.c Sun Mar 1 02:45:46 2015 (r279469) +++ stable/10/sys/amd64/vmm/io/iommu.c Sun Mar 1 04:22:06 2015 (r279470) @@ -109,19 +109,19 @@ IOMMU_REMOVE_MAPPING(void *domain, vm_pa } static __inline void -IOMMU_ADD_DEVICE(void *domain, int bus, int slot, int func) +IOMMU_ADD_DEVICE(void *domain, uint16_t rid) { if (ops != NULL && iommu_avail) - (*ops->add_device)(domain, bus, slot, func); + (*ops->add_device)(domain, rid); } static __inline void -IOMMU_REMOVE_DEVICE(void *domain, int bus, int slot, int func) +IOMMU_REMOVE_DEVICE(void *domain, uint16_t rid) { if (ops != NULL && iommu_avail) - (*ops->remove_device)(domain, bus, slot, func); + (*ops->remove_device)(domain, rid); } static __inline void @@ -196,7 +196,8 @@ iommu_init(void) continue; /* everything else belongs to the host domain */ - iommu_add_device(host_domain, bus, slot, func); + iommu_add_device(host_domain, + pci_get_rid(dev)); } } } @@ -263,17 +264,17 @@ iommu_host_domain(void) } void -iommu_add_device(void *dom, int bus, int slot, int func) +iommu_add_device(void *dom, uint16_t rid) { - IOMMU_ADD_DEVICE(dom, bus, slot, func); + IOMMU_ADD_DEVICE(dom, rid); } void -iommu_remove_device(void *dom, int bus, int slot, int func) +iommu_remove_device(void *dom, uint16_t rid) { - IOMMU_REMOVE_DEVICE(dom, bus, slot, func); + IOMMU_REMOVE_DEVICE(dom, rid); } void Modified: stable/10/sys/amd64/vmm/io/iommu.h ============================================================================== --- stable/10/sys/amd64/vmm/io/iommu.h Sun Mar 1 02:45:46 2015 (r279469) +++ stable/10/sys/amd64/vmm/io/iommu.h Sun Mar 1 04:22:06 2015 (r279470) @@ -39,8 +39,8 @@ typedef uint64_t (*iommu_create_mapping_ vm_paddr_t hpa, uint64_t len); typedef uint64_t (*iommu_remove_mapping_t)(void *domain, vm_paddr_t gpa, uint64_t len); -typedef void (*iommu_add_device_t)(void *domain, int bus, int slot, int func); -typedef void (*iommu_remove_device_t)(void *dom, int bus, int slot, int func); +typedef void (*iommu_add_device_t)(void *domain, uint16_t rid); +typedef void (*iommu_remove_device_t)(void *dom, uint16_t rid); typedef void (*iommu_invalidate_tlb_t)(void *dom); struct iommu_ops { @@ -69,7 +69,7 @@ void iommu_destroy_domain(void *dom); void iommu_create_mapping(void *dom, vm_paddr_t gpa, vm_paddr_t hpa, size_t len); void iommu_remove_mapping(void *dom, vm_paddr_t gpa, size_t len); -void iommu_add_device(void *dom, int bus, int slot, int func); -void iommu_remove_device(void *dom, int bus, int slot, int func); +void iommu_add_device(void *dom, uint16_t rid); +void iommu_remove_device(void *dom, uint16_t rid); void iommu_invalidate_tlb(void *domain); #endif Modified: stable/10/sys/amd64/vmm/io/ppt.c ============================================================================== --- stable/10/sys/amd64/vmm/io/ppt.c Sun Mar 1 02:45:46 2015 (r279469) +++ stable/10/sys/amd64/vmm/io/ppt.c Sun Mar 1 04:22:06 2015 (r279470) @@ -346,7 +346,7 @@ ppt_assign_device(struct vm *vm, int bus return (EBUSY); ppt->vm = vm; - iommu_add_device(vm_iommu_domain(vm), bus, slot, func); + iommu_add_device(vm_iommu_domain(vm), pci_get_rid(ppt->dev)); return (0); } return (ENOENT); @@ -367,7 +367,7 @@ ppt_unassign_device(struct vm *vm, int b ppt_unmap_mmio(vm, ppt); ppt_teardown_msi(ppt); ppt_teardown_msix(ppt); - iommu_remove_device(vm_iommu_domain(vm), bus, slot, func); + iommu_remove_device(vm_iommu_domain(vm), pci_get_rid(ppt->dev)); ppt->vm = NULL; return (0); } Modified: stable/10/sys/conf/files ============================================================================== --- stable/10/sys/conf/files Sun Mar 1 02:45:46 2015 (r279469) +++ stable/10/sys/conf/files Sun Mar 1 04:22:06 2015 (r279470) @@ -1992,6 +1992,7 @@ dev/pci/pci_pci.c optional pci dev/pci/pci_subr.c optional pci dev/pci/pci_user.c optional pci dev/pci/pcib_if.m standard +dev/pci/pcib_support.c standard dev/pci/vga_pci.c optional pci dev/pcn/if_pcn.c optional pcn pci dev/pdq/if_fea.c optional fea eisa Modified: stable/10/sys/dev/pci/pci.c ============================================================================== --- stable/10/sys/dev/pci/pci.c Sun Mar 1 02:45:46 2015 (r279469) +++ stable/10/sys/dev/pci/pci.c Sun Mar 1 04:22:06 2015 (r279470) @@ -121,6 +121,8 @@ static void pci_resume_msix(device_t de static int pci_remap_intr_method(device_t bus, device_t dev, u_int irq); +static uint16_t pci_get_rid_method(device_t dev, device_t child); + static device_method_t pci_methods[] = { /* Device interface */ DEVMETHOD(device_probe, pci_probe), @@ -175,6 +177,7 @@ static device_method_t pci_methods[] = { DEVMETHOD(pci_release_msi, pci_release_msi_method), DEVMETHOD(pci_msi_count, pci_msi_count_method), DEVMETHOD(pci_msix_count, pci_msix_count_method), + DEVMETHOD(pci_get_rid, pci_get_rid_method), DEVMETHOD_END }; @@ -358,6 +361,11 @@ TUNABLE_INT("hw.pci.clear_bars", &pci_cl SYSCTL_INT(_hw_pci, OID_AUTO, clear_bars, CTLFLAG_RDTUN, &pci_clear_bars, 0, "Ignore firmware-assigned resources for BARs."); +static int pci_enable_ari = 1; +TUNABLE_INT("hw.pci.enable_ari", &pci_enable_ari); +SYSCTL_INT(_hw_pci, OID_AUTO, enable_ari, CTLFLAG_RDTUN, &pci_enable_ari, + 0, "Enable support for PCIe Alternative RID Interpretation"); + static int pci_has_quirk(uint32_t devid, int quirk) { @@ -3292,6 +3300,19 @@ pci_add_resources(device_t bus, device_t } } +static struct pci_devinfo * +pci_identify_function(device_t pcib, device_t dev, int domain, int busno, + int slot, int func, size_t dinfo_size) +{ + struct pci_devinfo *dinfo; + + dinfo = pci_read_device(pcib, domain, busno, slot, func, dinfo_size); + if (dinfo != NULL) + pci_add_child(dev, dinfo); + + return (dinfo); +} + void pci_add_children(device_t dev, int domain, int busno, size_t dinfo_size) { @@ -3301,6 +3322,24 @@ pci_add_children(device_t dev, int domai int maxslots; int s, f, pcifunchigh; uint8_t hdrtype; + int first_func; + + /* + * Try to detect a device at slot 0, function 0. If it exists, try to + * enable ARI. We must enable ARI before detecting the rest of the + * functions on this bus as ARI changes the set of slots and functions + * that are legal on this bus. + */ + dinfo = pci_identify_function(pcib, dev, domain, busno, 0, 0, + dinfo_size); + if (dinfo != NULL && pci_enable_ari) + PCIB_TRY_ENABLE_ARI(pcib, dinfo->cfg.dev); + + /* + * Start looking for new devices on slot 0 at function 1 because we + * just identified the device at slot 0, function 0. + */ + first_func = 1; KASSERT(dinfo_size >= sizeof(struct pci_devinfo), ("dinfo_size too small")); @@ -3313,14 +3352,13 @@ pci_add_children(device_t dev, int domai if ((hdrtype & PCIM_HDRTYPE) > PCI_MAXHDRTYPE) continue; if (hdrtype & PCIM_MFDEV) - pcifunchigh = PCI_FUNCMAX; - for (f = 0; f <= pcifunchigh; f++) { - dinfo = pci_read_device(pcib, domain, busno, s, f, + pcifunchigh = PCIB_MAXFUNCS(pcib); + for (f = first_func; f <= pcifunchigh; f++) + pci_identify_function(pcib, dev, domain, busno, s, f, dinfo_size); - if (dinfo != NULL) { - pci_add_child(dev, dinfo); - } - } + + /* For slots after slot 0 we need to check for function 0. */ + first_func = 0; } #undef REG } @@ -4877,3 +4915,10 @@ pci_restore_state(device_t dev) dinfo = device_get_ivars(dev); pci_cfg_restore(dev, dinfo); } + +static uint16_t +pci_get_rid_method(device_t dev, device_t child) +{ + + return (PCIB_GET_RID(device_get_parent(dev), child)); +} Modified: stable/10/sys/dev/pci/pci_if.m ============================================================================== --- stable/10/sys/dev/pci/pci_if.m Sun Mar 1 02:45:46 2015 (r279469) +++ stable/10/sys/dev/pci/pci_if.m Sun Mar 1 04:22:06 2015 (r279470) @@ -159,3 +159,9 @@ METHOD int msix_count { device_t dev; device_t child; } DEFAULT null_msi_count; + +METHOD uint16_t get_rid { + device_t dev; + device_t child; +}; + Modified: stable/10/sys/dev/pci/pci_pci.c ============================================================================== --- stable/10/sys/dev/pci/pci_pci.c Sun Mar 1 02:45:46 2015 (r279469) +++ stable/10/sys/dev/pci/pci_pci.c Sun Mar 1 04:22:06 2015 (r279470) @@ -56,6 +56,14 @@ static int pcib_suspend(device_t dev); static int pcib_resume(device_t dev); static int pcib_power_for_sleep(device_t pcib, device_t dev, int *pstate); +static uint16_t pcib_ari_get_rid(device_t pcib, device_t dev); +static uint32_t pcib_read_config(device_t dev, u_int b, u_int s, + u_int f, u_int reg, int width); +static void pcib_write_config(device_t dev, u_int b, u_int s, + u_int f, u_int reg, uint32_t val, int width); +static int pcib_ari_maxslots(device_t dev); +static int pcib_ari_maxfuncs(device_t dev); +static int pcib_try_enable_ari(device_t pcib, device_t dev); static device_method_t pcib_methods[] = { /* Device interface */ @@ -83,7 +91,8 @@ static device_method_t pcib_methods[] = DEVMETHOD(bus_teardown_intr, bus_generic_teardown_intr), /* pcib interface */ - DEVMETHOD(pcib_maxslots, pcib_maxslots), + DEVMETHOD(pcib_maxslots, pcib_ari_maxslots), + DEVMETHOD(pcib_maxfuncs, pcib_ari_maxfuncs), DEVMETHOD(pcib_read_config, pcib_read_config), DEVMETHOD(pcib_write_config, pcib_write_config), DEVMETHOD(pcib_route_interrupt, pcib_route_interrupt), @@ -93,6 +102,8 @@ static device_method_t pcib_methods[] = DEVMETHOD(pcib_release_msix, pcib_release_msix), DEVMETHOD(pcib_map_msi, pcib_map_msi), DEVMETHOD(pcib_power_for_sleep, pcib_power_for_sleep), + DEVMETHOD(pcib_get_rid, pcib_ari_get_rid), + DEVMETHOD(pcib_try_enable_ari, pcib_try_enable_ari), DEVMETHOD_END }; @@ -1630,27 +1641,94 @@ pcib_alloc_resource(device_t dev, device #endif /* + * If ARI is enabled on this downstream port, translate the function number + * to the non-ARI slot/function. The downstream port will convert it back in + * hardware. If ARI is not enabled slot and func are not modified. + */ +static __inline void +pcib_xlate_ari(device_t pcib, int bus, int *slot, int *func) +{ + struct pcib_softc *sc; + int ari_func; + + sc = device_get_softc(pcib); + ari_func = *func; + + if (sc->flags & PCIB_ENABLE_ARI) { + KASSERT(*slot == 0, + ("Non-zero slot number with ARI enabled!")); + *slot = PCIE_ARI_SLOT(ari_func); + *func = PCIE_ARI_FUNC(ari_func); + } +} + + +static void +pcib_enable_ari(struct pcib_softc *sc, uint32_t pcie_pos) +{ + uint32_t ctl2; + + ctl2 = pci_read_config(sc->dev, pcie_pos + PCIER_DEVICE_CTL2, 4); + ctl2 |= PCIEM_CTL2_ARI; + pci_write_config(sc->dev, pcie_pos + PCIER_DEVICE_CTL2, ctl2, 4); + + sc->flags |= PCIB_ENABLE_ARI; +} + +/* * PCIB interface. */ int pcib_maxslots(device_t dev) { - return(PCI_SLOTMAX); + return (PCI_SLOTMAX); +} + +static int +pcib_ari_maxslots(device_t dev) +{ + struct pcib_softc *sc; + + sc = device_get_softc(dev); + + if (sc->flags & PCIB_ENABLE_ARI) + return (PCIE_ARI_SLOTMAX); + else + return (PCI_SLOTMAX); +} + +static int +pcib_ari_maxfuncs(device_t dev) +{ + struct pcib_softc *sc; + + sc = device_get_softc(dev); + + if (sc->flags & PCIB_ENABLE_ARI) + return (PCIE_ARI_FUNCMAX); + else + return (PCI_FUNCMAX); } /* * Since we are a child of a PCI bus, its parent must support the pcib interface. */ -uint32_t +static uint32_t pcib_read_config(device_t dev, u_int b, u_int s, u_int f, u_int reg, int width) { - return(PCIB_READ_CONFIG(device_get_parent(device_get_parent(dev)), b, s, f, reg, width)); + + pcib_xlate_ari(dev, b, &s, &f); + return(PCIB_READ_CONFIG(device_get_parent(device_get_parent(dev)), b, s, + f, reg, width)); } -void +static void pcib_write_config(device_t dev, u_int b, u_int s, u_int f, u_int reg, uint32_t val, int width) { - PCIB_WRITE_CONFIG(device_get_parent(device_get_parent(dev)), b, s, f, reg, val, width); + + pcib_xlate_ari(dev, b, &s, &f); + PCIB_WRITE_CONFIG(device_get_parent(device_get_parent(dev)), b, s, f, + reg, val, width); } /* @@ -1762,3 +1840,83 @@ pcib_power_for_sleep(device_t pcib, devi bus = device_get_parent(pcib); return (PCIB_POWER_FOR_SLEEP(bus, dev, pstate)); } + +static uint16_t +pcib_ari_get_rid(device_t pcib, device_t dev) +{ + struct pcib_softc *sc; + uint8_t bus, slot, func; + + sc = device_get_softc(pcib); + + if (sc->flags & PCIB_ENABLE_ARI) { + bus = pci_get_bus(dev); + func = pci_get_function(dev); + + return (PCI_ARI_RID(bus, func)); + } else { + bus = pci_get_bus(dev); + slot = pci_get_slot(dev); + func = pci_get_function(dev); + + return (PCI_RID(bus, slot, func)); + } +} + +/* + * Check that the downstream port (pcib) and the endpoint device (dev) both + * support ARI. If so, enable it and return 0, otherwise return an error. + */ +static int +pcib_try_enable_ari(device_t pcib, device_t dev) +{ + struct pcib_softc *sc; + int error; + uint32_t cap2; + int ari_cap_off; + uint32_t ari_ver; + uint32_t pcie_pos; + + sc = device_get_softc(pcib); + + /* + * ARI is controlled in a register in the PCIe capability structure. + * If the downstream port does not have the PCIe capability structure + * then it does not support ARI. + */ + error = pci_find_cap(pcib, PCIY_EXPRESS, &pcie_pos); + if (error != 0) + return (ENODEV); + + /* Check that the PCIe port advertises ARI support. */ + cap2 = pci_read_config(pcib, pcie_pos + PCIER_DEVICE_CAP2, 4); + if (!(cap2 & PCIEM_CAP2_ARI)) + return (ENODEV); + + /* + * Check that the endpoint device advertises ARI support via the ARI + * extended capability structure. + */ + error = pci_find_extcap(dev, PCIZ_ARI, &ari_cap_off); + if (error != 0) + return (ENODEV); + + /* + * Finally, check that the endpoint device supports the same version + * of ARI that we do. + */ + ari_ver = pci_read_config(dev, ari_cap_off, 4); + if (PCI_EXTCAP_VER(ari_ver) != PCIB_SUPPORTED_ARI_VER) { + if (bootverbose) + device_printf(pcib, + "Unsupported version of ARI (%d) detected\n", + PCI_EXTCAP_VER(ari_ver)); + + return (ENXIO); + } + + pcib_enable_ari(sc, pcie_pos); + + return (0); +} + Modified: stable/10/sys/dev/pci/pcib_if.m ============================================================================== --- stable/10/sys/dev/pci/pcib_if.m Sun Mar 1 02:45:46 2015 (r279469) +++ stable/10/sys/dev/pci/pcib_if.m Sun Mar 1 04:22:06 2015 (r279470) @@ -27,7 +27,9 @@ # #include +#include #include +#include INTERFACE pcib; @@ -47,6 +49,14 @@ METHOD int maxslots { }; # +# +# Return the number of functions on the attached PCI bus. +# +METHOD int maxfuncs { + device_t dev; +} DEFAULT pcib_maxfuncs; + +# # Read configuration space on the PCI bus. The bus, slot and func # arguments determine the device which is being read and the reg # argument is a byte offset into configuration space for that @@ -154,3 +164,21 @@ METHOD int power_for_sleep { device_t dev; int *pstate; }; + +# +# Return the PCI Routing Identifier (RID) for the device. +# +METHOD uint16_t get_rid { + device_t pcib; + device_t dev; +} DEFAULT pcib_get_rid; + +# +# Enable Alternative RID Interpretation if both the downstream port (pcib) +# and the endpoint device (dev) both support it. +# +METHOD int try_enable_ari { + device_t pcib; + device_t dev; +}; + Modified: stable/10/sys/dev/pci/pcib_private.h ============================================================================== --- stable/10/sys/dev/pci/pcib_private.h Sun Mar 1 02:45:46 2015 (r279469) +++ stable/10/sys/dev/pci/pcib_private.h Sun Mar 1 04:22:06 2015 (r279470) @@ -93,6 +93,7 @@ struct pcib_softc #define PCIB_SUBTRACTIVE 0x1 #define PCIB_DISABLE_MSI 0x2 #define PCIB_DISABLE_MSIX 0x4 +#define PCIB_ENABLE_ARI 0x8 uint16_t command; /* command register */ u_int domain; /* domain number */ u_int pribus; /* primary bus number */ @@ -115,6 +116,8 @@ struct pcib_softc uint8_t seclat; /* secondary bus latency timer */ }; +#define PCIB_SUPPORTED_ARI_VER 1 + typedef uint32_t pci_read_config_fn(int b, int s, int f, int reg, int width); #ifdef NEW_PCIB @@ -135,13 +138,13 @@ int pcib_release_resource(device_t dev, struct resource *r); #endif int pcib_maxslots(device_t dev); -uint32_t pcib_read_config(device_t dev, u_int b, u_int s, u_int f, u_int reg, int width); -void pcib_write_config(device_t dev, u_int b, u_int s, u_int f, u_int reg, uint32_t val, int width); +int pcib_maxfuncs(device_t dev); int pcib_route_interrupt(device_t pcib, device_t dev, int pin); int pcib_alloc_msi(device_t pcib, device_t dev, int count, int maxcount, int *irqs); int pcib_release_msi(device_t pcib, device_t dev, int count, int *irqs); int pcib_alloc_msix(device_t pcib, device_t dev, int *irq); int pcib_release_msix(device_t pcib, device_t dev, int irq); int pcib_map_msi(device_t pcib, device_t dev, int irq, uint64_t *addr, uint32_t *data); +uint16_t pcib_get_rid(device_t pcib, device_t dev); #endif Copied and modified: stable/10/sys/dev/pci/pcib_support.c (from r264007, head/sys/dev/pci/pcib_support.c) ============================================================================== --- head/sys/dev/pci/pcib_support.c Tue Apr 1 15:47:24 2014 (r264007, copy source) +++ stable/10/sys/dev/pci/pcib_support.c Sun Mar 1 04:22:06 2015 (r279470) @@ -1,5 +1,5 @@ /* - * Copyright (c) Sandvine Inc. All rights reserved. + * Copyright (c) 2014 Sandvine Inc. All rights reserved. * All rights reserved. * * Redistribution and use in source and binary forms, with or without @@ -48,6 +48,12 @@ __FBSDID("$FreeBSD$"); #include "pcib_if.h" +int +pcib_maxfuncs(device_t dev) +{ + return (PCI_FUNCMAX); +} + uint16_t pcib_get_rid(device_t pcib, device_t dev) { Modified: stable/10/sys/dev/pci/pcireg.h ============================================================================== --- stable/10/sys/dev/pci/pcireg.h Sun Mar 1 02:45:46 2015 (r279469) +++ stable/10/sys/dev/pci/pcireg.h Sun Mar 1 04:22:06 2015 (r279470) @@ -48,6 +48,29 @@ #define PCIE_REGMAX 4095 /* highest supported config register addr. */ #define PCI_MAXHDRTYPE 2 +#define PCIE_ARI_SLOTMAX 0 +#define PCIE_ARI_FUNCMAX 255 + +#define PCI_RID_BUS_SHIFT 8 +#define PCI_RID_SLOT_SHIFT 3 +#define PCI_RID_FUNC_SHIFT 0 + +#define PCI_RID(bus, slot, func) \ + ((((bus) & PCI_BUSMAX) << PCI_RID_BUS_SHIFT) | \ + (((slot) & PCI_SLOTMAX) << PCI_RID_SLOT_SHIFT) | \ + (((func) & PCI_FUNCMAX) << PCI_RID_FUNC_SHIFT)) + +#define PCI_ARI_RID(bus, func) \ + ((((bus) & PCI_BUSMAX) << PCI_RID_BUS_SHIFT) | \ + (((func) & PCIE_ARI_FUNCMAX) << PCI_RID_FUNC_SHIFT)) + +#define PCI_RID2BUS(rid) (((rid) >> PCI_RID_BUS_SHIFT) & PCI_BUSMAX) +#define PCI_RID2SLOT(rid) (((rid) >> PCI_RID_SLOT_SHIFT) & PCI_SLOTMAX) +#define PCI_RID2FUNC(rid) (((rid) >> PCI_RID_FUNC_SHIFT) & PCI_FUNCMAX) + +#define PCIE_ARI_SLOT(func) (((func) >> PCI_RID_SLOT_SHIFT) & PCI_SLOTMAX) +#define PCIE_ARI_FUNC(func) (((func) >> PCI_RID_FUNC_SHIFT) & PCI_FUNCMAX) + /* PCI config header registers for all devices */ #define PCIR_DEVVENDOR 0x00 @@ -775,6 +798,7 @@ #define PCIEM_ROOT_STA_PME_STATUS 0x00010000 #define PCIEM_ROOT_STA_PME_PEND 0x00020000 #define PCIER_DEVICE_CAP2 0x24 +#define PCIEM_CAP2_ARI 0x20 #define PCIER_DEVICE_CTL2 0x28 #define PCIEM_CTL2_COMP_TIMEOUT_VAL 0x000f #define PCIEM_CTL2_COMP_TIMEOUT_DIS 0x0010 @@ -895,3 +919,4 @@ /* Serial Number definitions */ #define PCIR_SERIAL_LOW 0x04 #define PCIR_SERIAL_HIGH 0x08 + Modified: stable/10/sys/dev/pci/pcivar.h ============================================================================== --- stable/10/sys/dev/pci/pcivar.h Sun Mar 1 02:45:46 2015 (r279469) +++ stable/10/sys/dev/pci/pcivar.h Sun Mar 1 04:22:06 2015 (r279470) @@ -482,6 +482,12 @@ pci_msix_count(device_t dev) return (PCI_MSIX_COUNT(device_get_parent(dev), dev)); } +static __inline uint16_t +pci_get_rid(device_t dev) +{ + return (PCI_GET_RID(device_get_parent(dev), dev)); +} + device_t pci_find_bsf(uint8_t, uint8_t, uint8_t); device_t pci_find_dbsf(uint32_t, uint8_t, uint8_t, uint8_t); device_t pci_find_device(uint16_t, uint16_t); Modified: stable/10/sys/x86/iommu/busdma_dmar.c ============================================================================== --- stable/10/sys/x86/iommu/busdma_dmar.c Sun Mar 1 02:45:46 2015 (r279469) +++ stable/10/sys/x86/iommu/busdma_dmar.c Sun Mar 1 04:22:06 2015 (r279470) @@ -93,7 +93,7 @@ dmar_bus_dma_is_dev_disabled(int domain, * bounce mapping. */ static device_t -dmar_get_requester(device_t dev, int *bus, int *slot, int *func) +dmar_get_requester(device_t dev, uint16_t *rid) { devclass_t pci_class; device_t pci, pcib, requester; @@ -102,9 +102,7 @@ dmar_get_requester(device_t dev, int *bu pci_class = devclass_find("pci"); requester = dev; - *bus = pci_get_bus(dev); - *slot = pci_get_slot(dev); - *func = pci_get_function(dev); + *rid = pci_get_rid(dev); /* * Walk the bridge hierarchy from the target device to the @@ -161,8 +159,7 @@ dmar_get_requester(device_t dev, int *bu * same page tables for taken and * non-taken transactions. */ - *bus = pci_get_bus(dev); - *slot = *func = 0; + *rid = PCI_RID(pci_get_bus(dev), 0, 0); } else { /* * Neither the device nor the bridge @@ -171,9 +168,7 @@ dmar_get_requester(device_t dev, int *bu * will use the bridge's BSF as the * requester ID. */ - *bus = pci_get_bus(pcib); - *slot = pci_get_slot(pcib); - *func = pci_get_function(pcib); + *rid = pci_get_rid(pcib); } } /* @@ -193,9 +188,9 @@ dmar_instantiate_ctx(struct dmar_unit *d device_t requester; struct dmar_ctx *ctx; bool disabled; - int bus, slot, func; + uint16_t rid; - requester = dmar_get_requester(dev, &bus, &slot, &func); + requester = dmar_get_requester(dev, &rid); /* * If the user requested the IOMMU disabled for the device, we @@ -204,9 +199,10 @@ dmar_instantiate_ctx(struct dmar_unit *d * Instead provide the identity mapping for the device * context. */ - disabled = dmar_bus_dma_is_dev_disabled(pci_get_domain(dev), bus, - slot, func); - ctx = dmar_get_ctx(dmar, requester, bus, slot, func, disabled, rmrr); + disabled = dmar_bus_dma_is_dev_disabled(pci_get_domain(requester), + pci_get_bus(requester), pci_get_slot(requester), + pci_get_function(requester)); + ctx = dmar_get_ctx(dmar, requester, rid, disabled, rmrr); if (ctx == NULL) return (NULL); if (disabled) { Modified: stable/10/sys/x86/iommu/intel_ctx.c ============================================================================== --- stable/10/sys/x86/iommu/intel_ctx.c Sun Mar 1 02:45:46 2015 (r279469) +++ stable/10/sys/x86/iommu/intel_ctx.c Sun Mar 1 04:22:06 2015 (r279470) @@ -63,6 +63,7 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include #include static MALLOC_DEFINE(M_DMAR_CTX, "dmar_ctx", "Intel DMAR Context"); @@ -106,14 +107,14 @@ dmar_map_ctx_entry(struct dmar_ctx *ctx, { dmar_ctx_entry_t *ctxp; - ctxp = dmar_map_pgtbl(ctx->dmar->ctx_obj, 1 + ctx->bus, + ctxp = dmar_map_pgtbl(ctx->dmar->ctx_obj, 1 + PCI_RID2BUS(ctx->rid), DMAR_PGF_NOALLOC | DMAR_PGF_WAITOK, sfp); - ctxp += ((ctx->slot & 0x1f) << 3) + (ctx->func & 0x7); + ctxp += ctx->rid & 0xff; return (ctxp); } static void -ctx_tag_init(struct dmar_ctx *ctx) +ctx_tag_init(struct dmar_ctx *ctx, device_t dev) { bus_addr_t maxaddr; @@ -127,6 +128,7 @@ ctx_tag_init(struct dmar_ctx *ctx) ctx->ctx_tag.common.nsegments = BUS_SPACE_UNRESTRICTED; ctx->ctx_tag.common.maxsegsz = maxaddr; ctx->ctx_tag.ctx = ctx; + ctx->ctx_tag.owner = dev; /* XXXKIB initialize tag further */ } @@ -139,7 +141,10 @@ ctx_id_entry_init(struct dmar_ctx *ctx, unit = ctx->dmar; KASSERT(ctxp->ctx1 == 0 && ctxp->ctx2 == 0, ("dmar%d: initialized ctx entry %d:%d:%d 0x%jx 0x%jx", - unit->unit, ctx->bus, ctx->slot, ctx->func, ctxp->ctx1, + unit->unit, pci_get_bus(ctx->ctx_tag.owner), + pci_get_slot(ctx->ctx_tag.owner), + pci_get_function(ctx->ctx_tag.owner), + ctxp->ctx1, ctxp->ctx2)); ctxp->ctx2 = DMAR_CTX2_DID(ctx->domain); ctxp->ctx2 |= ctx->awlvl; @@ -229,7 +234,7 @@ ctx_init_rmrr(struct dmar_ctx *ctx, devi } static struct dmar_ctx * -dmar_get_ctx_alloc(struct dmar_unit *dmar, int bus, int slot, int func) +dmar_get_ctx_alloc(struct dmar_unit *dmar, uint16_t rid) { struct dmar_ctx *ctx; @@ -239,9 +244,7 @@ dmar_get_ctx_alloc(struct dmar_unit *dma TASK_INIT(&ctx->unload_task, 0, dmar_ctx_unload_task, ctx); mtx_init(&ctx->lock, "dmarctx", NULL, MTX_DEF); ctx->dmar = dmar; - ctx->bus = bus; - ctx->slot = slot; - ctx->func = func; + ctx->rid = rid; return (ctx); } @@ -264,19 +267,22 @@ dmar_ctx_dtr(struct dmar_ctx *ctx, bool } struct dmar_ctx * -dmar_get_ctx(struct dmar_unit *dmar, device_t dev, int bus, int slot, int func, - bool id_mapped, bool rmrr_init) +dmar_get_ctx(struct dmar_unit *dmar, device_t dev, uint16_t rid, bool id_mapped, + bool rmrr_init) { struct dmar_ctx *ctx, *ctx1; dmar_ctx_entry_t *ctxp; struct sf_buf *sf; - int error, mgaw; + int bus, slot, func, error, mgaw; bool enable; + bus = pci_get_bus(dev); + slot = pci_get_slot(dev); + func = pci_get_function(dev); enable = false; TD_PREP_PINNED_ASSERT; DMAR_LOCK(dmar); - ctx = dmar_find_ctx_locked(dmar, bus, slot, func); + ctx = dmar_find_ctx_locked(dmar, rid); error = 0; if (ctx == NULL) { /* @@ -285,7 +291,7 @@ dmar_get_ctx(struct dmar_unit *dmar, dev */ DMAR_UNLOCK(dmar); dmar_ensure_ctx_page(dmar, bus); - ctx1 = dmar_get_ctx_alloc(dmar, bus, slot, func); + ctx1 = dmar_get_ctx_alloc(dmar, rid); if (id_mapped) { /* @@ -353,7 +359,7 @@ dmar_get_ctx(struct dmar_unit *dmar, dev * Recheck the contexts, other thread might have * already allocated needed one. */ - ctx = dmar_find_ctx_locked(dmar, bus, slot, func); + ctx = dmar_find_ctx_locked(dmar, rid); if (ctx == NULL) { ctx = ctx1; ctx->ctx_tag.owner = dev; @@ -365,7 +371,7 @@ dmar_get_ctx(struct dmar_unit *dmar, dev TD_PINNED_ASSERT; return (NULL); } - ctx_tag_init(ctx); + ctx_tag_init(ctx, dev); /* * This is the first activated context for the @@ -527,14 +533,14 @@ dmar_free_ctx(struct dmar_ctx *ctx) } struct dmar_ctx * -dmar_find_ctx_locked(struct dmar_unit *dmar, int bus, int slot, int func) +dmar_find_ctx_locked(struct dmar_unit *dmar, uint16_t rid) { struct dmar_ctx *ctx; DMAR_ASSERT_LOCKED(dmar); LIST_FOREACH(ctx, &dmar->contexts, link) { - if (ctx->bus == bus && ctx->slot == slot && ctx->func == func) + if (ctx->rid == rid) return (ctx); } return (NULL); Modified: stable/10/sys/x86/iommu/intel_dmar.h ============================================================================== --- stable/10/sys/x86/iommu/intel_dmar.h Sun Mar 1 02:45:46 2015 (r279469) +++ stable/10/sys/x86/iommu/intel_dmar.h Sun Mar 1 04:22:06 2015 (r279470) @@ -74,9 +74,7 @@ RB_PROTOTYPE(dmar_gas_entries_tree, dmar #define DMAR_MAP_ENTRY_TM 0x8000 /* Transient */ struct dmar_ctx { - int bus; /* pci bus/slot/func */ - int slot; - int func; + uint16_t rid; /* pci RID */ int domain; /* DID */ int mgaw; /* Real max address width */ int agaw; /* Adjusted guest address width */ @@ -272,12 +270,11 @@ void ctx_free_pgtbl(struct dmar_ctx *ctx struct dmar_ctx *dmar_instantiate_ctx(struct dmar_unit *dmar, device_t dev, bool rmrr); -struct dmar_ctx *dmar_get_ctx(struct dmar_unit *dmar, device_t dev, - int bus, int slot, int func, bool id_mapped, bool rmrr_init); +struct dmar_ctx *dmar_get_ctx(struct dmar_unit *dmar, device_t dev, + uint16_t rid, bool id_mapped, bool rmrr_init); void dmar_free_ctx_locked(struct dmar_unit *dmar, struct dmar_ctx *ctx); void dmar_free_ctx(struct dmar_ctx *ctx); -struct dmar_ctx *dmar_find_ctx_locked(struct dmar_unit *dmar, int bus, - int slot, int func); +struct dmar_ctx *dmar_find_ctx_locked(struct dmar_unit *dmar, uint16_t rid); void dmar_ctx_unload_entry(struct dmar_map_entry *entry, bool free); void dmar_ctx_unload(struct dmar_ctx *ctx, struct dmar_map_entries_tailq *entries, bool cansleep); Modified: stable/10/sys/x86/iommu/intel_drv.c ============================================================================== --- stable/10/sys/x86/iommu/intel_drv.c Sun Mar 1 02:45:46 2015 (r279469) +++ stable/10/sys/x86/iommu/intel_drv.c Sun Mar 1 04:22:06 2015 (r279470) @@ -1006,7 +1006,9 @@ dmar_print_ctx(struct dmar_ctx *ctx, boo db_printf( " @%p pci%d:%d:%d dom %d mgaw %d agaw %d pglvl %d end %jx\n" " refs %d flags %x pgobj %p map_ents %u loads %lu unloads %lu\n", - ctx, ctx->bus, ctx->slot, ctx->func, ctx->domain, ctx->mgaw, + ctx, pci_get_bus(ctx->ctx_tag.owner), + pci_get_slot(ctx->ctx_tag.owner), + pci_get_function(ctx->ctx_tag.owner), ctx->domain, ctx->mgaw, ctx->agaw, ctx->pglvl, (uintmax_t)ctx->end, ctx->refs, ctx->flags, ctx->pgtbl_obj, ctx->entries_cnt, ctx->loads, ctx->unloads); @@ -1079,8 +1081,10 @@ DB_FUNC(dmar_ctx, db_dmar_print_ctx, db_ for (i = 0; i < dmar_devcnt; i++) { unit = device_get_softc(dmar_devs[i]); LIST_FOREACH(ctx, &unit->contexts, link) { - if (domain == unit->segment && bus == ctx->bus && - device == ctx->slot && function == ctx->func) { + if (domain == unit->segment && + bus == pci_get_bus(ctx->ctx_tag.owner) && + device == pci_get_slot(ctx->ctx_tag.owner) && + function == pci_get_function(ctx->ctx_tag.owner)) { dmar_print_ctx(ctx, show_mappings); goto out; } Modified: stable/10/sys/x86/iommu/intel_fault.c ============================================================================== --- stable/10/sys/x86/iommu/intel_fault.c Sun Mar 1 02:45:46 2015 (r279469) +++ stable/10/sys/x86/iommu/intel_fault.c Sun Mar 1 04:22:06 2015 (r279470) @@ -45,6 +45,8 @@ __FBSDID("$FreeBSD$"); #include #include #include +#include +#include #include #include #include @@ -203,19 +205,28 @@ dmar_fault_task(void *arg, int pending _ DMAR_FAULT_UNLOCK(unit); sid = DMAR_FRCD2_SID(fault_rec[1]); - bus = (sid >> 8) & 0xf; - slot = (sid >> 3) & 0x1f; - func = sid & 0x7; printf("DMAR%d: ", unit->unit); DMAR_LOCK(unit); - ctx = dmar_find_ctx_locked(unit, bus, slot, func); + ctx = dmar_find_ctx_locked(unit, sid); if (ctx == NULL) { printf(":"); + + /* + * Note that the slot and function will not be correct + * if ARI is in use, but without a ctx entry we have + * no way of knowing whether ARI is in use or not. + */ + bus = PCI_RID2BUS(sid); + slot = PCI_RID2SLOT(sid); *** DIFF OUTPUT TRUNCATED AT 1000 LINES ***