Skip site navigation (1)Skip section navigation (2)
Date:      Thu, 14 Mar 2019 13:45:29 +0200
From:      Konstantin Belousov <kostikbel@gmail.com>
To:        Johannes Lundberg <johalun@FreeBSD.org>
Cc:        John Baldwin <jhb@FreeBSD.org>, cem@freebsd.org, Hans Petter Selasky <hselasky@freebsd.org>, src-committers <src-committers@freebsd.org>, svn-src-all <svn-src-all@freebsd.org>, svn-src-head <svn-src-head@freebsd.org>
Subject:   Re: svn commit: r345103 - head/sys/compat/linuxkpi/common/include/linux
Message-ID:  <20190314114529.GE2492@kib.kiev.ua>
In-Reply-To: <0521e2af-03ae-4642-f88c-60d8b8ede968@FreeBSD.org>
References:  <201903131915.x2DJFbRk002502@repo.freebsd.org> <CAG6CVpU0feif5Z99By7R93cVnNi-MjCsoGh_t9j3xhqhNR%2B%2BQA@mail.gmail.com> <fcfcbb36-a41d-6dda-0940-1e492d6dc676@FreeBSD.org> <0521e2af-03ae-4642-f88c-60d8b8ede968@FreeBSD.org>

next in thread | previous in thread | raw e-mail | index | archive | help
On Thu, Mar 14, 2019 at 11:07:50AM +0000, Johannes Lundberg wrote:
> 
> On 3/13/19 11:39 PM, John Baldwin wrote:
> > On 3/13/19 1:36 PM, Conrad Meyer wrote:
> >> Hi,
> >>
> >> A lot of the information about PCIe devices is read by PCI probe and
> >> cached on the (BSD) device.  You could access it out of
> >> device_get_ivars(bsddev)->cfg.pcie and avoid the MMIO latency.
> > Agreed when possible, though I'm not sure caching lnkcap is really
> > all that useful.
> >
> >> On Wed, Mar 13, 2019 at 12:15 PM Hans Petter Selasky
> >> <hselasky@freebsd.org> wrote:
> >>> +static inline enum pci_bus_speed
> >>> +pcie_get_speed_cap(struct pci_dev *dev)
> >>> +{
> >>> +       device_t root;
> >>> +       uint32_t lnkcap, lnkcap2;
> >>> +       int error, pos;
> >>> +
> >>> +       root = device_get_parent(dev->dev.bsddev);
> >>> +       if (root == NULL)
> >>> +               return (PCI_SPEED_UNKNOWN);
> >>> +       root = device_get_parent(root);
> >>> +       if (root == NULL)
> >>> +               return (PCI_SPEED_UNKNOWN);
> >>> +       root = device_get_parent(root);
> >>> +       if (root == NULL)
> >>> +               return (PCI_SPEED_UNKNOWN);
> >> What is this mechanism trying to accomplish?  It seems incredibly
> >> fragile.  Looking for pci0? pcib0?
> > It does seem broken, or maybe that it only works for DRM drivers and nothing else.
> > For non-DRM drivers, 'bsddev' is a PCI-e endpoint.  You would then go up two layers
> > (pciX and pcibX) to get to the parent bridge.  However, this is assuming a DRM layout
> > where it has to go drm0 -> vgapci0 -> pciX -> pcibX due to the vgapci pseudo-driver.
> > As a result, this function can't possibly work on anything other than a DRM driver.
> > Since it would try to use pci ivars on some PCI bus instance (or worse), it's likely
> > to cause random panics if used from a non-DRM driver.
> >
> > Furthermore, it's not at all clear to me why you can't just read lnkcap/lnkcap2 from
> > the endpoint device directly vs heading up to the parent bridge though.  Hmm, I guess
> > it's trying to infer the capabilities of the "slot", so that's why it needs the
> > grandparent.  You will have to do something less fragile to find the grandparent.
> > The simplest way may be to walk up to the first "pcib" device you see and then
> > stop.  You will still want to see if it's a "real" PCI device by seeing if it's
> > parent is a "pci" device (meaning that the "pcib" device will have valid PCI IVARs
> > and PCI config registers you can read).  pci_find_pcie_root_port has some similar
> > code for this, but that function would walk too far up for what you want here, so you
> > can't reuse it as-is.
> >
> >>> +       if ((error = pci_find_cap(root, PCIY_EXPRESS, &pos)) != 0)
> >>> +               return (PCI_SPEED_UNKNOWN);
> >> Cached as non-zero cfg.pcie.pcie_location value in ivars.
> >>
> >>> +       lnkcap2 = pci_read_config(root, pos + PCIER_LINK_CAP2, 4);
> >> This one we don't cache, unfortunately, but could.  Ditto LINK_CAP
> >> below.  Usually "pos + PCIEfoo" is spelled "pcie_read_config(...,
> >> PCIEfoo...)."
> > Yes, pcie_read_config is what you normally would use.  That uses the cached
> > pcie_location for you.
> >
> >      pcie_read_config(dev->dev.bsddev, PCIER_LINK_CAP2, 2);
> >
> > But also, you should be checking the PCIe version to see if this register
> > exists instead of reading it unconditionally.  Specifically, you should read
> > the version field (PCIEM_FLAGS_VERSION) from PCIER_FLAGS.  LINK_CAP2 only
> > exists if the version >= 2.
> >
> >>> +
> >>> +       if (lnkcap2) {  /* PCIe r3.0-compliant */
> >>> +               if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_2_5GB)
> >>> +                       return (PCIE_SPEED_2_5GT);
> >> Seems like these definitions would be better suited as native
> >> PCIEM_LINK_CAP2_foo definitions in pcireg.h
> > I feel less strongly about those since we have to provide the constants
> > anyway for consumers to use.
> >
> >>> +               if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_5_0GB)
> >>> +                       return (PCIE_SPEED_5_0GT);
> >>> +               if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_8_0GB)
> >>> +                       return (PCIE_SPEED_8_0GT);
> >>> +               if (lnkcap2 & PCI_EXP_LNKCAP2_SLS_16_0GB)
> >>> +                       return (PCIE_SPEED_16_0GT);
> >>> +       } else {        /* pre-r3.0 */
> >>> +               lnkcap = pci_read_config(root, pos + PCIER_LINK_CAP, 4);
> >>> +               if (lnkcap & PCI_EXP_LNKCAP_SLS_2_5GB)
> >>> +                       return (PCIE_SPEED_2_5GT);
> >>> +               if (lnkcap & PCI_EXP_LNKCAP_SLS_5_0GB)
> >>> +                       return (PCIE_SPEED_5_0GT);
> >>> +               if (lnkcap & PCI_EXP_LNKCAP_SLS_8_0GB)
> >>> +                       return (PCIE_SPEED_8_0GT);
> >>> +               if (lnkcap & PCI_EXP_LNKCAP_SLS_16_0GB)
> >>> +                       return (PCIE_SPEED_16_0GT);
> >>> +       }
> >>> +       return (PCI_SPEED_UNKNOWN);
> >>> +}
> >>> +
> >>> +static inline enum pcie_link_width
> >>> +pcie_get_width_cap(struct pci_dev *dev)
> >>> +{
> >>> +       uint32_t lnkcap;
> >>> +
> >>> +       pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap);
> >> Better spelled as PCIER_LINK_CAP.
> >>
> >>> +       if (lnkcap)
> >>> +               return ((lnkcap & PCI_EXP_LNKCAP_MLW) >> 4);
> >> And PCIEM_LINK_CAP_MAX_WIDTH.
> > Given that this is using a linuxkpi API (pcie_capability_read_dword
> > instead of pcie_read_config()) then the rest of this function is
> > written in the Linux KPI, so I think using Linux' native constants
> > is actually more consistent for this function.
> 
> Hi
> 
> Thanks for the feedback.
> 
> This code used to exist in the drm driver but was moved into kernel pci
> code on Linux. It's more or less as it existed in the drm driver with
> FreeBSD patches having roots probably going back to drm2 from base. If
No, I assure you that it did not appeared in drm2.

> this function won't be compatible with anything but drm drivers, we
> could just as well move it to linuxkpi in ports instead.
It only works for GPUs which are either pseudo-PCIe devices, i.e.
they represent itself as PCIe-attached devices for configuration purposes
but really attached to the CPU ring directly.  This is true for CPU-
integrated graphics.  Or for lucky circumstances where external
GPU is plugged into CPU PCIe slot without any PCIe bridges.

If PCIe GPU is connected through a switch (which happen quite often
these days), or into the slot managed by south bridge (I believe this
is theoretically possible) then the code would not work.

Perhaps you want dmar_get_requester() which does exactly what you
(seems) try to get for PCIe devices attached without non-PCIe bridges.

> 
> I'll try to implement the suggested improvements. Keep on eye out for
> updates here https://reviews.freebsd.org/D19565
> 
> /Johannes
> 
> 
> 
> 



Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?20190314114529.GE2492>