From owner-freebsd-arch@FreeBSD.ORG Sun Aug 1 18:03:19 2004
Message-ID: <410D2FEA.5050504@samsco.org>
Date: Sun, 01 Aug 2004 12:01:14 -0600
From: Scott Long
To: arch@freebsd.org
Subject: PCI-Express support
List-Id: Discussion related to FreeBSD architecture

All,

I've emailed before about supporting various aspects of PCI-Express and especially MSI, but haven't really gotten too far with it due to lack of resources. I now have access to a system that can do PCI-Express (PCI-E), so I'd like to revisit it and see what can be added for 5-STABLE.
There are three general areas that need to be addressed in some form or another:

Enhanced Configuration Space:

PCI-E introduces an enhanced PCI configuration space that gives each function 4096 bytes of space instead of just 256. The Intel Lindenhurst chipset exposes this space via a memory-mapped window instead of the old, slow type 1/2 ioport configuration methods. It appears that if the northbridge supports the enhanced config space, then all PCI, PCI-X, and PCI-E devices will show up in it as well as in the legacy space.

Proper support likely entails splitting up the pci host-bridge drivers so that a given ACPI or legacy front-end can plug into a given enhanced or legacy configuration layer. This definitely is not going to happen in time for 5.3, though.

A hack that could work for 5-STABLE would be to provide pcie_[read|write]_config() methods that would complement the existing pci methods and be available for drivers that want to access configuration addresses above 255. Devices are already showing up that want to use these registers, btw. The mechanics of doing this would involve using pmap_mapdev() to map in the range that is specific to each function, and then hanging this information off of the pcicfg structure. It's a bit hackish, yes, but it does seem to work in tests that a colleague of mine has done.

MSI:

I've bandied about different suggestions for an API that will support this. The basic thing that a driver needs from this is to know exactly how many message interrupt vectors are available to it. It can't just register vectors and handlers blindly, since the purpose of MSI is to assign special meanings to each vector and allow the driver to handle each one specifically.
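To make the enhanced configuration space hack above a little more concrete, here is a rough userland sketch of the addressing and the accessors. It assumes the memory-mapped window follows the layout in the PCI Express specification, where each function's 4096-byte block sits at (bus << 20) | (slot << 15) | (function << 12) from the window base. The struct pcie_cfg_map type, the helper pcie_cfg_func_offset(), and the exact signatures of pcie_read_config() and pcie_write_config() are all hypothetical; only the pcie_[read|write]_config() name comes from the proposal. A plain byte buffer stands in for the range that pmap_mapdev() would map in a kernel.

```c
#include <assert.h>
#include <stdint.h>

#define PCIE_CFG_FUNC_SIZE 4096u	/* enhanced config space per function */

/*
 * Offset of one function's 4096-byte block inside the memory-mapped
 * window (layout assumed from the PCI Express specification).
 */
static inline uint32_t
pcie_cfg_func_offset(unsigned bus, unsigned slot, unsigned func)
{
	assert(bus < 256 && slot < 32 && func < 8);
	return (((uint32_t)bus << 20) | ((uint32_t)slot << 15) |
	    ((uint32_t)func << 12));
}

/*
 * Hypothetical per-function state, standing in for what would hang off
 * the pcicfg structure. In a kernel, va would come from
 * pmap_mapdev(window_base + offset, PCIE_CFG_FUNC_SIZE).
 */
struct pcie_cfg_map {
	volatile uint8_t *va;
};

/*
 * Read a 1, 2, or 4 byte little-endian register. A real driver would
 * issue a single access of the requested width; the byte loop only
 * keeps this userland sketch portable.
 */
static inline uint32_t
pcie_read_config(const struct pcie_cfg_map *m, unsigned reg, unsigned width)
{
	uint32_t val = 0;
	unsigned i;

	assert(width == 1 || width == 2 || width == 4);
	assert(reg % width == 0 && reg + width <= PCIE_CFG_FUNC_SIZE);
	for (i = 0; i < width; i++)
		val |= (uint32_t)m->va[reg + i] << (8 * i);
	return (val);
}

/* Write a 1, 2, or 4 byte little-endian register (same caveat as above). */
static inline void
pcie_write_config(struct pcie_cfg_map *m, unsigned reg, uint32_t val,
    unsigned width)
{
	unsigned i;

	assert(width == 1 || width == 2 || width == 4);
	assert(reg % width == 0 && reg + width <= PCIE_CFG_FUNC_SIZE);
	for (i = 0; i < width; i++)
		m->va[reg + i] = (uint8_t)(val >> (8 * i));
}
```

Mapping only the 4096 bytes that belong to a function, rather than the whole window, matches the per-function pmap_mapdev() approach described above and keeps the amount of mapped address space small.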
In order to keep the API as consistent as possible between classic interrupt sources and MSI sources, I'd like to add a new bus method:

    int bus_reserve_resource(device_t, int *start, int *end, int *count, int flags);

start, end, and count would be passed in as the desired range and would map to the per-function interrupt index in MSI. On return, the range supported and negotiated by the OS, bus, and function would be filled into these values. flags would be something like SYS_RES_MESSAGE. Internal failure of the function would be given in the return value. Whether failure to support MSI should be given as an error code return value can be debated. This function will also program the MSI configuration registers on the device to use the correct message cookie and number of messages.

Interrupt registration would then proceed as normal with paired calls to bus_alloc_resource() and bus_setup_intr() for each desired interrupt index. The individual function interrupt index would be used as the start and end parameters to bus_alloc_resource(), and the type parameter would be SYS_RES_MESSAGE instead of SYS_RES_IRQ. bus_setup_intr() would unmask the source in the MSI APIC just like normal.

Adding this for 5.3 is feasible, I think, and doesn't add a whole lot of risk. PCI-E provides a legacy mode for interrupts that simulates PCI interrupt lines, so drivers can choose whether to use MSI or the legacy interrupt methods.

Hot-Plug, lane status, lane bonding:

We don't have the infrastructure to support PCI or PCI-E hot-plug. It's also debatable whether this information will actually be available in a standard form. The PCI-E spec defines a new extended capabilities structure in the config space that can provide some of this information, but these kinds of things have a history of the vendors choosing their own proprietary methods and ignoring the standard.
In short, we can't deal with this in the short term at all, and likely not in the long term without significant work to the bus and device infrastructure.
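
As a postscript on the MSI proposal, the in/out negotiation that bus_reserve_resource() is meant to perform can be modeled in userland roughly as follows. This is only a sketch under stated assumptions: struct mock_dev, its fields, and mock_reserve_messages() are hypothetical stand-ins, the flags argument is left out, and nothing here touches real MSI registers. It assumes the grant is the smallest of what the driver asked for, what the function can generate, and what the OS can spare, rounded down to a power of two because MSI hands out messages in power-of-two blocks. Returning ENXIO when MSI is unsupported is just one choice for the open question noted above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a function's MSI capability and OS limits. */
struct mock_dev {
	int msi_capable;	/* function advertises an MSI capability */
	int msi_max;		/* messages the function can generate */
	int os_avail;		/* vectors the OS and bus can hand out */
};

/*
 * Model of the proposed negotiation: start, end, and count carry the
 * desired range in and the granted range out. Returns 0 on success or
 * an errno value on failure.
 */
static inline int
mock_reserve_messages(const struct mock_dev *dev, int *start, int *end,
    int *count)
{
	int want, limit, grant;

	if (dev == NULL || start == NULL || end == NULL || count == NULL)
		return (EINVAL);
	if (*start < 0 || *end < *start || *count < 1)
		return (EINVAL);
	if (!dev->msi_capable)
		return (ENXIO);	/* assumption: lack of MSI is an error */

	/* The request cannot exceed the index range that was passed in. */
	want = *count;
	if (want > *end - *start + 1)
		want = *end - *start + 1;

	/* Clamp to what the function and the OS can both support. */
	limit = dev->msi_max < dev->os_avail ? dev->msi_max : dev->os_avail;
	if (limit < 1)
		return (ENXIO);
	if (want > limit)
		want = limit;

	/* Round down to a power of two. */
	for (grant = 1; grant * 2 <= want; grant *= 2)
		;

	*count = grant;
	*end = *start + grant - 1;
	return (0);
}
```

A driver would then look at the returned count, decide what meaning to give each message, and go through the paired bus_alloc_resource() and bus_setup_intr() calls for each index in the granted range, or fall back to the legacy interrupt line if the reservation fails.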