From owner-freebsd-new-bus@FreeBSD.ORG Tue Feb 23 07:19:04 2010
From: Rajat Jain
Cc: freebsd-ia32@freebsd.org, freebsd-ppc@freebsd.org
Date: Tue, 23 Feb 2010 12:46:40 +0530
Subject: Strategy for PCI resource management (for supporting hot-plug)
Message-ID: <8506939B503B404A84BBB12293FC45F606B88C39@emailbng3.jnpr.net>

Hi,

I'm trying to add PCI-E hotplug support to
FreeBSD. As a first step for the PCI-E hotplug support, I'm trying to
decide on a resource management / allocation strategy for the PCI
memory / IO and the bus numbers. Can you please comment on the following
approach that I am considering for resource allocation:

PROBLEM STATEMENT:
------------------
Given a memory range [A->B], IO range [C->D], and limited (256) bus
numbers, enumerate the PCI tree of a system, leaving enough "holes" in
between to allow addition of future devices.

PROPOSED STRATEGY:
------------------
1) When booting, start enumerating in a depth-first-search order. While
enumerating, always keep track of:

 * The next bus number (x) that can be allocated.

 * The next memory space pointer (A + y) from which allocation can be
   done ("y" is the memory already allocated).

 * The next IO space pointer (C + z) from which allocation can be done
   ("z" is the IO space already allocated).

Keep incrementing the above as the resources are allocated.

2) Allocate bus numbers sequentially while traversing down from root to
a leaf node (end point). When traversing down through a bridge:

 * Allocate the next available bus number (x) to the secondary bus of
   the bridge.

 * Temporarily set the bridge's subordinate bus number to 0xFF (to
   allow discovery of the maximum number of buses).

 * Temporarily assign all the remaining available memory space to the
   bridge, [(A+y) -> B]. Ditto for IO space.

3) When a leaf node (end point) is reached, allocate the memory / IO
resources requested by the device, and increment the pointers.

4) While passing a bridge in the upward direction, tweak the bridge
registers such that its resources are ONLY ENOUGH to address the needs
of the whole PCI tree below it, and if it has its own internal memory
mapped registers, some memory for it as well.

The above is the standard depth-first algorithm for resource allocation.
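The four steps above can be sketched roughly as follows. This is only an illustration in Python, not FreeBSD code: the `Node` and `Cursor` classes and `enumerate_tree()` are hypothetical names, a bridge is modelled by three fields instead of real configuration registers, and window alignment and the bridge's own memory-mapped registers are ignored.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical PCI tree node; a bridge carries children."""
    name: str
    mem_size: int = 0                      # memory requested by an end point
    io_size: int = 0                       # IO space requested by an end point
    is_bridge: bool = False
    children: List["Node"] = field(default_factory=list)
    secondary: Optional[int] = None        # bridges only
    subordinate: Optional[int] = None      # bridges only
    mem: Optional[Tuple[int, int]] = None  # assigned range, [start, end)
    io: Optional[Tuple[int, int]] = None   # assigned range, [start, end)


class Cursor:
    """The running values from step 1: x, A + y and C + z."""

    def __init__(self, mem_start, mem_end, io_start, io_end):
        self.next_bus = 1                  # assumption: bus 0 is the root bus
        self.mem, self.mem_end = mem_start, mem_end
        self.io, self.io_end = io_start, io_end

    def take(self, mem_size, io_size):
        """Hand out the next ranges and advance both pointers."""
        if (self.mem + mem_size > self.mem_end
                or self.io + io_size > self.io_end):
            raise RuntimeError("out of memory or IO space")
        ranges = ((self.mem, self.mem + mem_size),
                  (self.io, self.io + io_size))
        self.mem += mem_size
        self.io += io_size
        return ranges


def enumerate_tree(node, cur):
    if not node.is_bridge:
        # Step 3: an end point gets exactly what it asks for.
        node.mem, node.io = cur.take(node.mem_size, node.io_size)
        return
    # Step 2: going down through a bridge.
    if cur.next_bus > 0xFF:
        raise RuntimeError("out of bus numbers")
    node.secondary = cur.next_bus
    cur.next_bus += 1
    node.subordinate = 0xFF                # temporary, until the subtree is known
    mem_base, io_base = cur.mem, cur.io
    node.mem = (mem_base, cur.mem_end)     # temporary: everything that is left
    node.io = (io_base, cur.io_end)
    for child in node.children:
        enumerate_tree(child, cur)
    # Step 4: coming back up, shrink the bridge to what was really used.
    node.subordinate = cur.next_bus - 1
    node.mem = (mem_base, cur.mem)
    node.io = (io_base, cur.io)
```

Calling `enumerate_tree(root_bridge, Cursor(A, B, C, D))` on a small tree leaves each bridge with a window that exactly covers its subtree, which is the "ONLY ENOUGH" property of step 4.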
Here is the addition to support hot-plug:

At each bridge that supports hot-plug, in addition to the resources that
would normally have been allocated to this bridge, pre-allocate and
assign to the bridge (in anticipation of any new devices that may be
added later):

a) "RSRVE_NUM_BUS" number of busses, to cater to any bridges or PCI
   trees present on the plugged-in device.

b) "RSRVE_MEM" amount of memory space, to cater to all the PCI devices
   that may be attached later on.

c) "RSRVE_IO" amount of IO space, to cater to all PCI devices that may
   be attached later on.

Please note that the above RSRVE* are constants defining the amount of
resources to be set aside for / below each HOT-PLUGGABLE bridge; their
values may be tweaked via a compile time option or via a sysctl.

FEW COMMENTS
------------

1) The strategy is fairly generic and tweakable, and it need not waste
a lot of resources (the developer needs to pick a sensible value for how
many resources to reserve at each hot-pluggable slot):

 * The reservations shall be done only for hot-pluggable bridges.

 * The developer can tweak the values (or even disable the reservation)
   for how many resources are allocated for each hot-pluggable bridge.

2) One point of debate is what happens if there are too many resource
demands in the system (too many devices, or the developer configures too
many resources to be reserved for each hot-pluggable bridge). For
example, consider that during enumeration we find that all the resources
are already allocated, while there are more devices that need resources.
Do we simply not enumerate them? Etc...

Overall, how does the above look?
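To make the hot-plug addition concrete, here is a minimal sketch in Python of how the size of a bridge's allocation could be decided once its subtree has been enumerated. The `Resources` class and `bridge_allocation()` are hypothetical names, the RSRVE values are placeholders, and raising an error on exhaustion is just one possible answer to the open question in comment 2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    buses: int
    mem: int
    io: int

    def __add__(self, other):
        return Resources(self.buses + other.buses,
                         self.mem + other.mem,
                         self.io + other.io)

    def fits_in(self, other):
        return (self.buses <= other.buses
                and self.mem <= other.mem
                and self.io <= other.io)


# Placeholder values only; the proposal leaves the real ones to a
# compile-time option or a sysctl.
RSRVE = Resources(buses=8, mem=32 << 20, io=4 << 10)


def bridge_allocation(needed, hotplug_capable, available, reserve=RSRVE):
    """Return what a bridge is given: its subtree's needs, plus the
    reserve if (and only if) the bridge is hot-plug capable."""
    wanted = needed + reserve if hotplug_capable else needed
    if not wanted.fits_in(available):
        # Assumption: fail loudly. Other policies (shrinking the reserve,
        # skipping the device) are equally possible.
        raise RuntimeError("not enough resources left for this bridge")
    return wanted
```

A bridge that is not hot-plug capable gets back exactly `needed`, so the reservation costs nothing on systems without hot-plug slots.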
Thanks & Best Regards,

Rajat Jain

From owner-freebsd-new-bus@FreeBSD.ORG Tue Feb 23 08:21:14 2010
From: Warner Losh
To: rajatjain@juniper.net
Cc: freebsd-ia32@FreeBSD.org, freebsd-new-bus@FreeBSD.org, freebsd-ppc@FreeBSD.org, freebsd-arch@FreeBSD.org
Date: Tue, 23 Feb 2010 01:16:30 -0700 (MST)
Subject: Re: Strategy for PCI resource management (for supporting hot-plug)
Message-Id: <20100223.011630.74715282.imp@bsdimp.com>
In-Reply-To: <8506939B503B404A84BBB12293FC45F606B88C39@emailbng3.jnpr.net>

From: Rajat Jain
Subject: Strategy for PCI resource management (for supporting hot-plug)
Date: Tue, 23 Feb 2010 12:46:40 +0530

> Hi,
>
> I'm trying to add PCI-E hotplug support to FreeBSD. As a first step
> for the PCI-E hotplug support, I'm trying to decide on a resource
> management / allocation strategy for the PCI memory / IO and the bus
> numbers.
> Can you please comment on the following approach that I am
> considering for resource allocation:
>
> PROBLEM STATEMENT:
> ------------------
> Given a memory range [A->B], IO range [C->D], and limited (256) bus
> numbers, enumerate the PCI tree of a system, leaving enough "holes" in
> between to allow addition of future devices.
>
> PROPOSED STRATEGY:
> ------------------
> 1) When booting, start enumerating in a depth-first-search order. While
> enumerating, always keep track of:
>
>  * The next bus number (x) that can be allocated.
>
>  * The next memory space pointer (A + y) from which allocation can be
>    done ("y" is the memory already allocated).
>
>  * The next IO space pointer (C + z) from which allocation can be done
>    ("z" is the IO space already allocated).
>
> Keep incrementing the above as the resources are allocated.

IO space and memory space are bus addresses, which may have a mapping
to another domain.

> 2) Allocate bus numbers sequentially while traversing down from root to
> a leaf node (end point). When traversing down through a bridge:
>
>  * Allocate the next available bus number (x) to the secondary bus of
>    the bridge.
>
>  * Temporarily set the bridge's subordinate bus number to 0xFF (to
>    allow discovery of the maximum number of buses).
>
>  * Temporarily assign all the remaining available memory space to the
>    bridge, [(A+y) -> B]. Ditto for IO space.

I'm sure this is wise.

> 3) When a leaf node (end point) is reached, allocate the memory / IO
> resources requested by the device, and increment the pointers.

Keep in mind that devices may not have drivers allocated to them at bus
enumeration time. With hot-plug devices, you might not even know all
the devices that are there or could be there.

> 4) While passing a bridge in the upward direction, tweak the bridge
> registers such that its resources are ONLY ENOUGH to address the needs
> of the whole PCI tree below it, and if it has its own internal memory
> mapped registers, some memory for it as well.

How does one deal with adding a device that has a bridge on it? I
think that the "only enough" part is likely going to lead to problems,
as you'll need to move other resources if a new device arrives here.

> The above is the standard depth-first algorithm for resource allocation.
> Here is the addition to support hot-plug:

The above won't quite work for cardbus :) But that's a hot-plug
device...

> At each bridge that supports hot-plug, in addition to the resources that
> would normally have been allocated to this bridge, pre-allocate and
> assign to the bridge (in anticipation of any new devices that may be
> added later):

In addition, or total? If it were total, you could more easily
allocate memory or IO space ranges in a more deterministic way when you
have to deal with booting with or without a device that's present.

> a) "RSRVE_NUM_BUS" number of busses, to cater to any bridges or PCI
>    trees present on the plugged-in device.

This one might make sense, but if we have multiple levels then you'll
run out. If you have 4 additional bridges, you can't allocate X
additional busses to each of them: if you reserve X at the root, then
you can only reserve (X-4)/4 at each level.

> b) "RSRVE_MEM" amount of memory space, to cater to all the PCI devices
>    that may be attached later on.
>
> c) "RSRVE_IO" amount of IO space, to cater to all PCI devices that may
>    be attached later on.

Similar comments apply here.

> Please note that the above RSRVE* are constants defining the amount of
> resources to be set aside for / below each HOT-PLUGGABLE bridge; their
> values may be tweaked via a compile time option or via a sysctl.
>
> FEW COMMENTS
> ------------
>
> 1) The strategy is fairly generic and tweakable, and it need not waste
> a lot of resources (the developer needs to pick a sensible value for how
> many resources to reserve at each hot-pluggable slot):
>
>  * The reservations shall be done only for hot-pluggable bridges.
>
>  * The developer can tweak the values (or even disable the reservation)
>    for how many resources are allocated for each hot-pluggable bridge.

I'd like to understand the details of this better, especially when you
have multiple layers where devices that have bridges are hot-plugged
into the system. For example, there's a cardbus to PCI bridge, which
has 3 PCI slots behind it. These slots may have, say, a quad ethernet
card which has a PCI bridge to allow the 4 PCI NICs behind it. Now,
while this example may be dated, newer PCI-E also allows for it...

> 2) One point of debate is what happens if there are too many resource
> demands in the system (too many devices, or the developer configures too
> many resources to be reserved for each hot-pluggable bridge). For
> example, consider that during enumeration we find that all the resources
> are already allocated, while there are more devices that need resources.
> Do we simply not enumerate them? Etc...

How is this different than normal resource failure? And how will you
know at initial enumeration what devices will be plugged in?

> Overall, how does the above look?

In general, it looks fairly good.
I'm just worried about the multiple layer case :)

Warner

> Thanks & Best Regards,
>
> Rajat Jain

From owner-freebsd-new-bus@FreeBSD.ORG Tue Feb 23 14:02:44 2010
From: Attilio Rao
To: Rajat Jain
Cc: freebsd-ia32@freebsd.org, freebsd-new-bus@freebsd.org, freebsd-ppc@freebsd.org, freebsd-arch@freebsd.org
Date: Tue, 23 Feb 2010 15:02:38 +0100
Subject: Re: Strategy for PCI resource management (for supporting hot-plug)
Message-ID: <3bbf2fe11002230602t28701370l2712f836ebaee03@mail.gmail.com>
In-Reply-To: <8506939B503B404A84BBB12293FC45F606B88C39@emailbng3.jnpr.net>

2010/2/23 Rajat Jain:
>
> Hi,
>
> I'm trying to add PCI-E hotplug support to FreeBSD. As a first step
> for the PCI-E hotplug support, I'm trying to decide on a resource
> management / allocation strategy for the PCI memory / IO and the bus
> numbers. Can you please comment on the following approach that I am
> considering for resource allocation:

You may also coordinate with jhb@, who is working on a multipass layer
for improving resource mapping/allocation.

Thanks,
Attilio

-- 
Peace can only be achieved by understanding - A. Einstein

From owner-freebsd-new-bus@FreeBSD.ORG Tue Feb 23 17:04:34 2010
From: John Baldwin
To: freebsd-arch@freebsd.org
Cc: freebsd-ia32@freebsd.org, freebsd-new-bus@freebsd.org, freebsd-ppc@freebsd.org
Date: Tue, 23 Feb 2010 10:27:20 -0500
Subject: Re: Strategy for PCI resource management (for supporting hot-plug)
Message-Id: <201002231027.20749.jhb@freebsd.org>
In-Reply-To: <8506939B503B404A84BBB12293FC45F606B88C39@emailbng3.jnpr.net>

On Tuesday 23 February 2010 2:16:40 am Rajat Jain wrote:
> Overall, how does the above look?

I think one wrinkle is that we should try to preserve the resources that
the firmware has set for devices, at least on x86. I had also wanted to
make use of multipass for this, but that requires a bit more work to
split the PCI bus attach up into separate steps.

-- 
John Baldwin

From owner-freebsd-new-bus@FreeBSD.ORG Wed Feb 24 01:36:31 2010
From: Sergey Babkin
To: jhb@freebsd.org
Cc: freebsd-ia32@freebsd.org, freebsd-new-bus@freebsd.org, freebsd-ppc@freebsd.org, freebsd-arch@freebsd.org
Date: Tue, 23 Feb 2010 16:35:55 -0600 (CST)
Subject: Re: Re: Strategy for PCI resource management (for supporting hot-plug)
Message-id: <24099271.347187.1266964555857.JavaMail.root@vms061.mailsrvcs.net>

(Sorry,
if the email comes out looking weird, I want to give another try to see
if the provider has fixed the formatting issues in the web interface or
not).

On Tuesday 23 February 2010 2:16:40 am Rajat Jain wrote:
> At each bridge that supports hot-plug, in addition to the resources that
> would normally have been allocated to this bridge, pre-allocate and
> assign to the bridge (in anticipation of any new devices that may be
> added later):
>
> a) "RSRVE_NUM_BUS" number of busses, to cater to any bridges or PCI
>    trees present on the plugged-in device.
>
> b) "RSRVE_MEM" amount of memory space, to cater to all the PCI devices
>    that may be attached later on.
>
> c) "RSRVE_IO" amount of IO space, to cater to all PCI devices that may
>    be attached later on.

A kind of stupid question: should the reserve amounts depend on the
level of the bridge? Perhaps the bridges closer to the root should get
more reserves. Perhaps it doesn't matter so much during the initial
enumeration but may matter later, after a hot plug.

Suppose we have the bridge B1 that gets RSRVE resources attached to it
during the initial enumeration. Then someone comes and hot-plugs a
bridge B2 under B1. B2 then, I guess, will also try to get a reserve of
RSRVE resources for itself, so it would take the whole original reserve
of B1 for itself. If someone comes later and tries to hot-plug another
bridge B3 under B1, that bridge would not get any resources and the
plugging would fail.
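The B1/B2/B3 scenario can be played through with a small Python sketch. Everything here is illustrative: the `Bridge` class, the RSRVE_MEM value and both reserve policies are assumptions, and the "halving" policy is only one hypothetical way of making the reserve depend on the level of the bridge.

```python
RSRVE_MEM = 32 << 20   # placeholder value


class Bridge:
    def __init__(self, name, window):
        self.name = name
        self.free = window       # memory behind this bridge not yet handed out
        self.children = []

    def hot_plug(self, name, wanted):
        """Try to carve `wanted` bytes out of this bridge's free space
        for a newly plugged bridge; return None if that fails."""
        if wanted > self.free:
            return None
        self.free -= wanted
        child = Bridge(name, wanted)
        self.children.append(child)
        return child


def flat_reserve(depth):
    # Every hot-pluggable bridge asks for the same reserve.
    return RSRVE_MEM


def halving_reserve(depth):
    # Hypothetical level-dependent policy: half as much per level down.
    return RSRVE_MEM >> depth


def scenario(reserve):
    """Return (B2, B3) after plugging both under a fresh B1."""
    b1 = Bridge("B1", reserve(0))
    b2 = b1.hot_plug("B2", reserve(1))
    b3 = b1.hot_plug("B3", reserve(1))
    return b2, b3
```

With `scenario(flat_reserve)` B2 swallows all of B1's reserve and B3 comes back as `None`; with `scenario(halving_reserve)` both fit, at the price of each getting a smaller reserve.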

-SB

From owner-freebsd-new-bus@FreeBSD.ORG Wed Feb 24 04:35:19 2010
From: Rajat Jain
To: Sergey Babkin
Cc: freebsd-ia32@freebsd.org, freebsd-new-bus@freebsd.org, freebsd-ppc@freebsd.org, freebsd-arch@freebsd.org
Date: Wed, 24 Feb 2010 10:02:35 +0530
Subject: RE: Re: Strategy for PCI resource management (for supporting hot-plug)
Message-ID: <8506939B503B404A84BBB12293FC45F606B88E55@emailbng3.jnpr.net>
In-Reply-To: <24099271.347187.1266964555857.JavaMail.root@vms061.mailsrvcs.net>
X-Mailman-Version: 2.1.5 Precedence: list List-Id: FreeBSD's new-bus architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 24 Feb 2010 04:35:19 -0000 Hello Sergey, > A kind of stupid question: should the reserve amounts depend on the level > of the bridge? > Perhaps the priidges closer to the root should get more reserves. Perhaps > it doesn't > matter so much durin gthe initial enumeration but ma matter latter after a > hot plug. One clarification perhaps I did not give in my proposal. We would reserve resources (bus numbers / memory / IO) only for bridges that are CAPABLE of HOT-PLUG. The rest of the bridges would get their usual share of resources.=20 Now, the same amount of reserved resources gets assigned to each HOT-PLUG capable bridge, irrespective of at which level it is in hierarchy. This is because no matter where it is, there is equal probability of a new card being plugged in at ANY of those slots.=20 The only problem as you say is when we plug in a PCI card, which has a HOT-PLUGGABLE SLOT on it (on which we can plug in more cards). (This is because a bridge wants extra reserved resources only when it is hot-plug capable). Do such devices exist? Since theoretically possible, but practically extremely rare, I say we do not support this case. Comments? Thanks, Rajat Jain > -----Original Message----- > From: Sergey Babkin [mailto:babkin@verizon.net] > Sent: Wednesday, February 24, 2010 4:06 AM > To: jhb@freebsd.org > Cc: freebsd-arch@freebsd.org; Rajat Jain; freebsd-ia32@freebsd.org; > freebsd-new-bus@freebsd.org; freebsd-ppc@freebsd.org > Subject: Re: Re: Strategy for PCI resource management (for supporting hot- > plug) >=20 > (Sorry, if the email comes out looking weird, I want to give another try > to see if the provider has > fixed the formatting issues i nthe web interface or not). 
>=20 > On Tuesday 23 February 2010 2:16:40 am Rajat Jain wrote: > > > > Hi, > > > > I'm trying to add PCI-E hotplug support to the FreeBSD. As a first step > > for the PCI-E hotplug support, I'm trying to decide on a resource > > management / allocation strategy for the PCI memory / IO and the bus > > numbers. Can you please comment on the following approach that I am > > considering for resource allocation: > > > > PROBLEM STATEMENT: > > ------------------ > > Given a memory range [A->B], IO range [C->D], and limited (256) bus > > numbers, enumerate the PCI tree of a system, leaving enough "holes" in > > between to allow addition of future devices. > > > > PROPOSED STRATEGY: > > ------------------ > > 1) When booting, start enumerating in a depth-first-search order. While > > enumeration, always keep track of: > > > > * The next bus number (x) that can be allocated > > > > * The next Memory space pointer (A + y) starting which allocation can > > be > > done. ("y" is the memory already allocated). > > > > * The next IO Space pointer (C + z) starting which allocation can be > > done. > > ("z" is the IO space already allocated). > > > > Keep incrementing the above as the resources are allocated. > > > > 2) Allocate bus numbers sequentially while traversing down from root to > > a leaf node (end point). When going down traversing a bridge: > > > > * Allocate the next available bus number (x) to the secondary bus of > > bridge. > > > > * Temporarily mark the subordinate bridge as 0xFF (to allow discovery > > of > > maximum buses). > > > > * Temporarily assign all the remaining available memory space to bridge > > > > [(A+x) -> B]. Ditto for IO space. > > > > 3) When a leaf node (End point) is reached, allocate the memory / IO > > resource requested by the device, and increment the pointers. 
> >
> > 4) While passing a bridge in the upward direction, tweak the bridge
> > registers such that its resources are ONLY ENOUGH to address the needs
> > of all the PCI tree below it, and if it has its own internal memory
> > mapped registers, some memory for it as well.
> >
> > The above is the standard depth-first algorithm for resource allocation.
> > Here is the addition to support hot-plug:
> >
> > At each bridge that supports hot-plug, in addition to the resources that
> > would have normally been allocated to this bridge, additionally
> > pre-allocate and assign to the bridge (in anticipation of any new devices
> > that may be added later):
> >
> > a) "RSRVE_NUM_BUS" number of busses, to cater to any bridges or PCI
> >    trees present on the device plugged in.
> >
> > b) "RSRVE_MEM" amount of memory space, to cater to all the PCI devices
> >    that may be attached later on.
> >
> > c) "RESRVE_IO" amount of IO space, to cater to all PCI devices that may
> >    be attached later on.
>
> A kind of stupid question: should the reserve amounts depend on the level
> of the bridge? Perhaps the bridges closer to the root should get more
> reserves. Perhaps it doesn't matter so much during the initial enumeration
> but may matter later after a hot plug.
>
> Suppose we have the bridge B1 that gets RSRVE resources attached to it
> during the initial enumeration. Then someone comes and hot-plugs a bridge
> B2 under B1. B2 then, I guess, will also try to get a reserve of RSRVE
> resources for itself, so it would take the whole original reserve of B1
> to itself. If someone comes later and tries to hot-plug another bridge B3
> under B1, that bridge would not get any resources and the plugging would
> fail.
>
> -SB

From owner-freebsd-new-bus@FreeBSD.ORG Thu Feb 25 01:16:43 2010
Date: Wed, 24 Feb 2010 16:58:41 -0800
From: John-Mark Gurney
To: Rajat Jain
Cc: freebsd-ia32@freebsd.org, freebsd-new-bus@freebsd.org, Sergey Babkin,
 freebsd-arch@freebsd.org, freebsd-ppc@freebsd.org
Subject: Re: Re: Strategy for PCI resource management (for supporting hot-plug)
Message-ID: <20100225005841.GB58753@funkthat.com>
References: <24099271.347187.1266964555857.JavaMail.root@vms061.mailsrvcs.net>
 <8506939B503B404A84BBB12293FC45F606B88E55@emailbng3.jnpr.net>
In-Reply-To: <8506939B503B404A84BBB12293FC45F606B88E55@emailbng3.jnpr.net>

Rajat Jain wrote this message on Wed, Feb 24, 2010 at 10:02 +0530:
> The only problem, as you say, is when we plug in a PCI card which has a
> HOT-PLUGGABLE SLOT on it (on which we can plug in more cards). (This is
> because a bridge wants extra reserved resources only when it is hot-plug
> capable). Do such devices exist? Since this is theoretically possible but
> practically extremely rare, I say we do not support this case.

There is an ExpressCard PCI-E expansion box (I have one) into which you
could put a PCI-E CardBus adapter card, which I believe would be the case
that you are asking about...

Now how many people would do that? Not many, but more might put in
multiport PCI-E cards.

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
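
For readers following the thread, the depth-first strategy quoted above,
together with the fixed per-bridge hot-plug reserve Rajat describes, can be
sketched roughly as follows. This is an illustrative Python model only, not
code from any poster or from FreeBSD: the Device and State classes, the
assign() function, and the reserve values are all assumptions (the thread
names RSRVE_NUM_BUS and RSRVE_MEM but gives no numbers). IO space would be
handled the same way as memory and is omitted, as are alignment rules and a
bridge's own memory-mapped registers.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical reserve sizes; the thread does not specify values.
RSRVE_NUM_BUS = 4
RSRVE_MEM = 0x100000


@dataclass
class Device:
    name: str
    is_bridge: bool = False
    hotplug: bool = False            # only meaningful for bridges
    mem_size: int = 0                # memory requested by an end point
    children: List["Device"] = field(default_factory=list)
    secondary: int = -1              # filled in for bridges
    subordinate: int = -1            # filled in for bridges
    mem_base: int = 0
    mem_end: int = 0                 # exclusive end of the assigned range


@dataclass
class State:
    next_bus: int                    # "x" in the proposal
    next_mem: int                    # "A + y" in the proposal
    mem_limit: int                   # "B" in the proposal


def _check(st: State) -> None:
    if st.next_bus > 0x100:
        raise RuntimeError("out of bus numbers")
    if st.next_mem > st.mem_limit:
        raise RuntimeError("out of memory space")


def assign(dev: Device, st: State) -> None:
    if not dev.is_bridge:
        # Step 3: end point, hand out what it asks for and advance.
        dev.mem_base = st.next_mem
        st.next_mem += dev.mem_size
        dev.mem_end = st.next_mem
        _check(st)
        return
    # Step 2: going down through a bridge.
    dev.secondary = st.next_bus
    st.next_bus += 1
    _check(st)
    dev.subordinate = 0xFF                                 # temporary
    dev.mem_base, dev.mem_end = st.next_mem, st.mem_limit  # temporary
    for child in dev.children:
        assign(child, st)
    # Hot-plug addition: set aside a fixed reserve behind this bridge.
    if dev.hotplug:
        st.next_bus += RSRVE_NUM_BUS
        st.next_mem += RSRVE_MEM
        _check(st)
    # Step 4: going back up, shrink the bridge to what was actually used.
    dev.subordinate = st.next_bus - 1
    dev.mem_end = st.next_mem


# Toy tree: one end point, an empty hot-plug bridge, and a plain bridge.
b1 = Device("B1", is_bridge=True, hotplug=True)
b2 = Device("B2", is_bridge=True, children=[Device("C", mem_size=0x2000)])
root = Device("root", is_bridge=True,
              children=[Device("A", mem_size=0x1000), b1, b2])
assign(root, State(next_bus=0, next_mem=0xE0000000, mem_limit=0xF0000000))
for d in (root, b1, b2):
    print(f"{d.name}: buses {d.secondary}..{d.subordinate}, "
          f"mem {d.mem_base:#x}..{d.mem_end:#x}")
```

In this toy tree the empty hot-plug bridge B1 ends up with RSRVE_NUM_BUS
spare bus numbers and RSRVE_MEM of memory space behind it, while the
non-hot-plug bridge B2 gets only what its end point needs.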