From owner-freebsd-current@FreeBSD.ORG  Tue Jul 24 17:51:18 2012
Return-Path: <owner-freebsd-current@FreeBSD.ORG>
Delivered-To: current@freebsd.org
Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52])
	by hub.freebsd.org (Postfix) with ESMTP id A9E151065673;
	Tue, 24 Jul 2012 17:51:18 +0000 (UTC)
	(envelope-from gpalmer@freebsd.org)
Received: from noop.in-addr.com (mail.in-addr.com [IPv6:2001:470:8:162::1])
	by mx1.freebsd.org (Postfix) with ESMTP id 700C88FC0A;
	Tue, 24 Jul 2012 17:51:18 +0000 (UTC)
Received: from gjp by noop.in-addr.com with local (Exim 4.77 (FreeBSD))
	(envelope-from <gpalmer@freebsd.org>)
	id 1StjGG-0002bg-Tr; Tue, 24 Jul 2012 13:51:08 -0400
Date: Tue, 24 Jul 2012 13:51:08 -0400
From: Gary Palmer <gpalmer@freebsd.org>
To: Julian Elischer <julian@freebsd.org>
Message-ID: <20120724175108.GC19321@in-addr.com>
References: <500A0E24.80101@freebsd.org>
	<EABF0570-55F1-4758-B0FF-62561FFAC4EF@samsco.org>
	<20120722231234.6f748d05@kan.dyndns.org>
	<F1592617-FBD9-4D2A-80DA-BC8CF5D96F87@bsdimp.com>
	<500D010A.5080808@freebsd.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <500D010A.5080808@freebsd.org>
X-SA-Exim-Connect-IP: <locally generated>
X-SA-Exim-Mail-From: gpalmer@freebsd.org
X-SA-Exim-Scanned: No (on noop.in-addr.com); SAEximRunCond expanded to false
Cc: FreeBSD Current <current@freebsd.org>, Warner Losh <imp@bsdimp.com>
Subject: Re: PCIe hotplug
X-BeenThere: freebsd-current@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussions about the use of FreeBSD-current
	<freebsd-current.freebsd.org>
X-List-Received-Date: Tue, 24 Jul 2012 17:51:18 -0000

On Mon, Jul 23, 2012 at 12:45:14AM -0700, Julian Elischer wrote:
> On 7/22/12 9:11 PM, Warner Losh wrote:
> >On Jul 22, 2012, at 9:12 PM, Alexander Kabaev wrote:
> >
> >>On Sun, 22 Jul 2012 20:22:33 -0600
> >>Scott Long <scottl@samsco.org> wrote:
> >>
> >>>On Jul 20, 2012, at 8:04 PM, Julian Elischer wrote:
> >>>
> >>>>Is anyone looking at PCIe hotplug support?
> >>>>
> >>>>I'm especially interested if anyone has a strategy for device
> >>>>re-insertion and reassociating the reinserted device with its old
> >>>>device_t so that it gets the same unit number.. (assumes access to
> >>>>a serial number or similar) Even if it is put back into a different
> >>>>slot.
> >>>>
> >>>Would the PCI system be responsible for figuring out this serial
> >>>number?  I don't think that it can, but it's a question to answer, I
> >>>guess.  If it can't then it's up to the driver to generate a unique
> >>>cookie that would be stored by the PCI subsystem.  This cookie would
> >>>have to be based off of data that can be retrieved from the PCI
> >>>config space and/or VPD space, since anything more would require
> >>>resource allocation, which is only allowed in the DEV_ATTACH phase,
> >>>and once you've hit that phase you've already pretty much sealed the
> >>>deal on unit number assignment.
> >>>
> >>>So what would probably happen is that the PCI layer provides a ring
> >>>buffer of cookie storage and a set of accessors for the drivers.  The
> >>>cookies would map to a key-value pair with the device unit name and
> >>>number.  During probe, a driver can look at PCI config space and
> >>>generate a cookie.  That cookie can then be communicated up to the
> >>>PCI layer for storage.  Maybe the driver calls a match routine that
> >>>returns a unit number on match and a store on failure, then the
> >>>driver calls a set_unit_number accessor.  Only the driver that wins
> >>>the bid would win the unit number reassignment or cookie storage.  Or
> >>>maybe the driver passes the cookie up as part of its return code, and
> >>>the match and unit assignment happens automatically.  Drivers that
> >>>don't want to participate in this simply wouldn't, and everything
> >>>would continue to operate the same way.  The two sticky parts are
> >>>rogue/buggy drivers that abuse the api and cause a flood of cookies
> >>>to be generated, and questions on when a unit number is eligible for
> >>>reuse.  For the first one, a ring buffer of cookies would solve the
> >>>immediate problem, but you might still have some risk of drivers
> >>>selectively wrapping the buffer for whatever accidental or evil
> >>>purpose.  For the second problem, maybe a unit number stays
> >>>persistent only if the PCIe hot remove mechanism requests it, and
> >>>then only until the ring-buffer wraps.
> >>>
> >>>Scott
> >>>
> >>I do not think the whole problem as depicted by Julian is even worth
> >>solving. Why keep any data for the device that might _never_ come
> >>back? What if the device hierarchy just starts from the PCI-e and
> >>extends upwards and user still holds on to some vestiges of a previous
> >>device chain (say, by keeping a character control device sharing the
> >>same unit number open, common practice)? Reusing unit number is much
> >>trickier then, and might not be even possible. So, before one jumps
> >>into 'how', can we agree on 'why' first? When device goes away, it is
> >>not just this device's device_t that is disappearing, it is a whole
> >>tree rooted at that device. I see no point in trying to reconstruct
> >>that.
> >There's a reason that PC Card and CardBus never supported this at all.  
> >The assumption was that reconnecting devices is so cheap that it isn't 
> >worth the bother.  This is true for all but some specialized devices 
> >today: network information is easy to reconstruct, storage drives are easy 
> >to reconfigure (since we already fail all in-flight transactions when the 
> >device goes away), etc.  I can see some advantage to having storage cope, 
> >but there are already geom classes that can help people cope when drives can 
> >go away.
> >
> >>PCI-e hotplug proper is very much orthogonal to the question of unit
> >>numbering and IS worth supporting.
> >Yes.  totally agreed.
> 
> I'm not saying that it's vitally important but was wondering if people 
> had a strategy for it..
> i.e. is it a question worth worrying about?
> 
> In a separate forum Warner and I (yeah I know I'm answering Warner, 
> but I'm addressing the others) discussed the feasibility  of surviving 
> an "oops pulled the wrong card" event with regards to a particular 
> flash memory card. I was just carrying that forwards as a thought 
> experiment (There is actually a strategy that sounds feasible).
> 
> The problem of getting a serial number out of the BAR space during 
> probe is also possibly solvable in our case but the question of how 
> long to remember a device is legitimate, and my answer would be that
> 1/ a particular driver would be able to specify whether it could 
> handle this, and
> 2/ it might be limited to some pragmatic number such as 16 or 32, or a 
> time limit.
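
For reference, the bounded cookie store discussed above (a ring buffer
mapping a driver-generated cookie to a unit number, capped at a pragmatic
size such as 16) could look roughly like the sketch below.  This is plain
userland C, not anything that exists in the PCI code; the names
cookie_ring, cookie_lookup, cookie_store, COOKIE_SLOTS and COOKIE_LEN are
all made up for illustration, and locking is ignored.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define COOKIE_SLOTS 16	/* hypothetical bound ("16 or 32") */
#define COOKIE_LEN   32	/* hypothetical maximum cookie size, in bytes */

struct cookie_entry {
	uint8_t	cookie[COOKIE_LEN];
	size_t	len;		/* 0 means the slot is empty */
	int	unit;		/* unit number remembered for this cookie */
};

struct cookie_ring {
	struct cookie_entry slots[COOKIE_SLOTS];
	unsigned next;		/* slot that the next new cookie overwrites */
};

/* Find the slot holding this cookie, or NULL if it is not remembered. */
static struct cookie_entry *
cookie_find(struct cookie_ring *r, const void *cookie, size_t len)
{
	unsigned i;

	if (len == 0 || len > COOKIE_LEN)
		return (NULL);
	for (i = 0; i < COOKIE_SLOTS; i++) {
		struct cookie_entry *e = &r->slots[i];

		if (e->len == len && memcmp(e->cookie, cookie, len) == 0)
			return (e);
	}
	return (NULL);
}

/* Return the remembered unit number, or -1 if the cookie is unknown. */
static int
cookie_lookup(struct cookie_ring *r, const void *cookie, size_t len)
{
	struct cookie_entry *e = cookie_find(r, cookie, len);

	return (e != NULL ? e->unit : -1);
}

/*
 * Remember cookie -> unit.  A known cookie is updated in place; a new
 * one overwrites the oldest slot, so a wrap silently forgets old devices.
 * Returns 0 on success, -1 if the cookie is empty or too long.
 */
static int
cookie_store(struct cookie_ring *r, const void *cookie, size_t len, int unit)
{
	struct cookie_entry *e;

	if (len == 0 || len > COOKIE_LEN)
		return (-1);
	e = cookie_find(r, cookie, len);
	if (e == NULL) {
		e = &r->slots[r->next];
		r->next = (r->next + 1) % COOKIE_SLOTS;
		memcpy(e->cookie, cookie, len);
		e->len = len;
	}
	e->unit = unit;
	return (0);
}
```

A zero-filled struct cookie_ring is an empty store.  The wrap behaviour
is exactly the weakness Scott mentions: anything that stores
COOKIE_SLOTS new cookies pushes every older device out.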

Why not extend the geom_label idea further?  If there is a serial
number, can that be exposed via /dev somehow so that the problem is
moved out of the kernel space?  That way devd could say "this serial
number gets symlinked to this disk node" (for example).  
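
To make the userland half of that concrete, here is a rough sketch of
the kind of helper a devd action could run once it knows the serial
number and the device node.  How the serial number reaches userland is
the open question and is not shown; the function names and the idea of a
per-serial link directory are my own assumptions, not an existing
interface.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Build "<dir>/<serial>" in out.  Serial numbers that are empty, contain
 * a '/', or are "." or ".." are rejected so a hostile or garbled serial
 * cannot escape the link directory.  Returns 0 on success, -1 on error.
 */
static int
serial_link_path(char *out, size_t outlen, const char *dir, const char *serial)
{
	int n;

	if (serial[0] == '\0' || strchr(serial, '/') != NULL ||
	    strcmp(serial, ".") == 0 || strcmp(serial, "..") == 0)
		return (-1);
	n = snprintf(out, outlen, "%s/%s", dir, serial);
	if (n < 0 || (size_t)n >= outlen)
		return (-1);	/* path would not fit in the buffer */
	return (0);
}

/*
 * Point "<dir>/<serial>" at node, replacing any stale link left over
 * from the last time the device was present.  Returns 0 on success.
 */
static int
serial_link(const char *dir, const char *serial, const char *node)
{
	char path[1024];

	if (serial_link_path(path, sizeof(path), dir, serial) != 0)
		return (-1);
	if (unlink(path) != 0 && errno != ENOENT)
		return (-1);
	return (symlink(node, path));
}
```

With something like this, a card that is pulled and reinserted can come
back under a different unit number and consumers that open the
per-serial name still find it, which is the whole point of pushing the
mapping out of the kernel.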

Gary