From owner-freebsd-arch@FreeBSD.ORG Wed Sep 1 19:46:38 2004
Return-Path:
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 21D8116A4CE
	for ; Wed, 1 Sep 2004 19:46:38 +0000 (GMT)
Received: from athena.softcardsystems.com (mail.softcardsystems.com [12.34.136.114])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 95F0143D1F
	for ; Wed, 1 Sep 2004 19:46:37 +0000 (GMT)
	(envelope-from sah@softcardsystems.com)
Received: from athena (athena [12.34.136.114])i81KkSRE002447;
	Wed, 1 Sep 2004 15:46:31 -0500
Date: Wed, 1 Sep 2004 15:46:28 -0500 (EST)
From: Sam
X-X-Sender: sah@athena
To: Scott Long
In-Reply-To: <413617A4.1030202@samsco.org>
Message-ID:
References: <413617A4.1030202@samsco.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed
cc: freebsd-arch@freebsd.org
Subject: Re: disk_create and cdevsw_add
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 01 Sep 2004 19:46:38 -0000

On Wed, 1 Sep 2004, Scott Long wrote:

> Sam wrote:
>> 'lo again,
>>
>> kern/subr_disk.c:/^disk_create/ takes
>> two cdevsw types, and I only vaguely
>> understand why.  Can someone explain it to me?
>
> I'm not really clear on this myself, other than the first cdevsw
> contains your actual table, and the second one is a dummy that you
> allocate but don't actually touch.
>
>>
>> I'm generally confused about resolving
>> entry points into the driver.  Does a
>> block device only get an open() after
>> registering it with disk_create?
>
> Yes.  disk_create() is just a modified form of cdevsw_add(), and
> your cdevsw entry points are not accessible until that is called.
>
>> Supposing I want to set some ioctls for
>> an aoecontrol utility (show all devices
>> known, e.g.), what would aoecontrol open
>> to ioctl?
>
> You can either implement the ioctl handler in the same device as the
> AoE device, or you can create a separate control device with its own
> major and minor that represents all of the AoE devices, or you can do
> both.  You can also create a control device per AoE device, but that
> isn't terribly common these days and has implications when porting to
> 5.x and beyond.
>
> What kind of things will aoecontrol do?  If it will be creating and
> destroying AoE device instances, then you definitely want a separate
> control device.  You might want to look at my old 4.x RAIDFrame patches
> that do this.  They can be found at http://people.freebsd.org/~scottl/rf

Right now it would be useful to pull out the list
of devices the driver knows about.  In Linux I have
a char driver implementing a set of files, one of which
is stat:

% cat /dev/etherd/stat
/dev/etherd/e0.0	up
/dev/etherd/e0.1	up
...

So I'm thinking that here I'll set up an ioctl so aoecontrol
could spit out such information.

Another idea is to provide a way to restrict the interfaces
that AoE is allowed to run on.  Currently I broadcast on every
interface to find devices I can talk to; I can imagine
a sysadmin might find this undesirable.

The former goes away with 5.x because if it's known, it's in /dev.
The latter could be enforced by making the user recompile
the module, but that is a nightmare for non-coders.

A further ponderance:

Each AoE device has a major,minor address.  Let's call
them aoemajor/aoeminor to be clear.  I have a simple
association between unit and aoemajor/aoeminor using
a MAJPERMIN constant:

	unit = aoemajor * MAJPERMIN + aoeminor;

This permits me to create device nodes that abstract
the network.  As a Linux example, /dev/etherd/e0.0
is the AoE device with aoemajor=0, aoeminor=0.  In Coraid's
implementation, aoemajor is a shelf id and aoeminor is a slot id.
So this example uses the blade in shelf 0, slot 0.

The AoE protocol permits specifying the aoemajor,aoeminor
address in the frame.  It's possible to send out an Ethernet
broadcast specifying a particular aoemajor,aoeminor and have
only the blade with that aoemajor,aoeminor process it.

So: if I could get an open on a device, I could send out
a frame at open time to see whether the device is there.  Under
the current scheme I have to periodically send out
a broadcast (with aoemajor and aoeminor unspecified) to probe
the network.  In a setup with a large number of blades,
avoiding a periodic storm would be desirable.

Any thoughts on this?

Sam
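
For reference, the unit association described above can be sketched in
plain C as follows.  This is only an illustration, not the driver's code:
the value of MAJPERMIN is an assumption (the message does not state it),
and the helper names aoe_unit, unit_to_aoemajor and unit_to_aoeminor are
hypothetical.  The inverse mapping only holds while aoeminor stays
below MAJPERMIN.

```c
/* Assumed value for illustration; the message does not give the real one. */
#define MAJPERMIN 16

/* Forward mapping from the message: unit = aoemajor * MAJPERMIN + aoeminor. */
static int
aoe_unit(int aoemajor, int aoeminor)
{
	return (aoemajor * MAJPERMIN + aoeminor);
}

/* Hypothetical inverse helpers; valid only when 0 <= aoeminor < MAJPERMIN. */
static int
unit_to_aoemajor(int unit)
{
	return (unit / MAJPERMIN);
}

static int
unit_to_aoeminor(int unit)
{
	return (unit % MAJPERMIN);
}
```

With these assumptions, shelf 0, slot 0 (e0.0) maps to unit 0, and a
unit number can be split back into its shelf and slot when naming the
device node.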