From owner-freebsd-hackers Fri Jan 27 09:32:53 1995
Return-Path: hackers-owner
Received: (from root@localhost) by freefall.cdrom.com (8.6.9/8.6.6) id JAA23067 for hackers-outgoing; Fri, 27 Jan 1995 09:32:53 -0800
Received: from cs.weber.edu (cs.weber.edu [137.190.16.16]) by freefall.cdrom.com (8.6.9/8.6.6) with SMTP id JAA23059 for ; Fri, 27 Jan 1995 09:32:51 -0800
Received: by cs.weber.edu (4.1/SMI-4.1.1) id AA29338; Fri, 27 Jan 95 10:26:53 MST
From: terry@cs.weber.edu (Terry Lambert)
Message-Id: <9501271726.AA29338@cs.weber.edu>
Subject: Re: Help with SCSI development (fwd)
To: dufault@hda.com (Peter Dufault)
Date: Fri, 27 Jan 95 10:26:52 MST
Cc: freebsd-hackers@FreeBSD.org, vernick@cs.sunysb.edu
In-Reply-To: <199501271539.KAA08510@hda.com> from "Peter Dufault" at Jan 27, 95 10:39:51 am
X-Mailer: ELM [version 2.4dev PL52]
Sender: hackers-owner@FreeBSD.org
Precedence: bulk

> Michael Vernick writes:

[ ... disk striping project ... ]

[ ... async I/O ... ]

> > Are there any direct calls to the scsi driver from user space?  I saw
> > something about user calls in scsi_ioctl.c but couldn't figure out how
> > to access it.  If there are user level calls, are they asynchronous?

There are user space SCSI calls.  Check -current, per Peter's
recommendation.

The read calls in the kernel are synchronous.  The write calls are
either sync or async, depending on parameters.  For the most part, the
async nature of the calls is *not* exposed to the user space process.
Generally, this is held in the kernel as a preemption point, and the
sync writes are used simply to ensure writes of file system meta-data.

Both types of sync operations cause the process to be suspended while
they are serviced.

Many user space threading mechanisms depend on async read and write
operations to provide preemption points for user space thread
scheduling algorithms.  The specific calls dealing with async I/O are
aioread, aiowrite, aiowait, and aiocancel (a rough usage sketch appears
below).  This is for the SunOS 4.x LWP system and for the SVR4 and
SunOS 5.x N->M mapping of user to kernel threads, where N > M.

In addition, there are primitives required for stack switching and
register set saving, as well as pipeline synchronization.  SunOS and
SVR4 provide system calls to accomplish these tasks.

These facilities are not generally implemented in *BSD at this time.
You may want to look into the port of the pthreads library.  This will
allow you to get started on a user space implementation.

Providing asynchronous reads within a single thread of control within
the kernel is unlikely to be portable to other than SCSI devices, and
will probably require a great deal of work on the SCSI devices
themselves.  If you indeed plan to move this into the kernel at a later
date, you may wish to consider implementing or waiting for kernel
threading.  Otherwise you may see speed penalties for your striping.

The NFS nfsd and biod processes, which make system calls and never
return, as well as the swapper and update daemons, are in effect kernel
threads, albeit threads with a proc structure attached to them for no
good reason.

A kernel threaded implementation would not *require* a divorce of this
type, but is likely to suffer greatly increased complexity if this does
not occur.

Kernel thread spawning is perhaps the single easiest way of forcing
sync operations to act as if they were async.  In simplest parlance, a
kernel thread is nothing more than the kernel stack, registers, and
program counter of a process.
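As an aside, here is the rough usage sketch promised above for the
SunOS-style aioread()/aiowait() interface.  This is NOT available on
*BSD; the signatures here are from the SunOS/Solaris manual pages as I
remember them, so check <sys/asynch.h> and aioread(3) locally before
relying on any of it.  The device name is just an example.

    /*
     * Rough sketch only: SunOS-style async I/O, not available on *BSD.
     * Queue several reads, then reap completions with aiowait().
     */
    #include <sys/types.h>
    #include <sys/time.h>
    #include <sys/asynch.h>     /* aioread(), aiowait(), aio_result_t */
    #include <fcntl.h>
    #include <stdio.h>
    #include <unistd.h>

    #define NREQ  2
    #define BUFSZ (8 * 1024)

    int
    main(void)
    {
        aio_result_t    res[NREQ];
        char            buf[NREQ][BUFSZ];
        int             fd, i, outstanding;

        /* Example raw device name only. */
        if ((fd = open("/dev/rsd0c", O_RDONLY)) < 0)
            return (1);

        /* Queue the reads; control returns before the I/O completes. */
        for (i = 0; i < NREQ; i++)
            if (aioread(fd, buf[i], BUFSZ, (off_t)(i * BUFSZ), SEEK_SET,
                &res[i]) < 0)
                return (1);

        /* Reap completions; a null timeout means block until one is done. */
        for (outstanding = NREQ; outstanding > 0; outstanding--) {
            aio_result_t *done = aiowait((struct timeval *)0);

            if (done == (aio_result_t *)-1)     /* error return */
                break;
            printf("read completed: %d bytes (errno %d)\n",
                (int)done->aio_return, done->aio_errno);
        }
        return (0);
    }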
Most of the work for supporting kernel threading is, or will be, taking
place for medium and fine grained SMP support, which is directly
analogous to kernel multithreading, or requires it (respectively;
coarse grained SMP does not require that more than one processor be
allowed to execute kernel code at a time, only that there is anonymity
as to which processor is actually doing the kernel code execution).

I would suggest a user space implementation at this time, written with
an eye to coding techniques that do as little as possible to prevent
moving the code into the kernel in the future.

> > What I'd like to do is issue a bunch of reads (1 for each disk) to the
> > controller all at once and then wait for the data to be returned.  If I
> > can't make asynchronous calls in user space, I guess that I could have
> > multiple processes, one for each disk, where each process sends a
> > synchronous read to the driver, and once all of the processes finish,
> > accumulate the data.

For a user space implementation, the pthreads library is an ideal way
to achieve this (a minimal sketch is appended after the signature).

> > Eventually, I want to move this into the kernel.  When it becomes a
> > kernel level process, at what level do I issue calls.  At the
> > physio/bread level or lower?

This is a conundrum.  The problem is presented because the "slice"
abstraction, and therefore the per-slice device abstraction, is
embedded in the SCSI driver itself.  This is probably a mistake, if
what you want to do approaches RAID I or the Zebra FS striping
techniques (I would strongly advise *against* the Zebra approach for
performance reasons).  This bottlenecks async reads, especially,
through the device/slice abstraction.

Really, you want a cleaner logical separation of layers, and to stick
at least part of your code between the layers.  If you do this, please
ensure that your layering is clean, so that you can impose logical
device management between your layer and the underlying driver code.
AIX style logical volume management has long been a goal of at least
several people on these lists, myself included.


					Terry Lambert
					terry@cs.weber.edu
---
Any opinions in this posting are my own and not those of my present
or previous employers.
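The minimal pthreads sketch referred to above: one thread per disk,
each doing a synchronous read, with the main thread collecting the
results via pthread_join().  The device names, stripe size, and disk
count are made-up examples, and it assumes the pthreads port provides
the standard pthread_create()/pthread_join() interface.

    /*
     * Minimal sketch: one thread per disk, each issuing a synchronous
     * read; the main thread waits for all of them before assembling
     * the stripe.  Names and sizes here are illustrative only.
     */
    #include <sys/types.h>
    #include <fcntl.h>
    #include <pthread.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    #define NDISK      4
    #define STRIPESIZE (64 * 1024)          /* bytes read from each disk */

    struct chunk {
        const char *dev;                    /* raw device to read from    */
        off_t       offset;                 /* where to read on that disk */
        char        buf[STRIPESIZE];        /* result lands here          */
        ssize_t     nread;                  /* bytes actually read, or -1 */
    };

    static void *
    read_chunk(void *arg)
    {
        struct chunk *c = arg;
        int fd;

        c->nread = -1;
        if ((fd = open(c->dev, O_RDONLY)) < 0)
            return (NULL);
        if (lseek(fd, c->offset, SEEK_SET) != (off_t)-1)
            c->nread = read(fd, c->buf, sizeof(c->buf));
        (void) close(fd);
        return (NULL);
    }

    int
    main(void)
    {
        static const char *devs[NDISK] =
            { "/dev/rsd0c", "/dev/rsd1c", "/dev/rsd2c", "/dev/rsd3c" };
        struct chunk chunks[NDISK];
        pthread_t tids[NDISK];
        int i;

        /* Fire off one reader thread per disk. */
        for (i = 0; i < NDISK; i++) {
            chunks[i].dev = devs[i];
            chunks[i].offset = 0;
            if (pthread_create(&tids[i], NULL, read_chunk, &chunks[i]) != 0) {
                perror("pthread_create");
                exit(1);
            }
        }

        /* Wait for all of them; the buffers are then ready to assemble. */
        for (i = 0; i < NDISK; i++) {
            pthread_join(tids[i], NULL);
            printf("%s: %ld bytes\n", chunks[i].dev, (long)chunks[i].nread);
        }
        return (0);
    }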