Date: Mon, 26 Apr 1999 08:49:15 -0400 (EDT)
From: Christopher Sedore
To: hackers@freebsd.org
Subject: aio and sockets (long)

I've been working on modifying the kernel aio routines so that they are
more useful for sockets. Currently, if you ask for an async read or
write on a socket, it ties up one aiod, which blocks waiting for the
operation to complete. That is undesirable.

What I've currently implemented (and am not happy with) is an alternate
queueing path for socket operations. Basically, if the descriptor is
DTYPE_SOCKET, we check whether it is readable or writable
(soreadable()/sowriteable()), and if it is, we queue as before. If it is
not, the aiocb is put on a socket queue, and the socket is modified to
call a wakeup routine with a pointer to the aiocb. I modified the aiocb
to contain another pointer so that the aiocbs pending on a socket form a
singly linked list. When the wakeup routine is called, all the aiocbs
waiting on the socket that are of the same type (read or write) are
moved to the aio job queue.
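
To make the queueing described above concrete, here is a minimal
userland sketch in C. It is a model under stated assumptions, not the
actual kernel change: the struct and function names (model_aiocb,
model_socket, job_queue, model_aio_submit, model_sock_wakeup) are all
hypothetical, and the readable/writeable flags merely stand in for the
soreadable()/sowriteable() checks.

```c
/* Userland model only. None of these types or functions are the real
 * FreeBSD kernel structures; every name here is hypothetical. */
#include <assert.h>
#include <stddef.h>

enum aio_op { OP_READ, OP_WRITE };

struct model_aiocb {
    int id;
    enum aio_op op;
    struct model_aiocb *next;    /* the extra link pointer added to the aiocb */
};

struct model_socket {
    int readable;                /* stands in for soreadable() */
    int writeable;               /* stands in for sowriteable() */
    struct model_aiocb *pending; /* singly linked list of parked requests */
};

struct job_queue {
    struct model_aiocb *head, *tail;
    int count;
};

/* Append a request to the tail of the main job queue. */
void jobq_push(struct job_queue *q, struct model_aiocb *cb)
{
    cb->next = NULL;
    if (q->tail != NULL)
        q->tail->next = cb;
    else
        q->head = cb;
    q->tail = cb;
    q->count++;
}

/* Returns 1 if the request went straight to the job queue,
 * 0 if it was parked on the socket until a wakeup arrives. */
int model_aio_submit(struct job_queue *q, struct model_socket *so,
                     struct model_aiocb *cb)
{
    assert(cb != NULL);
    int ready = (cb->op == OP_READ) ? so->readable : so->writeable;
    if (ready) {
        jobq_push(q, cb);
        return 1;
    }
    /* Not ready: append to the end of the socket's list (FIFO order). */
    struct model_aiocb **pp = &so->pending;
    while (*pp != NULL)
        pp = &(*pp)->next;
    cb->next = NULL;
    *pp = cb;
    return 0;
}

/* Wakeup: move every parked request of the given type to the job
 * queue, leaving the other type parked. Returns the number moved. */
int model_sock_wakeup(struct job_queue *q, struct model_socket *so,
                      enum aio_op op)
{
    struct model_aiocb **pp = &so->pending;
    int moved = 0;
    while (*pp != NULL) {
        struct model_aiocb *cb = *pp;
        if (cb->op == op) {
            *pp = cb->next;      /* unlink before reusing cb->next */
            jobq_push(q, cb);
            moved++;
        } else {
            pp = &cb->next;
        }
    }
    return moved;
}
```

Note that the model gives the socket its own list head for simplicity.
It does not attempt to reproduce locking, spl levels, or process
rundown.
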
This worked really well until I hit control-c and panicked the system :)
I had missed aio_proc_rundown, which cleans up the outstanding aio
requests before the process exits. I fixed this with a fair bit of
frobbing around in aio_proc_rundown (find the socket, work through the
queued aiocbs, and remove the ones that belong to the proc that is going
away). I'm now getting system hangs instead of panics, but I'm betting
that's a problem with my code, since I think the concepts are sound.

Here's what I don't like:

1.  It seems silly to requeue socket read operations back to the main
    job queue on an upcall--why not simply do the read in the upcall and
    be done with it?

1a. Likewise for writes, but I'd much prefer the whole write to be
    completed in one call, and we'd have to do a bit more messing around
    to ensure this (like checking available buffer space, etc).

2.  The linked list stuff for the socket-queued aiocbs is really ugly.
    The head of the list is the so_upcallarg element in the socket
    struct. This linked list can include operations from multiple
    processes, and one can't use the linked list macros since there's no
    place to keep the list head.

Here are a few things I don't understand, but live with:

1.  I don't see what protects the async code from having the file
    descriptor closed underneath it. It seems that it is checked when
    the operation is queued, but not afterward.

2.  We always call splx(s) _after_ tsleep(), which seems weird to
    someone who is used to userland multithreaded programming. (So I'm
    no kernel expert.)

(I also realized that my brain was AWOL when I commented previously on
what I thought might be a memory leak in the aio routines.)

Here's what I think I'd like to do:

1.  Add a couple of tailqs to the socket structure, one to hold async
    read requests and one for async write requests. Arguably a single
    one would be sufficient, though it would require stepping through
    the list to find a request with the relevant operation.

2.  Fix the socket close routines to dispose of the aiocbs properly.
    Fix aio_proc_rundown to handle this scenario.

3.  Fix the wakeup routine to execute reads in the wakeup, rather than
    requeueing them. Only the number of reads necessary to empty the
    buffer should be executed (or all of them in the case of an error).

4.  Leave writes alone for now, by just requeueing them as I currently
    do.

This could also present a solution for the pid vs struct proc * problem
in the "flock + kernel threads bug" series. Select operations could be
queued as async requests on the socket--they would then get killed by
aio_proc_rundown (with proper glue). The same actually goes for all the
wakeup functions. Just a thought.

Any comments or enlightenment would be appreciated.

-Chris
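
A rough userland sketch of steps 1 to 3 of the proposal, for
illustration only. It assumes the TAILQ macros from <sys/queue.h> (as
shipped by the BSDs and glibc). Every struct, field, and function name
below is hypothetical, a plain byte array stands in for the socket
receive buffer, and model_rundown only approximates what
aio_proc_rundown would need to do for one process.

```c
/* Userland model only; every struct and function name is hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/queue.h>

struct model_aiocb {
    int owner;                     /* stands in for the owning process */
    char *buf;                     /* destination of the read */
    size_t nbytes;                 /* requested length */
    size_t done;                   /* bytes actually transferred */
    int error;                     /* error reported to the request */
    int completed;
    TAILQ_ENTRY(model_aiocb) link;
};

TAILQ_HEAD(aiocb_list, model_aiocb);

struct model_socket {
    char rcvbuf[64];               /* stands in for the receive buffer */
    size_t rcvlen;                 /* bytes currently buffered */
    int error;                     /* pending socket error, 0 if none */
    struct aiocb_list aio_rd;      /* step 1: pending async reads */
    struct aiocb_list aio_wr;      /* step 1: pending async writes */
};

void model_sock_init(struct model_socket *so)
{
    so->rcvlen = 0;
    so->error = 0;
    TAILQ_INIT(&so->aio_rd);
    TAILQ_INIT(&so->aio_wr);
}

/* Fill in a read request and park it on the socket's read queue. */
void model_queue_read(struct model_socket *so, struct model_aiocb *cb,
                      int owner, char *buf, size_t nbytes)
{
    cb->owner = owner;
    cb->buf = buf;
    cb->nbytes = nbytes;
    cb->done = 0;
    cb->error = 0;
    cb->completed = 0;
    TAILQ_INSERT_TAIL(&so->aio_rd, cb, link);
}

/* Step 3: complete reads directly in the wakeup. Only as many reads as
 * it takes to empty the buffer are run, or all of them if the socket
 * has an error. Returns the number of requests completed. */
int model_read_wakeup(struct model_socket *so)
{
    struct model_aiocb *cb;
    int n = 0;
    while ((so->rcvlen > 0 || so->error != 0) &&
           (cb = TAILQ_FIRST(&so->aio_rd)) != NULL) {
        size_t take = cb->nbytes < so->rcvlen ? cb->nbytes : so->rcvlen;
        memcpy(cb->buf, so->rcvbuf, take);
        memmove(so->rcvbuf, so->rcvbuf + take, so->rcvlen - take);
        so->rcvlen -= take;
        cb->done = take;
        cb->error = so->error;
        cb->completed = 1;
        TAILQ_REMOVE(&so->aio_rd, cb, link);
        n++;
    }
    return n;
}

/* Step 2: drop every request on one list that belongs to a departing
 * owner. Returns the number of requests removed. */
int model_rundown(struct aiocb_list *list, int owner)
{
    struct model_aiocb *cb = TAILQ_FIRST(list), *next;
    int n = 0;
    while (cb != NULL) {
        next = TAILQ_NEXT(cb, link);   /* save before unlinking */
        if (cb->owner == owner) {
            TAILQ_REMOVE(list, cb, link);
            n++;
        }
        cb = next;
    }
    return n;
}
```

With separate read and write queues, the wakeup never has to step over
requests of the other type, and a real list head on the socket makes
the standard queue macros usable.
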