Date:      Wed, 15 Sep 1999 12:45:29 -0400 (EDT)
From:      Christopher Sedore <cmsedore@mailbox.syr.edu>
To:        Jayson Nordwick <nordwick@scam.xcf.berkeley.edu>
Cc:        freebsd-hackers@FreeBSD.ORG
Subject:   Re: High Performance I/O (more)
Message-ID:  <Pine.SOL.4.10.9909151235280.1425-100000@rodan.syr.edu>
In-Reply-To: <19990915070800.34512.qmail@scam.xcf.berkeley.edu>

On Wed, 15 Sep 1999, Jayson Nordwick wrote:

> I did research this weekend on high performance I/O.  I looked at different
> approaches and to me they all appear the same (I know that I will get some
> flamage for this).  The two most prominent models that I saw were IO
> Completion Ports and Synchronous Events (such as the Gaurav
> http://www.cs.rice.edu/~gaurav/papers/usenix99.ps).
> 
> I think that both of these models are basically the same.  They both have
> an event queue that you pick up events from.  The only way that they differ
> is in what they call an event.  Completion ports take asynchronous operations
> and queue an event when the operation completes (hence the name).  Synchronous
> events do the opposite: they queue an event when an operation is possible
> and then the synchronous (usually non-blocking) operation is performed.
> From this, you can decouple an event queue from what you call an event.
> 
> From what I can see either model will give roughly the same performance,
> as they both do roughly the same amount of work.  The one benefit that seems
> to exist for the Completion Ports model is that there are fewer context
> switches.
> 
> Now, looking at POSIX.1b signals and signal queues and getting some
> information from Stephen Tweedie it looks like completion ports are doable
> without anything new, I think that I have decided.
> 
> If you find an available signal, set the handler for it, then block it,
> this signal number now effectively becomes the completion port.  You then
> can fcntl() a file descriptor with F_SETSIG and the signal number.  Then to
> fetch the blocked signals, use sigwaitinfo().  I guess you could also use
> aio_{read,write}() and set sigevent appropriately.  This actually seems
> preferable since you can then use aio_return() to find the return value
> out and use aio_cancel() to cancel the request if wanted.
> 
> The one drawback that I see to this is that it can only really handle
> aio_{read,write}() and {read,write}()/fcntl().  Any other events such as
> thread/child deaths cannot really be worked into this scheme unless you
> could set the signal they deliver on termination.
> 
> If you really wanted to, you could have signals delivered for the ability
> to read/write to a file descriptor and then you would have Gaurav's model.
> 
> Basically, unless anybody can see anything wrong with this, get to work
> implementing!

As you note, IOCPs and the event scheme suggested by Gaurav et al. differ
in when an event is queued: after an async operation completes, or when a
descriptor is ready for a particular activity.  The difference may not seem
important, but for implementing high performance I/O systems, it probably
does matter.
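To make the distinction concrete, here is a minimal sketch of the readiness
side.  It uses plain poll() as a stand-in for the event API in the paper, and
the function name read_when_ready is made up for illustration.  The
application is told the descriptor is ready and then performs the
non-blocking read itself; in the completion model it would instead issue the
read up front and later collect the finished result.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/*
 * Readiness model: wait until fd is readable, then do the read ourselves.
 * Returns the number of bytes read, 0 on EOF, or -1 on error or timeout.
 * (Hypothetical helper, not an API from the paper or from FreeBSD.)
 */
ssize_t read_when_ready(int fd, void *buf, size_t len, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int flags = fcntl(fd, F_GETFL);

    /* Make the descriptor non-blocking so a stale readiness hint cannot hang us. */
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;

    for (;;) {
        int n = poll(&pfd, 1, timeout_ms);
        if (n == -1 && errno == EINTR)
            continue;               /* interrupted by a signal, wait again */
        if (n <= 0)
            return -1;              /* error or timeout */

        ssize_t got = read(fd, buf, len);
        if (got == -1 && (errno == EAGAIN || errno == EINTR))
            continue;               /* readiness was only a hint, wait again */
        return got;
    }
}
```

Note that the data is copied only after the wakeup, so each event costs a
wait plus a separate read() call.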

The signal approach has some limitations, in that (correct me if I am
wrong) FreeBSD doesn't have realtime signals yet, and POSIX already
specifies how signals and aio operations are to work.
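For reference, the F_SETSIG variant from the quoted message might look
roughly like this.  This is a sketch under several assumptions: F_SETSIG is a
Linux extension (hence _GNU_SOURCE), so it targets Linux rather than FreeBSD;
the descriptor also needs F_SETOWN and O_ASYNC, which the quoted description
does not mention; and sigtimedwait() is used in place of sigwaitinfo() so the
wait can time out.  The names sigport_attach and sigport_wait are
hypothetical.

```c
#define _GNU_SOURCE             /* F_SETSIG is a Linux extension */
#include <fcntl.h>
#include <signal.h>
#include <time.h>
#include <unistd.h>

/*
 * Block `signo` and ask the kernel to queue it whenever `fd` becomes ready.
 * The blocked signal then acts as the event queue.  Returns 0 on success.
 */
int sigport_attach(int fd, int signo)
{
    sigset_t set;
    int flags;

    sigemptyset(&set);
    sigaddset(&set, signo);
    if (sigprocmask(SIG_BLOCK, &set, NULL) == -1)
        return -1;

    if (fcntl(fd, F_SETOWN, getpid()) == -1)    /* who receives the signal */
        return -1;
    if (fcntl(fd, F_SETSIG, signo) == -1)       /* which signal to queue */
        return -1;

    flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_ASYNC | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Dequeue one event.  Returns the descriptor that became ready,
 * or -1 on timeout or error.
 */
int sigport_wait(int signo, int timeout_ms)
{
    sigset_t set;
    siginfo_t info;
    struct timespec ts = { timeout_ms / 1000, (timeout_ms % 1000) * 1000000L };

    sigemptyset(&set);
    sigaddset(&set, signo);
    if (sigtimedwait(&set, &info, &ts) == -1)
        return -1;
    return info.si_fd;          /* filled in by the kernel when F_SETSIG is used */
}
```

A caller would attach each descriptor with something like
sigport_attach(fd, SIGRTMIN) and then loop on sigport_wait(SIGRTMIN, ...),
doing the actual read() or write() on whatever descriptor comes back.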

I read with interest a post a week or three back talking about some
efforts in the Linux community to come up with a standardized way of doing
this.  

My ideas for this are a little different than what I've seen proposed thus
far, more along the lines of creating something that acts as both an event
queue and an IOCP.  Ideally this would be a descriptor that could be shared
across processes (or threads), and could be accessed using read().  I
don't much care for the suggestion that threads ought to have an event
queue of their own--rather if you want a per-thread completion
notification, simply create a descriptor for each thread that needs this
function.  Whatever is created, it should be sufficiently extensible to
allow for all the events we can imagine now, as well as being flexible for
future enhancement.  (FWIW, I've also been thinking that I might like to
be able to submit aio requests by write()ing to said descriptor.  Just a
thought.)
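As a rough user-space illustration of what "a descriptor you read() events
from" could feel like, here is a sketch that uses an ordinary pipe as the
queue.  Everything in it is an assumption made for illustration: the record
layout, the EVQ_* constants, and the evq_post/evq_next names are made up, and
evq_post merely stands in for whatever the kernel would do when an operation
completes or a descriptor becomes ready.

```c
#include <stdint.h>
#include <unistd.h>

/* Hypothetical event kinds: one readiness-style, one completion-style. */
enum { EVQ_READABLE = 1, EVQ_AIO_DONE = 2 };

/* Hypothetical fixed-size event record; the layout is an assumption. */
struct evq_event {
    int32_t fd;         /* descriptor the event refers to */
    int32_t kind;       /* EVQ_READABLE, EVQ_AIO_DONE, ... */
    int64_t result;     /* byte count or status for completions */
};

/*
 * Producer side (stand-in for the kernel): append one record to the queue.
 * Pipe writes of at most PIPE_BUF bytes are atomic, so records posted by
 * several producers do not interleave.  Returns 0 on success.
 */
int evq_post(int qfd, const struct evq_event *ev)
{
    return write(qfd, ev, sizeof *ev) == (ssize_t)sizeof *ev ? 0 : -1;
}

/*
 * Consumer side: fetch the next record with a plain read().
 * Returns 0 on success, -1 on error, EOF, or a short read.
 */
int evq_next(int qfd, struct evq_event *ev)
{
    return read(qfd, ev, sizeof *ev) == (ssize_t)sizeof *ev ? 0 : -1;
}
```

Because the queue is just a descriptor, a per-thread queue is simply one more
pipe, and the same record type can carry both readiness and completion
events.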

I hope we'll see an update on where the Linux efforts are--I'd be
interested in joining the conversation.

-Chris


