Date:      Thu, 16 Sep 1999 07:33:41 -0400 (EDT)
From:      Christopher Sedore <cmsedore@mailbox.syr.edu>
To:        Alfred Perlstein <bright@wintelcom.net>
Cc:        Jayson Nordwick <nordwick@scam.xcf.berkeley.edu>, freebsd-hackers@FreeBSD.ORG
Subject:   Re: High Performance I/O (more)
Message-ID:  <Pine.SOL.4.10.9909160720220.1042-100000@rodan.syr.edu>
In-Reply-To: <Pine.BSF.4.05.9909151621520.6392-100000@fw.wintelcom.net>



On Wed, 15 Sep 1999, Alfred Perlstein wrote:

> 
> On Wed, 15 Sep 1999, Christopher Sedore wrote:
> 
> > My ideas for this are a little different than what I've seen proposed thus
> > far, more along the lines of creating something that acts as both an event
> > queue and an IOCP.  Ideally this would be a descriptor that could be shared
> > across processes (or threads), and could be accessed using read().  I
> > don't much care for the suggestion that threads ought to have an event
> > queue of their own--rather if you want a per-thread completion
> > notification, simply create a descriptor for each thread that needs this
> > function.  Whatever is created, it should be sufficiently extensible to
> > allow for all the events we can imagine now, as well as being flexible for
> > future enhancement.  (FWIW, I've also been thinking that I might like to
> > be able to submit aio requests by write()ing said descriptor.  Just a
> > thought.)
> 
> I thought it'd be very useful to be able to give the kernel a pointer
> to a pollfd struct in your (userland) address space; when events occur,
> SIGIO (or maybe some other signal?) is posted to the process after
> updating the process's pollfd.
> 
> This makes queueing unnecessary because if a signal is 'lost' somehow
> the pollfd is still updated, and it also reduces the number of syscalls
> needed by at least one.
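
If I follow, the userland side of that cycle would look something like
this--register_pollfd() being your hypothetical new syscall, and the rest
only a sketch:

    #include <poll.h>
    #include <signal.h>
    #include <unistd.h>

    #define NFDS    128

    static struct pollfd pfds[NFDS];      /* kernel updates revents in place */
    static volatile sig_atomic_t io_ready;

    static void
    sigio_handler(int sig)
    {
            io_ready = 1;                 /* just flag it; work happens below */
    }

    int
    main(void)
    {
            sigset_t block, empty;
            char buf[8192];
            int i;

            sigemptyset(&empty);
            sigemptyset(&block);
            sigaddset(&block, SIGIO);
            sigprocmask(SIG_BLOCK, &block, NULL); /* deliver in sigsuspend only */
            signal(SIGIO, sigio_handler);

            /* ... open descriptors, fill in pfds[i].fd and .events ... */

            register_pollfd(pfds, NFDS);  /* hypothetical: kernel wires pfds,
                                             posts SIGIO when it updates them */

            for (;;) {
                    while (!io_ready)
                            sigsuspend(&empty);   /* wait for SIGIO */
                    io_ready = 0;
                    for (i = 0; i < NFDS; i++)    /* the scan I worry about */
                            if (pfds[i].revents & POLLIN)
                                    (void)read(pfds[i].fd, buf, sizeof buf);
            }
    }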

Unfortunately, I think queueing would still be necessary since you would
not be able to update the userland pollfd until that proc became curproc
again.  Unless you mapped the pollfd space into the kernel and wired it.

This still has the problem of scanning all the pollfd structs.  For up to
the low hundreds of descriptors this is probably OK; somewhere beyond that
it becomes expensive.  The nice thing about the event queue is that its
cost tracks the number of events rather than the number of registered
descriptors, so it scales (virtually) linearly to any number of descriptors.

> From what I understand your methodology of async IO involves this
> type of cycle:
> 
> syscall_register_descriptor -> SIGIO -> \
>  syscall_read_from_event_queue -> syscall_read/write/lock/etc
> 
> Being able to register the pollfd allows for minimal memory
> overhead and one less syscall; in fact, it doesn't even
> require much of a new subsystem:
> 
> syscall_register_pollfd -> SIGIO -> kernel to userland write \
>   userland_pollscan -> syscal_read/write/lock/etc
> 
> You would have to refresh the kernel's pollfd array if
> what you were interested in changed.
> 
> How does this sound to you?

I would propose a system more like this:

syscall_register_fdinterest->[eventloop:
syscall_wait_for_events/syscall_read()/syscall_write()/syscall_accept()/etc]

I don't see any particular advantage to using SIGIO or signals in general.

The nice thing is that the same loop works for async io (aio_*) if done
right--something like having wait_for_events() return a union with a key
value so you know what kind of event you've picked up and how to proceed.
In other words, you could freely intermix aio with event driven io in the
same loop.  This would be handy in some of the code I've been messing
with--right now I depend on an implementation detail to allow me to do an
aio_read() on a listening socket.  When said read completes (always with
failure, of course), I know that the socket is 'ready' for an accept().
Events would let me do the same without the ugliness.
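
To make that concrete, here is roughly the shape of the loop I have in
mind.  All the names below are invented, and wait_for_events() could just
as well be a read() on the queue descriptor:

    #include <sys/types.h>
    #include <sys/socket.h>
    #include <unistd.h>
    #include <aio.h>

    /* invented: the key values and the record wait_for_events() returns */
    enum ev_type { EV_READABLE, EV_ACCEPTABLE, EV_AIO_DONE };

    struct ioevent {
            enum ev_type    type;         /* the key value */
            int             fd;
            union {
                    struct aiocb *aio;    /* which aio request finished */
            } u;
    };

    void
    event_loop(int eq)                    /* eq: event queue descriptor */
    {
            struct ioevent ev;
            char buf[8192];

            /* interest was registered earlier via register_fdinterest();
               plain POSIX aio_read()s may also be outstanding */

            for (;;) {
                    wait_for_events(eq, &ev, 1);  /* invented syscall */

                    switch (ev.type) {
                    case EV_ACCEPTABLE:   /* no fake aio_read() needed */
                            (void)accept(ev.fd, NULL, NULL);
                            break;
                    case EV_READABLE:
                            (void)read(ev.fd, buf, sizeof buf);
                            break;
                    case EV_AIO_DONE:
                            (void)aio_return(ev.u.aio);
                            break;
                    }
            }
    }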

Events could also (I haven't thought this out, so please forgive me if
there's an obvious bugaboo here) return additional information about the
descriptor/whatever.  One nice possibility would be outgoing buffer space
on sockets.  This may or may not be worth the coding effort.  
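
For instance, the ioevent record from the sketch above could grow an
extra field--again, just a thought:

    struct ioevent {
            enum ev_type    type;
            int             fd;
            long            data;         /* e.g. bytes of socket send buffer
                                             now free, or bytes ready to read */
            union {
                    struct aiocb *aio;
            } u;
    };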

If implemented in a multiprocessor-aware manner, events could also allow
for more parallelism than we have now.

-Chris
