Date: Thu, 6 Apr 2000 23:49:05 -0500
From: Jonathan Lemon
To: Matthew Dillon
Cc: Jonathan Lemon, Archie Cobbs, freebsd-arch@freebsd.org
Subject: Re: RFC: kqueue API and rough code
Message-ID: <20000406234905.K80578@prism.flugsvamp.com>
In-Reply-To: <200004070401.VAA93492@apollo.backplane.com>

On Thu, Apr 06, 2000 at 09:01:39PM -0700, Matthew Dillon wrote:
>
>     Hey!  (I snap my fingers), I got it!
>
>     There is no need to copy anything back and forth from user space to
>     kernel space!
>
>     It's simple, really.
>     How about this API:
>
>         qfd = kqueue();
>         kqueuectl(qfd, filter, fd, struct event *ev);
>         ev = kevent(qfd, timeout);
>
>     or (another way to do kevent):
>
>         n = kevent(qfd, struct event **ary, int nmax, struct timeval *timeout);
>
>     The key thing here is that the kernel creates its own internal data
>     structure which has the descriptor, filter operation, and a pointer
>     to the *USER* event structure.  The kernel would not otherwise copy
>     the user event structure into kernel space nor would it copy it back
>     to return the event.

Exactly.  Now, this is just what the code does at the moment, only
slightly differently.

In the scheme above, every "registration" via kqueuectl copies in
(filter, fd, ev), and then the kernel additionally copies in
(data, flags), as those are also input parameters.  So far, this
turns out to be one more element than the current structure.

When returning data, the kernel does a copyout of (data, flags) to
the saved event structure, and then copies out the pointer to the
structure.  This saves one copy over what I have now.

In total, this comes out equal to just copying
(data, flags, ident, filter).  The only difference that I can see is
that with the scheme above, the user-level code must keep the data in
the same location, which may not be ideal for some applications.  It
also binds the implementation between user and kernel a little
tighter than I would like.

Looking at it another way, I use (event/filter) as a capability
descriptor to user space rather than a pointer.

It seems that if you simply added a (void *udata) field to the kevent
structure, then it would be easy for you to implement your method
above.  Then specific filters (which understand the layout of the
structure that *uevent points to) could be written to access extended
data:

        struct event;                   /* the user's own event type */

        struct user_event {
                int     priority;
                void    (*dispatch)(struct event *ev);
                int     thread;
                struct kevent {
                        int     fd;
                        short   filter;
                        short   flags;
                        long    data;
                        void    *uevent;
                } kev;
        };

How about that?
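
A minimal userland sketch of the struct above, assuming the kernel
hands the `uevent' pointer back untouched.  The member name `kev', the
helpers user_event_init() and user_event_deliver(), the handler
on_readable(), and a dispatch signature that takes a
struct user_event * are all hypothetical, chosen only for
illustration.  The struct kevent here is the layout proposed in this
mail, not a system header, and no real kqueue call is made.

```c
#include <assert.h>
#include <stddef.h>

/* Kernel-visible part, using the layout proposed in this mail. */
struct kevent {
	int	fd;
	short	filter;
	short	flags;
	long	data;
	void	*uevent;	/* assumed opaque to the kernel */
};

/* Application-side wrapper; `kev' is a hypothetical member name. */
struct user_event {
	int	priority;
	void	(*dispatch)(struct user_event *ue);
	int	thread;
	struct kevent kev;
};

/* Example handler: accumulate the `data' value reported per event. */
long bytes_seen;

void
on_readable(struct user_event *ue)
{
	bytes_seen += ue->kev.data;
}

/* Fill in the kevent and point `uevent' back at its own wrapper. */
void
user_event_init(struct user_event *ue, int fd, short filter,
    void (*dispatch)(struct user_event *))
{
	ue->priority = 0;
	ue->thread = 0;
	ue->dispatch = dispatch;
	ue->kev.fd = fd;
	ue->kev.filter = filter;
	ue->kev.flags = 0;
	ue->kev.data = 0;
	ue->kev.uevent = ue;
}

/*
 * Given a kevent copied back to user space, recover the wrapper from
 * `uevent', refresh the output fields, and call its dispatch routine.
 */
void
user_event_deliver(const struct kevent *kev)
{
	struct user_event *ue;

	assert(kev != NULL && kev->uevent != NULL);
	ue = kev->uevent;
	ue->kev.data = kev->data;
	ue->kev.flags = kev->flags;
	ue->dispatch(ue);
}
```

Because the wrapper is found through `uevent' alone, the returned
kevent can live anywhere (a stack array, say), and only the
application's own user_event has to stay at a fixed address.
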

The kernel only cares about `struct kevent'.  It won't touch the
`uevent' pointer at all.  In theory, you could use the `data' field
as a pointer, but for uniformity I'd rather just add one more field.
--
Jonathan