From owner-freebsd-chat Wed Oct 25 11:35:28 2000
Delivered-To: freebsd-chat@freebsd.org
Received: from peace.netnation.com (peace.netnation.com [204.174.223.2])
    by hub.freebsd.org (Postfix) with ESMTP id 8F35037B479
    for ; Wed, 25 Oct 2000 11:35:25 -0700 (PDT)
Received: from sim by peace.netnation.com with local (Exim 3.13 #5)
    id 13oVOS-0003Id-00; Wed, 25 Oct 2000 11:35:20 -0700
Date: Wed, 25 Oct 2000 11:35:20 -0700
From: Simon Kirby
To: Jamie Lokier
Cc: Jonathan Lemon, Dan Kegel, chat@freebsd.org, linux-kernel@vger.kernel.org
Subject: Re: kqueue microbenchmark results
Message-ID: <20001025113520.E12064@stormix.com>
References: <20001024225637.A54554@prism.flugsvamp.com>
    <39F6655A.353FD236@alumni.caltech.edu>
    <20001025010246.B57913@prism.flugsvamp.com>
    <20001025112709.A1500@stormix.com>
    <20001025190848.C2266@pcep-jamie.cern.ch>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 1.0i
In-Reply-To: <20001025190848.C2266@pcep-jamie.cern.ch>;
    from lk@tantalophile.demon.co.uk on Wed, Oct 25, 2000 at 07:08:48PM +0200
Sender: owner-freebsd-chat@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

On Wed, Oct 25, 2000 at 07:08:48PM +0200, Jamie Lokier wrote:

> Simon Kirby wrote:
>
> > What applications would do better by postponing some of the reading?
> > I can't think of any reason off the top of my head why an application
> > wouldn't want to read everything it can.
>
> Pipelined server.
>
> 1. Wait for event.
> 2. Read block
> 3. If EAGAIN, goto 1.
> 4. If next request in block is incomplete, goto 2.
> 5. Process next request in block.
> 6. Write response.
> 7. If EAGAIN, wait until output is ready for writing then goto 6.
> 8. Goto 1 or 2, your choice.
>    (Here I'd go to 2 if the last read was complete -- it avoids a
>    redundant call to poll()).
>
> If you simply read everything you can at step 2, you'll run out of
> memory the moment someone sends you 100000 requests.
>
> This doesn't happen if you leave unread data in kernel space --
> TCP windows and all that.

Hmm, I don't understand.

What happens at "wait until output is ready for writing then goto 6"?
You mean you would stop the main loop to wait for a single client to
unclog?  Wouldn't you just do this?

-> 1. Wait for event (read and write queued).  Event occurs: Incoming
      data available.
   2. Read a block.
   3. Process block just read: Does it contain a full request?  If not,
      queue, goto 2, munge together.  If no more data, queue beginning
      of request, if any, and goto 1.
   4. Walk over available requests in block just read.  Process.
   5. Attempt to write response, if any.
   6. Attempted write: Did it all get out?  If not, queue waiting
      writable data and goto 1 to wait for a write event.
   7. Goto 2.

Assume we got write clogged.  Some loop later:

  10. Wait for event (read and write queued).  Event occurs: Write
      space available.
  11. Write remaining available data.
  12. Attempted write: Did it all get out?  If not, queue remaining
      writable data and goto 1 to wait for another write event.
  13. Goto 2.  (If we're some sort of forwarding daemon and the
      receiving end of our forward has just unclogged, we want to read
      any readable data we had waiting.  Same with if we're just
      answering a request, though, as the send direction could still
      get clogged.)

What can't you do here?  What's wrong?  Note that the write event will
let you read any remaining queued data.  If you actually stop from
going back to the main loop when you're write clogged, you will pause
the daemon and create an easy DoS problem.  There's no way around
needing to queue writable data at least.

This is how I wrote my irc daemon a while back, and it works fine with
select().  I can't see what wouldn't work with edge-triggered events
except perhaps the write() event -- I'm not sure what would be
considered "triggered", perhaps when it goes under a watermark or
something.
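
The loop in steps 1-13 above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the irc daemon's actual code: it is written in Python rather than C for brevity, requests are assumed to be newline-delimited, and the names Conn, run_once and handle are hypothetical. A connection with unwritten data is watched only for writability, so the loop never stalls on one clogged client and never spins on readable data it is not going to consume yet.

```python
import select
import socket


class Conn:
    """Per-client state: a non-blocking socket plus queued data."""

    def __init__(self, sock):
        sock.setblocking(False)
        self.sock = sock
        self.inbuf = bytearray()   # beginning of a request not yet complete
        self.outbuf = bytearray()  # response bytes that did not get out yet

    def fileno(self):
        # Lets select() take Conn objects directly.
        return self.sock.fileno()

    def flush(self):
        """Steps 5-6 and 11-12: write what we can, keep the rest queued."""
        while self.outbuf:
            try:
                sent = self.sock.send(self.outbuf)
            except BlockingIOError:  # EAGAIN: we are write clogged
                return False
            del self.outbuf[:sent]
        return True

    def on_readable(self, handle):
        """Steps 2-5: read one block and answer each complete request."""
        try:
            block = self.sock.recv(4096)
        except BlockingIOError:
            return True
        if not block:
            return False  # peer closed the connection
        self.inbuf += block
        *requests, rest = self.inbuf.split(b"\n")
        self.inbuf = rest  # partial request, munged with the next block
        for req in requests:
            self.outbuf += handle(bytes(req)) + b"\n"
        self.flush()
        return True


def run_once(conns, handle, timeout=0.1):
    """Steps 1 and 10: one pass of the main loop over every client."""
    # Clogged clients wait for write space; everyone else waits for input.
    want_write = [c for c in conns if c.outbuf]
    want_read = [c for c in conns if not c.outbuf]
    readable, writable, _ = select.select(want_read, want_write, [], timeout)
    for c in writable:
        c.flush()
    for c in readable:
        if not c.on_readable(handle):
            conns.remove(c)
            c.sock.close()
```

A caller would wrap each accepted socket in Conn, keep them in a list, and call run_once(conns, handle) in a loop, where handle maps one request (bytes) to one response (bytes). Unread input for a clogged client simply stays in the kernel until its output drains.
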
In any case, it should all still work assuming get_events() offers the
ability to receive "write space available" events.  You don't have to
read all data if you don't want to, assuming you will get another event
later that will unclog the situation (meaning the obstacle must also
trigger an event when it is cleared).

In fact, if you did leave the read queued in a daemon using select()
before, you'd keep looping endlessly taking all CPU and never idle
because there would always be read data available.  You'd have to not
queue the descriptor into the read set and instead stick it in the
write set so that you can sleep waiting for the write set to become
available, effectively ignoring any further events on the read set
until the write unclogs.  This sounds just like what would happen if
you only got one notification (edge triggered) in the first place.

Simon-

[  Stormix Technologies Inc.  ][  NetNation Communications Inc. ]
[       sim@stormix.com       ][       sim@netnation.com        ]
[ Opinions expressed are not necessarily those of my employers. ]