Date:      Thu, 15 Jul 1999 19:36:51 +0000 (GMT)
From:      Terry Lambert <tlambert@primenet.com>
To:        unknown@riverstyx.net (Tani Hosokawa)
Cc:        davids@webmaster.com, chat@FreeBSD.ORG
Subject:   Re: Known MMAP() race conditions ... ?
Message-ID:  <199907151936.MAA02676@usr07.primenet.com>
In-Reply-To: <Pine.LNX.4.10.9907142007120.2799-100000@avarice.riverstyx.net> from "Tani Hosokawa" at Jul 14, 99 08:07:33 pm

tani hosokawa wrote:
> 
> On Wed, 14 Jul 1999, David Schwartz wrote:
> 
> > 
> > > The current model is a hybrid thread/process model, with a number of
> > > processes each with a large number of threads in each, each thread
> > > processing one request. From what I've seen, 64 threads/process is about
> > > right.  So, in one Apache daemon, you can expect to see >1000 threads,
> > > running inside 10-20 processes.  Does that count as a large number?
> > 	Yes. And it's bad design.
> 
> I'm curious.  How would you do it?

I can't speak for David, but the process architecture I did for
the NetWare for UNIX product used multiple processes (not threads)
with a single shared memory region for client context records,
and a shared file descriptor table.
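
In rough outline, the shared context setup looks something like
this (a minimal sketch, not the actual NetWare for UNIX code; the
struct, sizes, and engine count here are illustrative):

    /*
     * Sketch: a shared client-context region mapped before fork(),
     * so every worker process sees the same records, and descriptors
     * opened before fork() are shared the same way.
     */
    #include <sys/mman.h>
    #include <unistd.h>

    #define MAX_CLIENTS  1024
    #define NUM_ENGINES  4

    struct client_ctx {
            int     in_use;
            int     conn_fd;        /* shared across the fork()s */
            char    state[256];     /* per-client protocol state */
    };

    static struct client_ctx *ctx_table;

    int
    init_shared_context(void)
    {
            int     i;

            /* MAP_ANON | MAP_SHARED: one region, visible to all children. */
            ctx_table = mmap(NULL, sizeof(*ctx_table) * MAX_CLIENTS,
                        PROT_READ | PROT_WRITE, MAP_ANON | MAP_SHARED,
                        -1, 0);
            if (ctx_table == MAP_FAILED)
                    return (-1);

            for (i = 0; i < NUM_ENGINES; i++)
                    if (fork() == 0)
                            break;  /* child: would enter its work loop here */
            return (0);
    }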

This was chosen over threads for the standard context switch
thrashing reasons, the lack of threads support on one of our
reference platforms, the inability to autogrow the thread
stacks (even though Steve Baumel put the capability into the
SVR4.2 VM system, it was not utilized by the threads people),
and, finally, the ability to do "hot engine scheduling".

This last used a streams mux to arbitrate, in LIFO order,
incoming packets to the "hottest" work-to-do engine, on the
theory that it would be most likely, of all engines, to have
its pages in core (remember that SVR4.2 did not have a unified
VM and buffer cache, though this was equally applicable to
data pages).
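
The dispatch policy itself is trivial; pulled out of the mux and
written as plain C for illustration (the names here are made up),
it amounts to a stack of idle engines:

    /*
     * Sketch: LIFO ("hottest engine first") dispatch.  The engine
     * that went idle most recently is the one most likely to still
     * have its pages in core, so it gets the next packet.
     */
    #define MAX_ENGINES     16

    static int      idle_stack[MAX_ENGINES];
    static int      idle_top;               /* number of idle engines */

    void
    engine_idle(int engine_id)              /* engine finished a work item */
    {
            idle_stack[idle_top++] = engine_id;
    }

    int
    dispatch_packet(void)
    {
            if (idle_top == 0)
                    return (-1);             /* all engines busy */
            return (idle_stack[--idle_top]); /* LIFO: hottest engine */
    }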

Using a threads implementation would have resulted in each
kernel thread engaging in paging operations, generally at the
expense of other kernel threads' data.

Finally, by using a shared user space context, the process
context switch overhead did not go up at all, assuming that
you had dedicated this machine as a server: the same engine
was run repeatedly, with the other engines only coming active
when there was sufficient load to merit their participation
based on I/O interleaving (this was a decision of the MUX,
which also knew how to automatically ACK -- Novell calls this
a "server busy" -- requests from a client with a request in
progress).
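
The "server busy" decision reduces to a one-field check against the
client's context record; as an illustrative sketch (again, not the
real mux code, and with made-up names):

    /*
     * Sketch: a packet from a client that already has a request in
     * progress gets an automatic "server busy" ACK instead of being
     * dispatched to an engine.
     */
    struct client_slot {
            int     request_pending;
    };

    enum disposition { DISPATCH_TO_ENGINE, SEND_SERVER_BUSY };

    enum disposition
    classify_request(struct client_slot *c)
    {
            if (c->request_pending)         /* one request per client */
                    return (SEND_SERVER_BUSY);
            c->request_pending = 1;         /* cleared when the reply goes out */
            return (DISPATCH_TO_ENGINE);
    }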

So... I personally would use an anonymous work-to-do engine model,
shared memory, either via a single process with N:M kernel/user
threads (M >= N) or via multiple processes with shared context
for representing client state, and asynchronous I/O for I/O
interleaving, probably using mmap'ed regions for static content
to zero-copy the writes, with a lazy discard policy utilizing LRU.
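
For the static content piece, the shape is roughly this (a sketch
with made-up names; it also leaves out the LRU cache of mappings
that a real server would keep instead of unmapping every time):

    /*
     * Sketch: serve a static file out of an mmap'ed region, so the
     * data never gets read() into a private buffer first.
     */
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <fcntl.h>
    #include <unistd.h>

    ssize_t
    send_static_file(int client_fd, const char *path)
    {
            struct stat     st;
            ssize_t         n;
            void            *p;
            int             fd;

            if ((fd = open(path, O_RDONLY)) == -1)
                    return (-1);
            if (fstat(fd, &st) == -1) {
                    close(fd);
                    return (-1);
            }
            p = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
            close(fd);                      /* mapping survives the close */
            if (p == MAP_FAILED)
                    return (-1);

            n = write(client_fd, p, st.st_size);

            /* LRU version: keep the mapping cached, munmap() lazily. */
            munmap(p, st.st_size);
            return (n);
    }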

The benefit to the anonymous work-to-do engine is that you
can use platform-appropriate technology to implement your
shared context region, be that a SYSV shared memory segment,
an mmap'ed file, a vfork shared process context, or global
memory in a threads-based system.
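
For example, the SYSV shared memory flavor of the same context
region is only a few lines different from the mmap() version above
(again a sketch; the key and size are illustrative):

    /*
     * Sketch: SYSV shared memory variant.  Every process that
     * attaches with the same key sees the same client records.
     */
    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    #define CTX_REGION_SIZE (1024 * 1024)

    void *
    attach_context_region(key_t key)
    {
            void    *p;
            int     id;

            id = shmget(key, CTX_REGION_SIZE, IPC_CREAT | 0600);
            if (id == -1)
                    return (NULL);
            p = shmat(id, NULL, 0);
            if (p == (void *)-1)
                    return (NULL);
            return (p);
    }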

Sure, it's a little more thoughtful work to get right, but our
code (compiled C) *did* outperform Native NetWare (hand-coded
assembly with non-preemptive cooperative multitasking: a true
embedded system if ever there was one) on identical hardware.

8-).


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.





