FreeBSD Mail Archives

Date:      Wed, 22 Dec 1999 16:09:55 -0800 (PST)
From:      Matthew Dillon <dillon@apollo.backplane.com>
To:        Jason Nordwick <nordwick@xcf.berkeley.edu>
Cc:        Jonathan Lemon <jlemon@americantv.com>, cmsedore@maxwell.syr.edu, hackers@FreeBSD.ORG
Subject:   Re: Practical limit for number of TCP connections? 
Message-ID:  <199912230009.QAA11643@apollo.backplane.com>
References:   <19991222234853.5770.qmail@scam.xcf.berkeley.edu>

:>>it worked basically the same way on the kernel end.
:>
:>Yes, it is better.  Select uses the same backend as poll(), but those
:>"bit flag manipulations" that you are talking about consume a measurable
:>amount of CPU time when you start throwing thousands of descriptors at it.
:>--
:>Jonathan
:>
:
:I cannot believe that this is true.  It would seem, to me, that it is highly
:dependant on the density of the interest bit vector of select().  If it is
:very dense, then select is more appropriate.  Obviously, if you are only
:interested in fd# 5 and fd# 1000, then, yes, there is unecessary scanning.
:However is you are interested in 90% of the descriptors from 5 to 1000, then
:all the scanning is necessary.  (you can also, obviously, look through 32
:descriptors at a time).
:
:Also, doesn't the kernel need to copy the argument arrays into kernel space.
:A struct pollfd is 8 bytes per interested descriptor, whereas, select() only
:needs 1 bit per interested descriptor.  If I am interested in 1000 descriptors
:(assuming in a row), that is 8000 bytes to copyin, versus 125 bytes.
:
:Can somebody who knows better explain, please?
:
:-jason

    It's a real problem, actually.  I'll give you an example:  Generally 
    speaking every time you do a select() call with a thousand descriptors,
    the kernel must test and setup its wait on a thousand descriptors even
    if only one (or none) has pending data.

    The moment any single descriptor has data the entire select returns,
    you wind up processing the one descriptor, then reentering the select.

    The problem is that, typically, only one descriptor will have new data at
    a time (since, typically, the data coming in via your network is 
    serialized because you only have one network interface), which means that
    for light to medium loads you end up with O(N^2) of overhead (scanning 
    1000 descriptors each time one of them gets a hit).  Even though select()
    is fast and efficient by itself, the N-squaredness of the actual select
    activity results in significant (noticeable) overhead.

    Fortunately the problem is much reduced in heavy-load situations, since 
    more descriptors are likely to have data ready on them by the time your
    program gets around to calling select() again after processing the previous
    bunch.  Even so, select() is not ideal because it tends to plateu the
    cpu useage beyond a certain point - an ill-use of your available cpu.

    The typical solution to this problem is to design your program to be
    multi-fork/select - that is, you take the 1000 descriptors and you dole 
    them out to, say, 30 processes.  Each process select()s on only 33 
    descriptors, greatly reducing the order overhead of the algorithm.

    There is a second problem that programmers must be sure to avoid, and
    that is having multiple processes select()ing on the *same* descriptor
    (for example, a listen socket for accept).  Most select() implementations
    wind up waking up all the processes for every new connection (and, in
    fact, it could be argued that this is correct since you do not know what 
    any given process will do with the results from the select()).  One has
    the same problem when using multiple processes to select() on a single
    UDP socket for data.

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>



To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-hackers" in the body of the message

Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?199912230009.QAA11643>

Header And Logo

Peripheral Links

Site Navigation

Header And Logo

Peripheral Links

Search

Site Navigation