Date:      Thu, 13 Mar 2003 14:00:49 -0500
From:      David Cuthbert <dacut@kanga.org>
To:        hackers@freebsd.org
Subject:   Re: first parameter to select
Message-ID:  <3E70D561.1080001@kanga.org>
In-Reply-To: <20030313083710.GA8225@cirb503493.alcatel.com.au>
References:  <3E702BCC.3030208@kanga.org> <20030313083710.GA8225@cirb503493.alcatel.com.au>

Peter Jeremy wrote:
> On Thu, Mar 13, 2003 at 01:57:16AM -0500, David Cuthbert wrote:
>>To be honest, I've never passed anything but FD_SETSIZE for this 
>>parameter.  When I'm writing a performance critical server, I use poll() 
>>instead.  It's faster
> 
> This is an interesting claim.  Do you have some pointers to back it up?
> It would seem to be rather unreasonable to claim that poll() is faster
> when (by your own admission) you've never used select() efficiently.
> I could equally say that I always pass getdtablesize() as the second
> argument of poll() and if I'm writing a performance-critical server,
> I use select() instead - it's faster.

Admittedly, my experience is dated and refers to implementing servers on 
Solaris.  The man page for select(3c) on Solaris explicitly states:

"The poll(2) function is preferred over this function.  It must be used 
when the number of descriptors exceeds FD_SETSIZE."
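
For reference, a minimal sketch of what a tight first argument to
select() looks like, and of where the FD_SETSIZE limit from the man page
bites.  The helper names build_read_set() and wait_readable() are made up
for illustration; they are not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Hypothetical helper: fill `set` with the descriptors in fds[0..count)
 * and return the value to pass as select()'s first argument (highest
 * descriptor + 1), or -1 if a descriptor does not fit in an fd_set.
 */
int build_read_set(const int *fds, size_t count, fd_set *set)
{
    int maxfd = -1;

    assert(fds != NULL && set != NULL);
    FD_ZERO(set);
    for (size_t i = 0; i < count; i++) {
        /* FD_SET() on a descriptor >= FD_SETSIZE is undefined behaviour;
         * this is the case where poll() has to be used instead. */
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE)
            return -1;
        FD_SET(fds[i], set);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    return maxfd + 1;
}

/*
 * Hypothetical helper: wait up to timeout_ms for any descriptor to become
 * readable.  Returns select()'s result, or -1 if the set could not be
 * built.  On return, `ready` holds only the readable descriptors.
 */
int wait_readable(const int *fds, size_t count, fd_set *ready, int timeout_ms)
{
    struct timeval tv;
    int nfds = build_read_set(fds, count, ready);

    if (nfds < 0)
        return -1;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    /* nfds rather than FD_SETSIZE: only bits 0..nfds-1 are of interest. */
    return select(nfds, ready, NULL, NULL, &tv);
}
```

The set has to be rebuilt before every call because select() overwrites
it with the result, which is why the two steps are bundled together here.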

I'll profess my ignorance: I have no idea how well this maps to FreeBSD.
On Solaris (well, with whatever combination of patchlevels we and our
customers were running), you could time the difference with a stopwatch.

Though we did have some combinations of patchlevels where I'd swear it 
was faster if we communicated by telegraph. :-)

(Just now seeing the benchmark posted... hm... it's possible that I'm 
saying poll() and we ended up using /dev/poll...)

> In virtually all cases, poll() will need to copy more data in and out
> of the kernel than select() would.  Likewise, in virtually all cases,
> select() will need to scan more file descriptors than poll() does.
> The overall performance comes down to the relative cost of copying
> data vs testing bits.

Not sure what you mean by virtually all cases.

Given that a struct pollfd is 8 bytes (an int plus two shorts) and an 
fd_set is usually at least 128 bytes (does select() copy the entire 
fd_set?  I believe this is the case, but don't have access to the source 
atm), poll() copies less data as long as you have fewer than 16 
descriptors.
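
The exact figures depend on the platform, so they are better measured
than quoted from memory.  A small sketch of the arithmetic follows; the
function names are made up, and the "whole fd_set gets copied" model is
the assumption from the paragraph above, not a verified fact about any
particular kernel.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/select.h>

/* Bytes handed to poll() for nfds descriptors: one struct pollfd each. */
size_t poll_bytes(size_t nfds)
{
    return nfds * sizeof(struct pollfd);
}

/*
 * Bytes in `nsets` complete fd_sets.  select() takes up to three sets
 * (read, write, except).  ASSUMPTION: each set is copied whole.
 */
size_t select_bytes(size_t nsets)
{
    assert(nsets <= 3);
    return nsets * sizeof(fd_set);
}

/*
 * Largest descriptor count for which poll() passes no more bytes than
 * `nsets` whole fd_sets would.
 */
size_t break_even_fds(size_t nsets)
{
    return select_bytes(nsets) / sizeof(struct pollfd);
}
```

Printing break_even_fds(1) and break_even_fds(3) on the target system
gives the crossover for the read-only case and the all-three-sets case
under that assumption.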

We usually had no more than 6 connections (these were compute servers, 
not web servers), so YMMV.  At any rate, I'd argue that the time 
occupied by copying/scanning/setting is far overshadowed by the time 
spent in I/O.

> select() is the older mechanism, developed for BSD.  poll() is the
> SystemV equivalent - and is therefore blessed by SVID etc.  (My guess
> is that the API difference is more due to NIH than any technical
> justification).

Heh.  Yeah, there's definitely a case of "we know X better and thus 
focused all our optimisation efforts on X" in operation here.

I'd still argue that, when porting a program across platforms, the 
behaviour of select() is likely to be more consistent than that of 
poll(), with the differences usually (always?) showing up in the cases 
where something unusual has occurred on a descriptor.
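
To make "something unusual" concrete: poll() reports these conditions
through separate revents bits, and, as far as I know, which combination
of POLLIN and POLLHUP shows up when the peer goes away is one of the
things that varies between systems.  A hedged sketch of handling that
defensively; the enum and function names are invented for illustration.

```c
#include <assert.h>
#include <poll.h>

/* Hypothetical classification of what poll() reported for a descriptor. */
enum fd_state { FD_IDLE, FD_READABLE, FD_HANGUP, FD_ERROR, FD_INVALID };

enum fd_state classify_revents(short revents)
{
    if (revents & POLLNVAL)
        return FD_INVALID;      /* descriptor is not open */
    if (revents & POLLERR)
        return FD_ERROR;
    /* Data may still be queued after a hangup, so readable wins:
     * the caller drains the descriptor and sees EOF from read(). */
    if (revents & POLLIN)
        return FD_READABLE;
    if (revents & POLLHUP)
        return FD_HANGUP;
    return FD_IDLE;
}

/* Poll a single descriptor for input and classify the outcome. */
enum fd_state poll_one(int fd, int timeout_ms)
{
    struct pollfd pfd;

    assert(fd >= 0);            /* poll() silently ignores negative fds */
    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    if (poll(&pfd, 1, timeout_ms) < 0)
        return FD_ERROR;
    return classify_revents(pfd.revents);
}
```

With select() the same situations collapse into "the descriptor is
readable" (or the call failing with EBADF for a closed descriptor), so
there are fewer cases for platforms to disagree on.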

