From owner-freebsd-current Mon Oct 16 11:37:33 1995
Return-Path: owner-current
Received: (from root@localhost) by freefall.freebsd.org (8.6.12/8.6.6)
	id LAA16880 for current-outgoing; Mon, 16 Oct 1995 11:37:33 -0700
Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211])
	by freefall.freebsd.org (8.6.12/8.6.6) with ESMTP id LAA16865 ;
	Mon, 16 Oct 1995 11:37:25 -0700
Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9)
	id LAA25019; Mon, 16 Oct 1995 11:31:03 -0700
From: Terry Lambert
Message-Id: <199510161831.LAA25019@phaeton.artisoft.com>
Subject: Re: getdtablesize() broken?
To: bakul@netcom.com (Bakul Shah)
Date: Mon, 16 Oct 1995 11:31:02 -0700 (MST)
Cc: terry@lambert.org, bde@zeta.org.au, hackers@freefall.freebsd.org,
	rdm@ic.net, current@freefall.freebsd.org
In-Reply-To: <199510150649.XAA15664@netcom15.netcom.com> from "Bakul Shah" at Oct 14, 95 11:49:24 pm
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Length: 3582
Sender: owner-current@FreeBSD.org
Precedence: bulk

> > The correct limit on the largest number is FD_SETSIZE, as defined in
> > sys/types.h.
>
> IMHO limiting the fdset bitarray size like this *within* the
> kernel is a mistake.  I have an application where I run into
> this and am forced to use a multi process solution.  Imagine
> a server handling > FD_SETSIZE (i.e. 256) TCP connections
> to clients -- requests are not all that frequent and each
> takes just a little bit of time to service so they *can* all
> be handled by one process easily.  A multi process solution
> gets complicated (need to put shared state in shared memory,
> use locking etc.) and slower (extra context switches,
> lock/unlock time).
>
> Using a limit of FD_SETSIZE does not buy you extra
> protection or anything.  RLIMIT_NOFILE is the right limit to
> check against in kern/sys_generic.c:select().
> Mercifully this limit is changeable via sysctl so server machines
> can up it.  NetBSD, FreeBSD, Linux and maybe even bsdi (I
> haven't checked recently) are all guilty here.  Small upper
> limits are another thing that separates PeeCees from serious
> server machines.
>
> Let me say this another way.  If I can create N files, I
> should damn well be able to select() on any one of them.

You should damn well be able to use fopen() instead of open() to
get at them as well.  Checked out the stdio libraries on every OS
from FreeBSD to Solaris to VMS lately?  Stdio has a hard limit,
generally by use of a character to store the fd.

The problem is the same as the "unlimited open FD's" problem: stupid
programs like BASH that are too lazy to "chase their tail" on
descriptor assignments are going to go to the top end and "work down"
on the assumption that they can split the FD domain into two name
spaces.

Using FD_SETSIZE or using RLIMIT_NOFILE, or getting the current open
descriptor limit from the kernel (when the application has an explicit
bitvector element limitation set at compile time, not because of a
check for FD_SETSIZE but because the declaration of a bit vector uses
FD_SETSIZE to declare the vector length) is equally stupid and
equally susceptible to screwups.

The "correct" way is to get rid of the interfaces which are susceptible
to bad programming style.

The user *SHOULD* be using the highest open FD in the set of FD's
being selected upon, *NOT* some arbitrary constant.  Select's FD count
should *NEVER* be coded as a constant.

Poll, anyone?

Poll is inferior to select, both because of the 10ms limit on timeout
resolution and because select is often used with no descriptors to get
a timeout and poll can't be used in this mode.  Arguably one would want
to use this type of timeout mechanism in place of interval timers to
get both interval timers and an I/O timeout at the same time.
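As a rough illustration of the select-as-timer idiom mentioned above,
here is a minimal sketch in C.  The helper name sleep_ms is
hypothetical, and the real granularity of the delay depends on the
kernel, so treat this as a sketch under those assumptions rather than
a definitive implementation.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Sleep for roughly `ms` milliseconds by calling select() with no
 * descriptor sets at all.  Returns 0 when the timeout expires and -1
 * on error (for example EINTR if a signal arrives, or EINVAL if `ms`
 * is negative).  sleep_ms is a hypothetical name used for illustration.
 */
int sleep_ms(long ms)
{
    struct timeval tv;

    tv.tv_sec = ms / 1000;            /* whole seconds */
    tv.tv_usec = (ms % 1000) * 1000;  /* remainder in microseconds */

    /* A count of 0 and three NULL sets: only the timeout is in play. */
    return select(0, NULL, NULL, NULL, &tv);
}
```

Calling sleep_ms(250) should block for about a quarter of a second.
Replacing the NULL sets with real descriptor sets gives an I/O wait
and a timer in the same call.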
The correct thing to do for BASH is to push the open FD up by 10 or so
each time a collision occurs and to maintain an internal mapping table
so that the FD used for the script being executed moves out of the way
of collisions caused by explicit descriptor references in the script
itself.

Similarly, the correct way to use select is to use the highest open
descriptor as the highest open descriptor instead of some stupid
arbitrary value (even if it does come from the kernel) and to take the
min() of FD_SETSIZE and that value, because if you don't, you are going
to run off the end of the bit vector and SIGSEGV (best case) or, if you
run into mapped memory, potentially destroy data (worst case).


					Terry Lambert
					terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.
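
The descriptor-count advice above might be sketched as follows in C.
The helper name wait_readable and its argument layout are assumptions
made for illustration.  Instead of clamping the count with min(), this
sketch uses a stricter variation: it refuses any descriptor that does
not fit in an fd_set, so FD_SET() never writes past the end of the bit
vector, and it passes the highest descriptor plus one as the count.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/*
 * Wait until one of fds[0..n-1] is readable or timeout_ms elapses.
 * Returns whatever select() returns, or -1 without calling select()
 * if any descriptor is negative or >= FD_SETSIZE.  When select() is
 * called, *ready is left holding the set of readable descriptors.
 * wait_readable is a hypothetical name used for illustration.
 */
int wait_readable(const int *fds, size_t n, long timeout_ms, fd_set *ready)
{
    struct timeval tv;
    int maxfd = -1;
    size_t i;

    FD_ZERO(ready);
    for (i = 0; i < n; i++) {
        /* Guard the bit vector: out-of-range fds are rejected. */
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE)
            return -1;
        FD_SET(fds[i], ready);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* The count is derived from the set itself, never a constant. */
    return select(maxfd + 1, ready, NULL, NULL, &tv);
}
```

With an empty list the count is 0 and the call degenerates into a
plain timeout, which matches the no-descriptor use of select.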