Date:      Tue, 28 May 1996 21:15:44 EST
From:      "Kaleb S. KEITHLEY" <kaleb@x.org>
To:        Terry Lambert <terry@lambert.org>
Cc:        hackers@freefall.freebsd.org
Subject:   Re: Forgiving select() call. 
Message-ID:  <199605290115.VAA20117@exalt.x.org>
In-Reply-To: Your message of Tue, 28 May 1996 15:36:02 EST. <199605282236.PAA12003@phaeton.artisoft.com> 


> > > Trace it with the SunOS 4.1.3 trace instead.  It will say "select".
> > 
> > Sorry Charlie. No workie. If it ever did. A nm dump on /usr/4lib/libc.so.1.8
> > says select is UNDEF. ldd on the same file shows a dependency on 
> > /usr/lib/libc.so.1. As far as I can tell, and as far as the native mode
> > truss shows, it all comes down to poll. Ain't no select(2) system call
> > in Solaris.
> > 
> > > 
> > > I suspect that there are different sysent[] tables in use, just
> > > like the ABI support in FreeBSD uses.
> > 
> > 
> > Anecdotal evidence is no evidence.
> 
> Trace it on a SunOS 4.1.3 *box*.


Obviously it will say select on a 4.1.3 box. So what? The question
that has led us down this primrose path is whether Solaris has select.


> Note "#define SYS_select 93" in /usr/include/syscall.h on your 4.1.3 box.
> Note "#define SYS_poll 153" in /usr/include/syscall.h on your 4.1.3 box.
> 
> Note "man select" returning "select(2)".
> Note "man poll" returning "poll(2)".
> 
> Note "man 3 select" returning "not found".


Yeah, yeah, yeah. Tell me something I don't know.


> Select in 4.1.3 is a system call.


No duh. Hey everyone, select in 4.1.3 is a system call.


> A statically linked 4.1.3 binary will trap the select(2) stub through
> trap entry 93.


What evidence do you have that it's even trapping? Everything could be 
handled in the library and truss could be telling the truth -- that it's 
using poll. We do know how truss works. It traps, and inspects the
processor state to see what triggered the trap.


> A 4.1.3 binary calling "syscall(SYS_select, nfds, rfds, wfds, xfds, tvp)"
> will trap through entry 93.
> 
> 
> A Solaris "5.x" binary calling through the manifest address for SYS_select
> via "syscall(93, nfds, rfds, wfds, xfds, tvp)" will trap through entry 93.


Well, if it were calling through entry 93, and it printed fchmod (which
is what syscall 93 is on Solaris) instead of poll, then I might believe
you that truss is confused.


> That's the way it is.
> 
> If truss lies to you... too bad.


Umh, yeah, right. I'm from Missouri. Truss says it's poll. Black. White.
Black. White. I see where this is going.


> > What is it you think I'm suggesting? All I've said is that poll(2) (and
> > polltv(2)) have their merits. At no time have I suggested that select(2)
> > should be replaced with a select(3) that's implemented with poll(2).
> 
> Poll's merits are on the basis of an improper implementation of select.


Well, that's the first time *that* has come up. Pray tell, now what's
wrong with select?


> > > The set size limitations are a result of the FD_SETSIZE, an advisory
> > > value, being used as a copyin limit in kernel space.
> > > 
> > > In point of fact, the limit is derivable from the nfds argument to
> > > select.  
> > 
> > As long as it's <= FD_SETSIZE.
> 
> Bunk.  If the kernel uses FD_SETSIZE at all, it's broken.  Poll has
> limits on the number of fd's as well; they are the same limits as
> select imposes (signed int).
> 
> 
> > > Use a -current kernel instead of a 2.1.0R kernel and I
> > > believe this has been fixed for you.
> > 
> > Yeah, well, I bet there's a lot of things the release-du-hour kernel
> > does. I already have one system I seem to compile on an hourly basis.
> > I don't need another.
> 
> Well, that's the fix.  It's not like 2.1R would suddenly grow a poll()
> call if you fixed it -- it would go into -current, if it went in, and
> you'd still have your "release-du-hour" problem staring you in the
> face.


Sigh. If I implement poll on my 2.1R box, I have it in 2.1R. If I 
finish it and contribute it (which is my intention) it'll be integrated
into -current if someone on the core team decides to adopt it. But I
don't have to get -current to have it. But we digress.


> > C'mon Terry, what are you saying? It has to be directly addressable
> > in order for copyin/copyout to work. Half the 2.1R kernel still uses
> > memcpy or bcopy instead of copyin/copyout. McKusick's new book explains 
> > how this works, quite eloquently I think.
> 
> ???
> 
> I don't see this... unless the area is mmapped, it doesn't have a
> kernel PTD.


Lessee. The process is making a system call. The process is passing
parameters. I dunno Terry, seems to me like the area has to be mapped. 
Copyin/copyout looked pretty simple to me. The explanation in the 4.4BSD 
book seemed pretty simple too. Do you have a different explanation? But
we're still digressing. Passing parameters in a syscall isn't rocket
science, and it's hardly the topic at hand.


> > > The only
> > > overhead it saves is the mask reset overhead in user space, which
> > > it trades for bit clearing overhead in kernel space.  Big deal.
> > 
> > It turns into a big deal if I want to select for read, write, and
> > except on file descriptor 254. That's a lot of copying for one file
> > descriptor. Imagine what it'd be like if FD_SETSIZE was 1024 or 4096
> > like it is on a lot of systems. And using poll I don't have to do *any* 
> > of the bit twiddling in user space or in kernel space.
> 
> I can make the same argument about traversing the array; the original
> complaint about ~250 was an IRC server that was running over 256
> entries and using an old kernel before the FD_SETSIZE was ripped out
> of the kernel space.
> 
> For 250 fd's, traversing a 250 entry array in kernel space in order
> to do the job is at least as annoying.

Annoying isn't the adjective I would use to describe the instructions 
necessary to bit-twiddle 250 file descriptors out of one or more fd_sets. 
Extracting just the part of the code that does this from 2.1R sys_generic.c, 
and inlining the ffs call results in nearly 70 instructions. Okay, so
maybe it's not fair to inline. Without the inline it's 60 instructions
and the overhead of the function call. And it doesn't really matter whether
there are two fds or 2000 fds, it's the same 70 instructions.

The equivalent code to extract the fds from an array needs five instructions. 
You figure it out.
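
To make the comparison concrete, here is a minimal sketch of the two
extraction styles. It is not the 2.1R sys_generic.c code; the function
names, the WORD_BITS macro, and the use of unsigned int as the mask word
are assumptions made for illustration only.

```c
#include <assert.h>
#include <poll.h>
#include <strings.h>   /* ffs() */

/* Hypothetical mask word width; a real fd_set uses its own fd_mask type. */
#define WORD_BITS ((int)(sizeof(unsigned int) * 8))

/*
 * Bitmask style: walk every word that covers descriptors 0..nfds-1 and
 * peel off the set bits one at a time with ffs().
 * Returns the number of descriptors written to out[].
 */
static int fds_from_bitmask(const unsigned int *words, int nfds, int *out)
{
    int found = 0;
    int nwords = (nfds + WORD_BITS - 1) / WORD_BITS;

    assert(nfds >= 0);
    for (int i = 0; i < nwords; i++) {
        unsigned int bits = words[i];
        int j;

        while ((j = ffs((int)bits)) != 0) {   /* ffs() is 1-based */
            int fd = i * WORD_BITS + (j - 1);

            bits &= ~(1u << (j - 1));         /* clear the bit just found */
            if (fd < nfds)
                out[found++] = fd;
        }
    }
    return found;
}

/*
 * Array style: the descriptors are already listed, so extraction is a
 * load and a store per entry.
 */
static int fds_from_array(const struct pollfd *pfds, int n, int *out)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        out[i] = pfds[i].fd;
    return n;
}
```

With descriptors 3 and 254 set, the bitmask version still has to visit all
eight words below nfds = 255, while the array version touches two entries.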


> For something only doing one or two descriptors, it doesn't *need*
> the minimalized overhead that something that has to be responsive
> to 250 clients needs.


Are you making assumptions about how many of these two-file-descriptor
programs are running?
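
For the one-descriptor case, a rough sketch of what each interface asks
of the caller when the descriptor of interest is a high-numbered one such
as 254. The helper names (readable_select, readable_poll, mask_bytes_for,
pipe_read_end_at) are hypothetical, and mask_bytes_for assumes a kernel
that sizes its copy from nfds using unsigned long mask words.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* 1 if fd is readable now per select(), 0 if not, -1 on error. */
static int readable_select(int fd)
{
    fd_set rfds;
    struct timeval tv = { 0, 0 };          /* do not block */

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rfds);                        /* clears the whole set */
    FD_SET(fd, &rfds);
    if (select(fd + 1, &rfds, NULL, NULL, &tv) < 0)
        return -1;
    return FD_ISSET(fd, &rfds) ? 1 : 0;
}

/* Same question asked with poll(): one array entry, no bit masks. */
static int readable_poll(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };

    if (poll(&pfd, 1, 0) < 0)              /* timeout 0: do not block */
        return -1;
    return (pfd.revents & POLLIN) ? 1 : 0;
}

/*
 * Bytes per set if the copy is sized from nfds rather than FD_SETSIZE.
 * Assumption: mask words are unsigned long.
 */
static size_t mask_bytes_for(int nfds)
{
    size_t word_bits = sizeof(unsigned long) * 8;

    assert(nfds >= 0);
    return ((size_t)nfds + word_bits - 1) / word_bits * sizeof(unsigned long);
}

/*
 * Test helper: create a pipe and move its read end to descriptor `target`.
 * Assumes the pipe's own descriptors are both below `target`.
 * Returns target on success, -1 on failure.
 */
static int pipe_read_end_at(int target, int *write_end)
{
    int p[2];

    if (pipe(p) != 0)
        return -1;
    if (p[0] != target) {
        if (dup2(p[0], target) < 0) {
            close(p[0]);
            close(p[1]);
            return -1;
        }
        close(p[0]);
    }
    *write_end = p[1];
    return target;
}
```

Watching descriptor 254 for read, write, and except with select means
three sets covering 255 bits each; with poll it is one struct pollfd with
the events of interest OR'ed together.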

--

Kaleb KEITHLEY


