Date:      Mon, 17 Mar 1997 22:35:23 -0800
From:      "Jordan K. Hubbard" <jkh@time.cdrom.com>
To:        hackers@freebsd.org
Subject:   Whee!  jkh adds his first syscall...
Message-ID:  <2273.858666923@time.cdrom.com>

The subject alone should have the kernel hackers all running for
shelter at this point - "Aigh!  He's looking in /usr/src/sys!"
they yell.  "Somebody stop him!!" :-)

Well, OK, maybe I have to confess that I've just always wanted to see
what would be involved in adding a system call, and in particular
*this* system call in order that I might implement a long-standing
wishlist item of mine with redirection and piping (I've got the first
part of this, but not the second yet).

The system call:

     int
     dup3(int oldd, pid_t tpid, int newd)

     In dup3(), a target process ID tpid is given, and the new
     descriptor newd is interpreted in the context of that process:
     oldd from the current process is duplicated onto newd in the
     target.  If newd was already assigned to a valid file there,
     that file is returned as a new file descriptor in the current
     process context; otherwise -1 is returned.  If the returned file
     descriptor is not needed then it should be closed.  The primary
     purpose of dup3() is to allow "splicing" of I/O in
     already-running processes.
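
To make the semantics above concrete, here is a small user-space model of the splice. It is only a sketch: struct model_file, struct model_proc, MODEL_NFILES and model_dup3() are hypothetical names invented for illustration, the table size is fixed, and the reference counting is simplified (the displaced file's reference simply moves to the caller). It is not the kernel code from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NFILES 8              /* hypothetical fixed table size */

/* Hypothetical stand-in for an open file with a reference count. */
struct model_file {
    int refs;
};

/* Hypothetical stand-in for one process's descriptor table. */
struct model_proc {
    struct model_file *files[MODEL_NFILES];
};

/* Return the lowest free slot in a table, or -1 if the table is full. */
static int model_lowest_free(const struct model_proc *p)
{
    for (int i = 0; i < MODEL_NFILES; i++)
        if (p->files[i] == NULL)
            return i;
    return -1;
}

/*
 * Model of dup3(oldd, tpid, newd): install self's oldd as target's newd.
 * If newd was open in the target, the displaced file is handed back to the
 * caller in a fresh slot and that slot number is stored in *ret; otherwise
 * *ret is set to -1.  Returns 0 on success, -1 on a bad descriptor or a
 * full table.
 */
static int model_dup3(struct model_proc *self, int oldd,
                      struct model_proc *target, int newd, int *ret)
{
    assert(self != NULL && target != NULL && ret != NULL);
    if (oldd < 0 || oldd >= MODEL_NFILES || self->files[oldd] == NULL)
        return -1;
    if (newd < 0 || newd >= MODEL_NFILES)
        return -1;

    *ret = -1;
    struct model_file *displaced = target->files[newd];
    if (displaced != NULL) {
        int slot = model_lowest_free(self);
        if (slot < 0)
            return -1;
        self->files[slot] = displaced;  /* reference moves to the caller */
        *ret = slot;
    }
    target->files[newd] = self->files[oldd];
    target->files[newd]->refs++;        /* target now holds a reference too */
    return 0;
}
```

In the `bg > make.out` case below, self would be the shell, oldd its freshly opened make.out on descriptor 1, and target the suspended job; the descriptor handed back is the job's old stdout, which the shell closes.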

Yes, I know many will look at the name and go "yuck!" - it's half in
jest, OK?  :)

So what's the use of it?  To use in things like shells, so that you
can do stuff like this:

# make world
<chatter chatter chatter - oh crap, I wanted this in a file>
^Z%1 Suspended
# bg > make.out
# <your shell is back - output spooling into make.out>

This also works with fg, so you can foreground and redirect stdin,
stdout and stderr at the same time just as easily.

The patches here to sh implement this extra behavior, conditionalized
on HAVE_DUP3.
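
The shell-side bookkeeping in the patch amounts to a per-descriptor count of active redirections for fds 0-2. A minimal sketch follows; fd_redirected[] and fd_redirected_p() follow the patch, while note_redirect(), note_unredirect() and fds_to_splice() are hypothetical helpers added here purely for illustration.

```c
#include <assert.h>

/* Nesting counts of active redirections for fds 0-2, as in the patch. */
static int fd_redirected[3];

/* Hypothetical helper: record that a redirection was applied to fd. */
static void note_redirect(int fd)
{
    if (fd >= 0 && fd <= 2)
        fd_redirected[fd]++;
}

/* Hypothetical helper: record that a redirection on fd was undone. */
static void note_unredirect(int fd)
{
    if (fd >= 0 && fd <= 2) {
        /* An unbalanced pop would be a bookkeeping bug. */
        assert(fd_redirected[fd] > 0);
        fd_redirected[fd]--;
    }
}

/* True if fd is one of 0-2 and currently has at least one redirection. */
static int fd_redirected_p(int fd)
{
    if (fd >= 0 && fd <= 2)
        return fd_redirected[fd] != 0;
    return 0;
}

/*
 * Hypothetical helper: bitmask of the standard descriptors that fg/bg
 * would need to splice into the job; bit n set means fd n.
 */
static unsigned fds_to_splice(void)
{
    unsigned mask = 0;

    for (int fd = 0; fd <= 2; fd++)
        if (fd_redirected_p(fd))
            mask |= 1u << fd;
    return mask;
}
```

A count rather than a flag is used because redirections can nest; a descriptor only stops counting as redirected once every level has been popped.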

Now please note: This really is just proof-of-concept material here
for 3 big reasons:

1. Thwapping over another process's file descriptors is rude,
   and it generally confuses things like the stdio library 
   to do this.  It seems to mostly work OK in my implementation,
   but I'm sure some sort of "invalidate current buffered fd contents"
   hack would have to be added to stdio if you wanted to make it
   all work correctly (try redirecting stdin, for example, and see
   the slightly odd behavior it has now).

2. The hacks to the shell are exceedingly minimal, and don't implement
   the more complicated and useful cases like:

	fg | more
	or
	yes | fg

   Note that it should be perfectly possible, but you'd have to
   change the way the shell handles these builtins pretty substantially
   to make it work.  Making redirection happen was easy. :-)

   The changes also only cover /bin/sh, which of course almost nobody
   uses.  If we actually manage to make a useful facility out of
   this, someone would also have to beat on bash and tcsh.

3. I'm sure there is at least one blatant security hole opened by
   this mechanism, and I do *ONLY THE MOST MINIMAL* checks for 
   security.  More specifically, I compare the euids of from and to,
   refusing the dup3() if they don't match (unless the current euid is 0).
   This is a very minimal test, and I don't even test for proper
   parent/child relationship in the non-root case.
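
The check described in point 3 can be sketched as a pure function. This is an illustration only: struct proc_ident and dup3_may_modify() are hypothetical names, and the optional parent/child test is the one the text says is missing from the patch, included here as an assumption about what it might look like.

```c
#include <errno.h>
#include <sys/types.h>

/* Hypothetical summary of the identity fields the check needs. */
struct proc_ident {
    pid_t pid;
    pid_t ppid;     /* parent process ID */
    uid_t euid;     /* effective user ID */
};

/*
 * Return 0 if caller may splice into target, EPERM otherwise.
 * The euid rule follows the description above.  The parent test is NOT in
 * the patch; it is enabled only when require_parent is nonzero.
 */
static int dup3_may_modify(const struct proc_ident *caller,
                           const struct proc_ident *target,
                           int require_parent)
{
    if (caller->euid == 0)
        return 0;                   /* super-user bypasses both tests */
    if (caller->euid != target->euid)
        return EPERM;
    if (require_parent && target->ppid != caller->pid)
        return EPERM;
    return 0;
}
```

With require_parent set, a shell could still splice its own jobs, but an unrelated process running under the same euid would be refused.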


So use this stuff at your own risk!  I'm mostly just releasing it for
comments at this point, to find out if I'm really just smoking crack
with this whole idea.
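
The stdio confusion mentioned in point 1 can be reproduced without any new syscall, by swapping a descriptor underneath a buffered stream in the same process with plain dup2(). The sketch below assumes a POSIX system with a writable /tmp; buffered_output_follows_swap() is a hypothetical name, and the in-process dup2() only stands in for what a foreign dup3() would do to the victim.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Returns 1 if text buffered *before* the descriptor swap ended up in the
 * file installed *after* it, 0 if it did not, -1 on setup failure.
 */
static int buffered_output_follows_swap(void)
{
    char first[] = "/tmp/dup3_first_XXXXXX";
    char second[] = "/tmp/dup3_second_XXXXXX";
    char buf[64] = "";
    int result = -1;
    int fd1 = mkstemp(first);
    int fd2 = mkstemp(second);
    FILE *out = (fd1 >= 0) ? fdopen(fd1, "w") : NULL;

    if (out != NULL && fd2 >= 0) {
        /* A regular file is fully buffered, so this stays in memory. */
        fputs("written before the swap\n", out);

        /* Swap the file under the stream, then let the buffer drain. */
        if (dup2(fd2, fd1) >= 0 && fflush(out) == 0) {
            /* fd1 and fd2 now share one offset; rewind and read back. */
            lseek(fd2, 0, SEEK_SET);
            ssize_t n = read(fd2, buf, sizeof buf - 1);
            result = (n > 0 && strncmp(buf, "written before", 14) == 0);
        }
    }

    /* Clean up whatever was created. */
    if (out != NULL)
        fclose(out);
    else if (fd1 >= 0)
        close(fd1);
    if (fd2 >= 0)
        close(fd2);
    if (fd1 >= 0)
        unlink(first);
    if (fd2 >= 0)
        unlink(second);
    return result;
}
```

Output the program believed it had already written to the first file lands in the second one, which is the kind of surprise an "invalidate current buffered fd contents" hook in stdio would have to deal with.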

Patches relative to 2.2-current, though they should work just
as well in 3.0.

I also included patches to all the "derived" files from
syscalls.master - while not strictly necessary, it saves everyone from
having to do anything more than apply this patch from the top of
/usr/src and build a new libc, new kernel and new /bin/sh.

Feedback most welcome.

					Jordan

Index: bin/sh/Makefile
===================================================================
RCS file: /home/ncvs/src/bin/sh/Makefile,v
retrieving revision 1.15
diff -u -r1.15 Makefile
--- Makefile	1996/10/25 14:49:24	1.15
+++ Makefile	1997/03/18 06:00:58
@@ -15,7 +15,7 @@
 LDADD+= -ll -ledit -ltermcap
 
 LFLAGS= -8	# 8-bit lex scanner for arithmetic
-CFLAGS+=-DSHELL -I. -I${.CURDIR}
+CFLAGS+=-DSHELL -DHAVE_DUP3 -I. -I${.CURDIR}
 # for debug:
 # CFLAGS+= -g -DDEBUG=2
 
Index: bin/sh/jobs.c
===================================================================
RCS file: /home/ncvs/src/bin/sh/jobs.c,v
retrieving revision 1.8.2.1
diff -u -r1.8.2.1 jobs.c
--- jobs.c	1997/01/12 21:58:49	1.8.2.1
+++ jobs.c	1997/03/18 06:00:45
@@ -213,11 +213,20 @@
 	struct job *jp;
 {
 	struct procstat *ps;
-	int i;
+	int i, fd;
 
 	if (jp->state == JOBDONE)
 		return;
 	INTOFF;
+#ifdef HAVE_DUP3
+	for (i = 0; i <= 2; i++) {
+	    if (fd_redirected_p(i)) {
+		fd = dup3(i, jp->ps[0].pid, i);
+		if (fd != -1)
+		    close(fd);
+	    }
+	}
+#endif
 	killpg(jp->ps[0].pid, SIGCONT);
 	for (ps = jp->ps, i = jp->nprocs ; --i >= 0 ; ps++) {
 		if ((ps->status & 0377) == 0177) {
@@ -591,7 +600,7 @@
 			ignoresig(SIGINT);
 			ignoresig(SIGQUIT);
 			if ((jp == NULL || jp->nprocs == 0) &&
-			    ! fd0_redirected_p ()) {
+			    ! fd_redirected_p (0)) {
 				close(0);
 				if (open("/dev/null", O_RDONLY) != 0)
 					error("Can't open /dev/null");
@@ -602,7 +611,7 @@
 			ignoresig(SIGINT);
 			ignoresig(SIGQUIT);
 			if ((jp == NULL || jp->nprocs == 0) &&
-			    ! fd0_redirected_p ()) {
+			    ! fd_redirected_p (0)) {
 				close(0);
 				if (open("/dev/null", O_RDONLY) != 0)
 					error("Can't open /dev/null");
Index: bin/sh/redir.c
===================================================================
RCS file: /home/ncvs/src/bin/sh/redir.c,v
retrieving revision 1.5
diff -u -r1.5 redir.c
--- redir.c	1996/09/01 10:21:36	1.5
+++ redir.c	1997/03/18 05:50:46
@@ -76,11 +76,11 @@
 MKINIT struct redirtab *redirlist;
 
 /*
- * We keep track of whether or not fd0 has been redirected.  This is for
+ * We keep track of whether or not fds 0-2 have been redirected.  This is for
  * background commands, where we want to redirect fd0 to /dev/null only
- * if it hasn't already been redirected.
+ * if it hasn't already been redirected, and for fg/bg redirection to files.
 */
-int fd0_redirected = 0;
+int fd_redirected[3];
 
 STATIC void openredirect __P((union node *, char[10 ]));
 STATIC int openhere __P((union node *));
@@ -132,8 +132,8 @@
 		} else {
 			close(fd);
 		}
-                if (fd == 0)
-                        fd0_redirected++;
+                if (fd >= 0 && fd <= 2)
+                        fd_redirected[fd]++;
 		openredirect(n, memory);
 	}
 	if (memory[1])
@@ -267,8 +267,8 @@
 
 	for (i = 0 ; i < 10 ; i++) {
 		if (rp->renamed[i] != EMPTY) {
-                        if (i == 0)
-                                fd0_redirected--;
+                        if (i >= 0 && i <= 2)
+                                fd_redirected[i]--;
 			close(i);
 			if (rp->renamed[i] >= 0) {
 				copyfd(rp->renamed[i], i);
@@ -303,8 +303,11 @@
 
 /* Return true if fd 0 has already been redirected at least once.  */
 int
-fd0_redirected_p () {
-        return fd0_redirected != 0;
+fd_redirected_p (int fd) {
+	if (fd >= 0 && fd <= 2)
+	    return fd_redirected[fd] != 0;
+	else
+	    return 0;
 }
 
 /*
Index: bin/sh/redir.h
===================================================================
RCS file: /home/ncvs/src/bin/sh/redir.h,v
retrieving revision 1.3
diff -u -r1.3 redir.h
--- redir.h	1996/09/01 10:21:37	1.3
+++ redir.h	1997/03/18 05:51:43
@@ -44,7 +44,7 @@
 union node;
 void redirect __P((union node *, int));
 void popredir __P((void));
-int fd0_redirected_p __P((void));
+int fd_redirected_p __P((int));
 void clearredir __P((void)); 
 int copyfd __P((int, int));
 
Index: lib/libc/sys/Makefile.inc
===================================================================
RCS file: /home/ncvs/src/lib/libc/sys/Makefile.inc,v
retrieving revision 1.20
diff -u -r1.20 Makefile.inc
--- Makefile.inc	1996/09/20 13:55:25	1.20
+++ Makefile.inc	1997/03/17 18:13:35
@@ -14,7 +14,7 @@
 
 # modules with default implementations on all architectures:
 ASM=	accept.o access.o acct.o adjtime.o bind.o chdir.o chflags.o chmod.o \
-	chown.o chroot.o close.o connect.o dup.o dup2.o execve.o fchdir.o \
+	chown.o chroot.o close.o connect.o dup.o dup2.o dup3.o execve.o fchdir.o \
 	fchflags.o fchmod.o fchown.o fcntl.o flock.o fpathconf.o fstat.o \
 	fstatfs.o fsync.o getdirentries.o getdtablesize.o getegid.o \
 	geteuid.o getfh.o getfsstat.o getgid.o getgroups.o getitimer.o \
@@ -109,6 +109,7 @@
 
 MLINKS+=brk.2 sbrk.2
 MLINKS+=dup.2 dup2.2
+MLINKS+=dup.2 dup3.2
 MLINKS+=chdir.2 fchdir.2
 MLINKS+=chflags.2 fchflags.2
 MLINKS+=chmod.2 fchmod.2
Index: lib/libc/sys/dup.2
===================================================================
RCS file: /home/ncvs/src/lib/libc/sys/dup.2,v
retrieving revision 1.3.2.3
diff -u -r1.3.2.3 dup.2
--- dup.2	1997/03/09 22:16:51	1.3.2.3
+++ dup.2	1997/03/18 06:13:05
@@ -37,7 +37,8 @@
 .Os BSD 4
 .Sh NAME
 .Nm dup ,
-.Nm dup2
+.Nm dup2 ,
+.Nm dup3
 .Nd duplicate an existing file descriptor
 .Sh SYNOPSIS
 .Fd #include <unistd.h>
@@ -45,6 +46,8 @@
 .Fn dup "int oldd"
 .Ft int
 .Fn dup2 "int oldd" "int newd"
+.Ft int
+.Fn dup3 "int oldd" "pid_t tpid" "int newd"
 .Sh DESCRIPTION
 .Fn Dup
 duplicates an existing object descriptor and returns its value to
@@ -113,6 +116,18 @@
 is a valid descriptor, then
 .Fn dup2
 is successful, and does nothing.
+.Pp
+In 
+.Fn dup3 ,
+a target process ID and the value of the new descriptor
+.Fa newd
+are specified in the context of that process.  If this descriptor
+is currently assigned to a valid file, then it will be returned
+as a new file descriptor in the current process context, otherwise
+-1 is returned.  If the returned file descriptor is not needed then
+it should be closed.  The primary purpose of
+.Fn dup3
+is to allow "splicing" of I/O in already-running processes.
 .Sh IMPLEMENTATION NOTES
 .Pp
 In the non-threaded library
@@ -166,9 +181,10 @@
 .Va errno
 indicates the cause of the error.
 .Sh ERRORS
-.Fn Dup
-and
+.Fn Dup ,
 .Fn dup2
+and
+.Fn dup3
 fail if:
 .Bl -tag -width Er
 .It Bq Er EBADF
@@ -178,6 +194,18 @@
 is not a valid active descriptor
 .It Bq Er EMFILE
 Too many descriptors are active.
+.El
+.Fn dup3
+will additionally fail if:
+.Bl -tag -width Er
+.It Bq Er ESRCH
+The
+.Fa tpid
+is not found.
+.It Bq Er EPERM
+The effective uid of the current process does not match that of
+the target process.  Only the super user can modify the file descriptor
+table of processes with a different euid.
 .El
 .Sh SEE ALSO
 .Xr accept 2 ,
@@ -202,3 +230,6 @@
 .Fn dup2
 function call appeared in
 .At v7 .
+The
+.Fn dup3
+function call appeared in FreeBSD 3.0 .
Index: sys/kern/init_sysent.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/init_sysent.c,v
retrieving revision 1.36
diff -u -r1.36 init_sysent.c
--- init_sysent.c	1996/09/19 19:48:31	1.36
+++ init_sysent.c	1997/03/17 18:15:11
@@ -2,7 +2,7 @@
  * System call switch table.
  *
  * DO NOT EDIT-- this file is automatically generated.
- * created from	Id: syscalls.master,v 1.28 1996/08/20 07:17:49 smpatel Exp 
+ * created from	Id: syscalls.master,v 1.29 1996/09/19 19:48:38 phk Exp 
  */
 
 #include <sys/types.h>
@@ -266,7 +266,7 @@
 	{ 3, (sy_call_t *)shmctl },			/* 229 = shmctl */
 	{ 1, (sy_call_t *)shmdt },			/* 230 = shmdt */
 	{ 3, (sy_call_t *)shmget },			/* 231 = shmget */
-	{ 0, (sy_call_t *)nosys },			/* 232 = nosys */
+	{ 3, (sy_call_t *)dup3 },			/* 232 = dup3 */
 	{ 0, (sy_call_t *)nosys },			/* 233 = nosys */
 	{ 0, (sy_call_t *)nosys },			/* 234 = nosys */
 	{ 0, (sy_call_t *)nosys },			/* 235 = nosys */
Index: sys/kern/kern_descrip.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_descrip.c,v
retrieving revision 1.32.2.2
diff -u -r1.32.2.2 kern_descrip.c
--- kern_descrip.c	1996/12/21 19:04:24	1.32.2.2
+++ kern_descrip.c	1997/03/18 05:17:28
@@ -149,8 +149,69 @@
 }
 
 /*
- * Duplicate a file descriptor.
+ * Duplicate a file descriptor to a particular value in another process.
  */
+#ifndef _SYS_SYSPROTO_H_
+struct dup3_args {
+    u_int	from;
+    pid_t	target;
+    u_int	to;
+};
+#endif
+/* ARGSUSED */
+int
+dup3(p, uap, retval)
+    struct proc *p;
+    struct dup3_args *uap;
+    int *retval;
+{
+    struct filedesc *tdp, *fdp;
+    struct proc *t;
+    struct file *fp, *nfp;
+    int i, error;
+    u_int from = uap->from, to = uap->to;
+
+    /* Look up target process and make sure it exists, then set up the table pointers */
+    t = pfind(uap->target);
+    if (!t)
+	return (ESRCH);
+    tdp = t->p_fd;
+    fdp = p->p_fd;
+
+    /* Don't let non-root procs stomp other procs unless euid is the same */
+    /* XXX should also put in a check for parentage here in the non-root case XXX */
+    if (p->p_ucred->cr_uid && p->p_ucred->cr_uid != t->p_ucred->cr_uid)
+	return (EPERM);
+    
+    if (from >= fdp->fd_nfiles || fdp->fd_ofiles[from] == NULL)
+	return (EBADF);
+    if (to >= tdp->fd_nfiles) {
+	if ((error = fdalloc(t, to, &i)))
+	    return (error);
+	if (to != i)
+	    panic("dup3: fdalloc");
+	*retval = -1;
+    }
+    else if (tdp->fd_ofiles[to]) {
+	if ((error = fdalloc(p, 0, &i)))
+	    return (error);
+	fdp->fd_ofiles[i] = tdp->fd_ofiles[to];
+	fdp->fd_ofileflags[i] = tdp->fd_ofileflags[to] &~ UF_EXCLOSE;
+	tdp->fd_ofiles[to]->f_count++;
+	if (i > fdp->fd_lastfile)
+	    fdp->fd_lastfile = i;
+	*retval = i;
+    }
+    tdp->fd_ofiles[to] = fdp->fd_ofiles[from];
+    tdp->fd_ofileflags[to] = fdp->fd_ofileflags[from] &~ UF_EXCLOSE;
+    fdp->fd_ofiles[from]->f_count++;
+    if (to > tdp->fd_lastfile)
+	tdp->fd_lastfile = to;
+    return (0);
+}
+
+/*
+ * Duplicate a file descriptor.  */
 #ifndef _SYS_SYSPROTO_H_
 struct dup_args {
 	u_int	fd;
Index: sys/kern/syscalls.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/syscalls.c,v
retrieving revision 1.31
diff -u -r1.31 syscalls.c
--- syscalls.c	1996/09/19 19:48:34	1.31
+++ syscalls.c	1997/03/17 18:15:11
@@ -2,7 +2,7 @@
  * System call names.
  *
  * DO NOT EDIT-- this file is automatically generated.
- * created from	Id: syscalls.master,v 1.28 1996/08/20 07:17:49 smpatel Exp 
+ * created from	Id: syscalls.master,v 1.29 1996/09/19 19:48:38 phk Exp 
  */
 
 char *syscallnames[] = {
@@ -253,7 +253,7 @@
 	"shmctl",			/* 229 = shmctl */
 	"shmdt",			/* 230 = shmdt */
 	"shmget",			/* 231 = shmget */
-	"#232",			/* 232 = nosys */
+	"dup3",			/* 232 = dup3 */
 	"#233",			/* 233 = nosys */
 	"#234",			/* 234 = nosys */
 	"#235",			/* 235 = nosys */
Index: sys/kern/syscalls.master
===================================================================
RCS file: /home/ncvs/src/sys/kern/syscalls.master,v
retrieving revision 1.29
diff -u -r1.29 syscalls.master
--- syscalls.master	1996/09/19 19:48:38	1.29
+++ syscalls.master	1997/03/17 18:06:01
@@ -364,7 +364,7 @@
 230	STD	BSD	{ int shmdt(void *shmaddr); }
 231	STD	BSD	{ int shmget(key_t key, int size, int shmflg); }
 ;
-232	UNIMPL	NOHIDE	nosys
+232	STD	BSD	{ int dup3(u_int from, pid_t target, u_int to); }
 233	UNIMPL	NOHIDE	nosys
 234	UNIMPL	NOHIDE	nosys
 235	UNIMPL	NOHIDE	nosys
Index: sys/sys/syscall-hide.h
===================================================================
RCS file: /home/ncvs/src/sys/sys/syscall-hide.h,v
retrieving revision 1.25
diff -u -r1.25 syscall-hide.h
--- syscall-hide.h	1996/09/19 19:49:10	1.25
+++ syscall-hide.h	1997/03/17 18:15:11
@@ -2,7 +2,7 @@
  * System call hiders.
  *
  * DO NOT EDIT-- this file is automatically generated.
- * created from	Id: syscalls.master,v 1.28 1996/08/20 07:17:49 smpatel Exp 
+ * created from	Id: syscalls.master,v 1.29 1996/09/19 19:48:38 phk Exp 
  */
 
 HIDE_POSIX(fork)
@@ -209,5 +209,6 @@
 HIDE_BSD(shmctl)
 HIDE_BSD(shmdt)
 HIDE_BSD(shmget)
+HIDE_BSD(dup3)
 HIDE_BSD(minherit)
 HIDE_BSD(rfork)
Index: sys/sys/syscall.h
===================================================================
RCS file: /home/ncvs/src/sys/sys/syscall.h,v
retrieving revision 1.29
diff -u -r1.29 syscall.h
--- syscall.h	1996/09/19 19:49:12	1.29
+++ syscall.h	1997/03/17 18:15:11
@@ -2,7 +2,7 @@
  * System call numbers.
  *
  * DO NOT EDIT-- this file is automatically generated.
- * created from	Id: syscalls.master,v 1.28 1996/08/20 07:17:49 smpatel Exp 
+ * created from	Id: syscalls.master,v 1.29 1996/09/19 19:48:38 phk Exp 
  */
 
 #define	SYS_syscall	0
@@ -203,6 +203,7 @@
 #define	SYS_shmctl	229
 #define	SYS_shmdt	230
 #define	SYS_shmget	231
+#define	SYS_dup3	232
 #define	SYS_minherit	250
 #define	SYS_rfork	251
 #define	SYS_MAXSYSCALL	252
Index: sys/sys/sysproto.h
===================================================================
RCS file: /home/ncvs/src/sys/sys/sysproto.h,v
retrieving revision 1.15
diff -u -r1.15 sysproto.h
--- sysproto.h	1996/09/19 19:49:13	1.15
+++ sysproto.h	1997/03/17 18:15:11
@@ -2,7 +2,7 @@
  * System call prototypes.
  *
  * DO NOT EDIT-- this file is automatically generated.
- * created from	Id: syscalls.master,v 1.28 1996/08/20 07:17:49 smpatel Exp 
+ * created from	Id: syscalls.master,v 1.29 1996/09/19 19:48:38 phk Exp 
  */
 
 #ifndef _SYS_SYSPROTO_H_
@@ -716,6 +716,11 @@
 	int size;
 	int shmflg;
 };
+struct	dup3_args {
+	u_int from;
+	pid_t target;
+	u_int to;
+};
 struct	minherit_args {
 	caddr_t addr;
 	size_t len;
@@ -891,6 +896,7 @@
 int	shmctl __P((struct proc *, struct shmctl_args *, int []));
 int	shmdt __P((struct proc *, struct shmdt_args *, int []));
 int	shmget __P((struct proc *, struct shmget_args *, int []));
+int	dup3 __P((struct proc *, struct dup3_args *, int []));
 int	minherit __P((struct proc *, struct minherit_args *, int []));
 int	rfork __P((struct proc *, struct rfork_args *, int []));
 



