From owner-freebsd-arch Mon Jan 13 16:28:36 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 7CC6E37B401 for ; Mon, 13 Jan 2003 16:28:35 -0800 (PST) Received: from canning.wemm.org (canning.wemm.org [192.203.228.65]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1228743F18 for ; Mon, 13 Jan 2003 16:28:35 -0800 (PST) (envelope-from peter@wemm.org) Received: from wemm.org (localhost [127.0.0.1]) by canning.wemm.org (Postfix) with ESMTP id 1C8C12A89E; Mon, 13 Jan 2003 16:28:31 -0800 (PST) (envelope-from peter@wemm.org) X-Mailer: exmh version 2.5 07/13/2001 with nmh-1.0.4 To: "Alan L. Cox" Cc: Matthew Dillon , arch@FreeBSD.ORG Subject: Re: Virtual memory question In-Reply-To: <3E20B747.1FCA3B36@imimic.com> Date: Mon, 13 Jan 2003 16:28:31 -0800 From: Peter Wemm Message-Id: <20030114002831.1C8C12A89E@canning.wemm.org> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG "Alan L. Cox" wrote: > Matthew Dillon wrote: > > ... > > How about something like: > > > > getmemfd(). > > > > Roughly speaking, this is shm_open(3), which we currently implement > using files. .. which is expressly what I wanted to avoid. 
Cheers, -Peter -- Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com "All of this is for nothing if we don't go to the stars" - JMS/B5 To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Jan 13 17:54:26 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1C9BA37B401 for ; Mon, 13 Jan 2003 17:54:25 -0800 (PST) Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id BFF3243F13 for ; Mon, 13 Jan 2003 17:54:24 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: from apollo.backplane.com (localhost [127.0.0.1]) by apollo.backplane.com (8.12.6/8.12.6) with ESMTP id h0E1sO0i015614; Mon, 13 Jan 2003 17:54:24 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: (from dillon@localhost) by apollo.backplane.com (8.12.6/8.12.6/Submit) id h0E1sOe6015613; Mon, 13 Jan 2003 17:54:24 -0800 (PST) Date: Mon, 13 Jan 2003 17:54:24 -0800 (PST) From: Matthew Dillon Message-Id: <200301140154.h0E1sOe6015613@apollo.backplane.com> To: Peter Wemm Cc: "Alan L. Cox" , arch@FreeBSD.ORG Subject: Re: Virtual memory question References: <20030114002831.1C8C12A89E@canning.wemm.org> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG :> > :> > getmemfd(). :> > :> :> Roughly speaking, this is shm_open(3), which we currently implement :> using files. : :.. which is expressly what I wanted to avoid. : :Cheers, :-Peter I can work up basic operation (i.e. getmemfd() and mmap()) in an hour or two. I'll have a patch set tonight. It looks utterly trivial. 
I think I'll generalize the system call, though, with two parameters:

    getsysfd(int type, off_t size)

    fd = getsysfd(SYSFD_MEMORY, 1024*1024);  /* get 1MB memory object */
    fd = getsysfd(SYSFD_MEMORY, -1);         /* get infinite-sized memory object */
    mmap(...)

This way we can use it to implement and obtain other special-purpose descriptors, for example the controlling terminal, a timer, a kqueue, and so on, since we seem to be implementing special-purpose FDs more and more these days.

-Matt
Matthew Dillon

From owner-freebsd-arch Mon Jan 13 19:20:36 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E2AFE37B401 for ; Mon, 13 Jan 2003 19:20:34 -0800 (PST) Received: from cleitus.hosting.swbell.net (cleitus.hosting.swbell.net [216.100.99.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 30AC343EB2 for ; Mon, 13 Jan 2003 19:20:34 -0800 (PST) (envelope-from alc@imimic.com) Received: from imimic.com (adsl-216-63-78-18.dsl.hstntx.swbell.net [216.63.78.18]) by cleitus.hosting.swbell.net id WAA26523; Mon, 13 Jan 2003 22:20:25 -0500 (EST) [ConcentricHost SMTP Relay 1.14] Message-ID: <3E2381F8.85BB90A0@imimic.com> Date: Mon, 13 Jan 2003 21:20:24 -0600 From: "Alan L. Cox" Organization: iMimic Networking, Inc. X-Mailer: Mozilla 4.8 [en] (X11; U; Linux 2.4.2 i386) X-Accept-Language: en MIME-Version: 1.0 To: Peter Wemm Cc: Matthew Dillon , arch@FreeBSD.ORG Subject: Re: Virtual memory question References: <20030114002831.1C8C12A89E@canning.wemm.org> Content-Type: text/plain; charset=x-user-defined Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG

Peter Wemm wrote:
>
> "Alan L.
Cox" wrote:
> > Matthew Dillon wrote:
> > > ...
> > > How about something like:
> > >
> > > getmemfd().
> > >
> >
> > Roughly speaking, this is shm_open(3), which we currently implement
> > using files.
>
> .. which is expressly what I wanted to avoid.
>

Your response is ambiguous. :-) It doesn't say whether you want to avoid shm_open(3) the interface or rather FreeBSD's implementation of it.

Personally, I'm all for changing the implementation in the way Matt describes, but I haven't yet heard a rationale for a new interface. Specifically, the interface proposed thus far could be emulated by

    fd = shm_open("unique name", ...);
    shm_unlink("unique name");

The spec also seems to make the creation of unique names easy: "If name does not begin with the slash character, the effect is implementation-dependent." So, a per-process name space is allowed for names not beginning with slash.

Furthermore, the only operations that I know of on a "path" are shm_open() and shm_unlink(), and my reading of those was that a hash table keyed on the "path" was a legal implementation.

In summary, a new implementation would be good, but I haven't seen the rationale for a new interface, especially given that shm_open(3) is an existing standard.
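
[Editorial sketch] The shm_open()/shm_unlink() emulation suggested above might look roughly like the following. This is only an illustration under stated assumptions: `anon_shm_create` is a hypothetical helper name that does not appear in the thread, the name uses a leading slash because the effect of other names is implementation-dependent, and sizing the object with ftruncate() is assumed to work on the target system, which the thread has not settled.

```c
#include <fcntl.h>      /* O_RDWR, O_CREAT, O_EXCL */
#include <stdio.h>      /* snprintf */
#include <sys/mman.h>   /* shm_open, shm_unlink, mmap */
#include <sys/stat.h>   /* fstat, struct stat */
#include <sys/types.h>  /* off_t */
#include <unistd.h>     /* ftruncate, getpid, close */

/*
 * Hypothetical helper (not from the thread or any libc): return a
 * descriptor for an unnamed shared-memory object of `size` bytes,
 * or -1 on failure with errno set by the failing call.
 */
int
anon_shm_create(off_t size)
{
	static unsigned int counter;	/* distinguishes objects within one process */
	char name[64];
	int fd;

	/* pid + counter gives a name that is unique enough for this sketch. */
	snprintf(name, sizeof(name), "/anon-shm-%ld-%u",
	    (long)getpid(), counter++);

	/* O_EXCL makes the call fail rather than reuse an existing object. */
	fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0600);
	if (fd == -1)
		return (-1);

	/* Drop the name at once; the object persists while fd stays open. */
	shm_unlink(name);

	/* Assumption: the object is sized with ftruncate(). */
	if (ftruncate(fd, size) == -1) {
		close(fd);
		return (-1);
	}
	return (fd);
}
```

A caller would then mmap() the returned descriptor with MAP_SHARED and close it once the mapping exists. The window between shm_open() and shm_unlink() is the main weakness of this emulation compared with a call that never creates a name.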
Regards,
Alan

From owner-freebsd-arch Mon Jan 13 19:39:33 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2979B37B407 for ; Mon, 13 Jan 2003 19:39:32 -0800 (PST) Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id C381143ED8 for ; Mon, 13 Jan 2003 19:39:31 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: from apollo.backplane.com (localhost [127.0.0.1]) by apollo.backplane.com (8.12.6/8.12.6) with ESMTP id h0E3dV0i073161; Mon, 13 Jan 2003 19:39:31 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: (from dillon@localhost) by apollo.backplane.com (8.12.6/8.12.6/Submit) id h0E3dVQa073160; Mon, 13 Jan 2003 19:39:31 -0800 (PST) Date: Mon, 13 Jan 2003 19:39:31 -0800 (PST) From: Matthew Dillon Message-Id: <200301140339.h0E3dVQa073160@apollo.backplane.com> To: "Alan L. Cox" Cc: Peter Wemm , arch@FreeBSD.ORG Subject: Re: Virtual memory question References: <20030114002831.1C8C12A89E@canning.wemm.org> <3E2381F8.85BB90A0@imimic.com> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG

Well, if we need kernel support anyway to get out from under the file-backed issue (this is what Peter is trying to avoid), there would be a system call anyway and Peter could just use it. So he gets his direct call for free even though we would also keep the shm_open() interface and implement it, or partially implement it, with the new system call.

I see two major deficiencies with shm_open().

(1) There is no way to say "give me a new memory area". i.e. passing a path of NULL is not legal.
On the other hand, after reading the manual page it is clear that you *CAN* give shm_open() a non-/ name and libc could manage the namespace/descriptor association internally. Still, if you want an unassociated object we should allow NULL.

(2) I don't see how/where one specifies the size of the memory object in shm_open(). Does this mean we have to implement ftruncate()?

-Matt
Matthew Dillon

:Personally, I'm all for changing the implementation in the way Matt
:describes, but I haven't yet heard a rationale for a new interface.
:Specifically, the interface proposed thus far could be emulated by
:
:    fd = shm_open("unique name", ...);
:    shm_unlink("unique name");
:
:The spec also seems to make the creation of unique names easy: "If name
:does not begin with the slash character, the effect is
:implementation-dependent." So, a per-process name space is allowed for
:names not beginning with slash.
:
:Furthermore, the only operations that I know of on a "path" are
:shm_open() and shm_unlink(), and my reading of those was that a hash
:table keyed on the "path" was a legal implementation.
:
:In summary, a new implementation would be good, but I haven't seen the
:rationale for a new interface, especially given that shm_open(3) is an
:existing standard.
:
:Regards,
:Alan

From owner-freebsd-arch Mon Jan 13 20:11:34 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 01CAD37B405 for ; Mon, 13 Jan 2003 20:11:33 -0800 (PST) Received: from eumenes.hosting.swbell.net (eumenes.hosting.swbell.net [216.100.98.7]) by mx1.FreeBSD.org (Postfix) with ESMTP id 25EB643F3F for ; Mon, 13 Jan 2003 20:11:32 -0800 (PST) (envelope-from alc@imimic.com) Received: from imimic.com (adsl-216-63-78-18.dsl.hstntx.swbell.net [216.63.78.18]) by eumenes.hosting.swbell.net id XAA19818; Mon, 13 Jan 2003 23:11:28 -0500 (EST) [ConcentricHost SMTP Relay 1.14] Message-ID: <3E238DEF.14DFA7E1@imimic.com> Date: Mon, 13 Jan 2003 22:11:27 -0600 From: "Alan L. Cox" Organization: iMimic Networking, Inc. X-Mailer: Mozilla 4.8 [en] (X11; U; Linux 2.4.2 i386) X-Accept-Language: en MIME-Version: 1.0 To: Matthew Dillon Cc: Peter Wemm , arch@FreeBSD.ORG Subject: Re: Virtual memory question References: <20030114002831.1C8C12A89E@canning.wemm.org> <3E2381F8.85BB90A0@imimic.com> <200301140339.h0E3dVQa073160@apollo.backplane.com> Content-Type: text/plain; charset=x-user-defined Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG

Matthew Dillon wrote:
> ...
> I see two major deficiencies with shm_open().
>
> (1) There is no way to say "give me a new memory area". i.e. passing
> a path of NULL is not legal. On the other hand, after reading the
> manual page it is clear that you *CAN* give shm_open() a
> non-/ name and libc could manage the namespace/descriptor
> association internally. Still, if you want an unassociated
> object we should allow NULL.

It's probably better to look at the actual specification rather than our manual page: http://www.opengroup.org/onlinepubs/7908799/xsh/shm_open.html. I think it better describes the things that are left unspecified (and allow a great deal of flexibility in the implementation). As far as the name space goes, I still tend to think that an in-kernel hash table is the way to go. If the "path" begins with a '/', it's the entire key. Otherwise, you add something identifying the process to the key. I'm happy with the NULL path as an extension to this interface. > (2) I don't see how/where one specifies the size of the memory object > in shm_open(). Does this mean we have to implement ftruncate()? I think the size is implied by the mmap()ing. A second, larger mmap()ing would have to grow the object. An object should never shrink. Regards, Alan To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Jan 13 20:12: 1 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6199D37B401 for ; Mon, 13 Jan 2003 20:11:43 -0800 (PST) Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8651D43F3F for ; Mon, 13 Jan 2003 20:11:42 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: from apollo.backplane.com (localhost [127.0.0.1]) by apollo.backplane.com (8.12.6/8.12.6) with ESMTP id h0E4Bg0i078033; Mon, 13 Jan 2003 20:11:42 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: (from dillon@localhost) by apollo.backplane.com (8.12.6/8.12.6/Submit) id h0E4BgpN078032; Mon, 13 Jan 2003 20:11:42 -0800 (PST) Date: Mon, 13 Jan 2003 20:11:42 -0800 (PST) From: Matthew Dillon Message-Id: <200301140411.h0E4BgpN078032@apollo.backplane.com> To: "Alan L. 
Cox" Cc: Peter Wemm , arch@FreeBSD.ORG Subject: getsysfd() patch #1 (Re: Virtual memory question) References: <20030114002831.1C8C12A89E@canning.wemm.org> <3E2381F8.85BB90A0@imimic.com> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG This is a first-attempt workup of getsysfd(). See? I told ya it was trivial! This isn't everything. If we really want to do this right we need to create a filesystem inode type to represent a memory rendezvous, similar to how we represent a FIFO or SOCKET rendezvous. If we do that then we can support all shm_open() situations using this new call. I have only done a small amount of testing, I have not double checked that I handle the reference counts properly and I had to reorganize mmap() quite a bit (in fact, it looks like someone did a bunch of rewriting in the mmap()/vm_mmap() code and we really need to rewrite the layering). Here is a test program. The patch is below this program. This should be considered a 'test' patch for the moment, my heart isn't set on the interface. e.g. perhaps we want to add additional arguments to make it more useful/generic. 
-Matt #include #include #include #include #include int main(int ac, char **av) { int fd = getsysfd(SYSFD_MEMORY, 1024*1024); char *ptr1; char *ptr2; printf("fd = %d %d %s\n", fd, errno, strerror(errno)); errno = 0; ptr1 = mmap(NULL, 1024*1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0); printf("mmap: %p (%s)\n", ptr1, strerror(errno)); errno = 0; ptr2 = mmap(NULL, 1024*1024, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0); printf("mmap: %p (%s)\n", ptr2, strerror(errno)); close(fd); ptr1[0] = 1; ptr1[1024*1024-1] = 2; if (fork() == 0) { printf("CONTENTS %d %d\n", ptr2[0], ptr2[1024*1024-1]); ptr2[0] = 2; /* modify private mapping */ ptr1[1024*1024-1] = 3; /* modify original */ } sleep(1); /* SHOULD BE 1 3 */ printf("ORIGCONTENTS %d %d\n", ptr1[0], ptr1[1024*1024-1]); return(0); } Index: conf/files =================================================================== RCS file: /home/ncvs/src/sys/conf/files,v retrieving revision 1.744 diff -u -r1.744 files --- conf/files 8 Jan 2003 23:36:59 -0000 1.744 +++ conf/files 14 Jan 2003 02:30:47 -0000 @@ -1055,6 +1055,7 @@ kern/subr_xxx.c standard kern/sys_generic.c standard kern/sys_pipe.c standard +kern/sys_sysfd.c standard kern/sys_process.c standard kern/sys_socket.c standard kern/syscalls.c optional witness Index: kern/init_sysent.c =================================================================== RCS file: /home/ncvs/src/sys/kern/init_sysent.c,v retrieving revision 1.146 diff -u -r1.146 init_sysent.c --- kern/init_sysent.c 8 Jan 2003 04:57:52 -0000 1.146 +++ kern/init_sysent.c 14 Jan 2003 01:58:05 -0000 @@ -2,7 +2,7 @@ * System call switch table. * * DO NOT EDIT-- this file is automatically generated. 
- * $FreeBSD: src/sys/kern/init_sysent.c,v 1.146 2003/01/08 04:57:52 davidxu Exp $ + * $FreeBSD$ * created from FreeBSD: src/sys/kern/syscalls.master,v 1.140 2003/01/04 11:41:12 davidxu Exp */ @@ -457,4 +457,5 @@ { SYF_MPSAFE | AS(__acl_set_link_args), (sy_call_t *)__acl_set_link }, /* 426 = __acl_set_link */ { SYF_MPSAFE | AS(__acl_delete_link_args), (sy_call_t *)__acl_delete_link }, /* 427 = __acl_delete_link */ { SYF_MPSAFE | AS(__acl_aclcheck_link_args), (sy_call_t *)__acl_aclcheck_link }, /* 428 = __acl_aclcheck_link */ + { SYF_MPSAFE | AS(getsysfd_args), (sy_call_t *)getsysfd }, /* 429 = getsysfd */ }; Index: kern/sys_sysfd.c =================================================================== RCS file: kern/sys_sysfd.c diff -N kern/sys_sysfd.c --- /dev/null 1 Jan 1970 00:00:00 -0000 +++ kern/sys_sysfd.c 14 Jan 2003 03:47:53 -0000 @@ -0,0 +1,208 @@ +/* + * KERN/SYS_SYSFD.C + * + * $FreeBSD$ + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* + * interfaces to the outside world + */ +static fo_rdwr_t memfd_read; +static fo_rdwr_t memfd_write; +static fo_ioctl_t memfd_ioctl; +static fo_poll_t memfd_poll; +static fo_stat_t memfd_stat; +static fo_close_t memfd_close; + +static struct fileops memfdops = { + memfd_read, memfd_write, memfd_ioctl, memfd_poll, NULL, + memfd_stat, memfd_close +}; + +/* + * The getsysfd() system call. getsysfd(int type, off_t size) + * + * SYSFD_MEMORY - Return a descriptor which can be mmap()'d, + * representing anonymous, shareable swap-backed + * memory. 
+ * + */ + +int +getsysfd(struct thread *td, struct getsysfd_args *uap) +{ + int error; + int fd; + vm_pindex_t npages; + struct file *fp; + struct filedesc *fdp; + + /* + * Validate the size + */ + printf("GETSYSFD %d %lld\n", uap->type, (long long)uap->size); + if (uap->size < 0) + return(EINVAL); + npages = round_page(uap->size) >> PAGE_SHIFT; + + /* + * Allocate a new descriptor. the descriptor will be returned with a + * reference associated with fd_ofiles[fd]. + * + * XXX falloc() really should return with two references on the desc, + * not one, so it can't be ripped out from under us. + */ + error = falloc(td, &fp, &fd); + if (error) + return(error); + fhold(fp); + FILE_LOCK(fp); + fp->f_flag = FREAD | FWRITE; + + switch(uap->type) { + case SYSFD_MEMORY: + fp->f_type = DTYPE_MEMFD; + fp->f_data = vm_object_allocate(OBJT_DEFAULT, npages); + fp->f_ops = &memfdops; + if (fp->f_data == NULL) + error = ENOMEM; + break; + default: + error = EINVAL; + break; + } + FILE_UNLOCK(fp); + if (error) { + fdp = td->td_proc->p_fd; + FILEDESC_LOCK(fdp); + if (fdp->fd_ofiles[fd] == fp) { + fdp->fd_ofiles[fd] = NULL; + fdp->fd_ofileflags[fd] = 0; + fdrop(fp, td); /* drop ofiles[] array reference */ + if (fd < fdp->fd_freefile) + fdp->fd_freefile = fd; + } + FILEDESC_UNLOCK(fdp); + /* closef(fp, td); NOT NECESSARY */ + } else { + td->td_retval[0] = fd; + } + fdrop(fp, td); /* drop our reference */ + return(error); +} + +/* ARGSUSED */ +static int +memfd_read(fp, uio, active_cred, flags, td) + struct file *fp; + struct uio *uio; + struct ucred *active_cred; + struct thread *td; + int flags; +{ + return(EOPNOTSUPP); +} + +static int +memfd_write(fp, uio, active_cred, flags, td) + struct file *fp; + struct uio *uio; + struct ucred *active_cred; + struct thread *td; + int flags; +{ + return(EOPNOTSUPP); +} + +/* + * we implement a very minimal set of ioctls for compatibility with sockets. 
+ */ +static int +memfd_ioctl(fp, cmd, data, active_cred, td) + struct file *fp; + u_long cmd; + void *data; + struct ucred *active_cred; + struct thread *td; +{ + return(EINVAL); +} + +static int +memfd_poll(fp, events, active_cred, td) + struct file *fp; + int events; + struct ucred *active_cred; + struct thread *td; +{ + return(0); +} + +/* + * We shouldn't need locks here as we're doing a read and this should + * be a natural race. + */ +static int +memfd_stat(fp, ub, active_cred, td) + struct file *fp; + struct stat *ub; + struct ucred *active_cred; + struct thread *td; +{ + return(EOPNOTSUPP); +} + +/* ARGSUSED */ +static int +memfd_close(fp, td) + struct file *fp; + struct thread *td; +{ + vm_object_t object; + + FILE_LOCK(fp); + object = fp->f_data; + fp->f_data = NULL; + FILE_UNLOCK(fp); + + mtx_lock(&Giant); + if (object) + vm_object_deallocate(object); + mtx_unlock(&Giant); + return(0); +} + Index: kern/syscalls.c =================================================================== RCS file: /home/ncvs/src/sys/kern/syscalls.c,v retrieving revision 1.132 diff -u -r1.132 syscalls.c --- kern/syscalls.c 8 Jan 2003 04:57:52 -0000 1.132 +++ kern/syscalls.c 14 Jan 2003 01:58:05 -0000 @@ -2,7 +2,7 @@ * System call names. * * DO NOT EDIT-- this file is automatically generated. 
- * $FreeBSD: src/sys/kern/syscalls.c,v 1.132 2003/01/08 04:57:52 davidxu Exp $ + * $FreeBSD$ * created from FreeBSD: src/sys/kern/syscalls.master,v 1.140 2003/01/04 11:41:12 davidxu Exp */ @@ -436,4 +436,5 @@ "__acl_set_link", /* 426 = __acl_set_link */ "__acl_delete_link", /* 427 = __acl_delete_link */ "__acl_aclcheck_link", /* 428 = __acl_aclcheck_link */ + "getsysfd", /* 429 = getsysfd */ }; Index: kern/syscalls.master =================================================================== RCS file: /home/ncvs/src/sys/kern/syscalls.master,v retrieving revision 1.140 diff -u -r1.140 syscalls.master --- kern/syscalls.master 4 Jan 2003 11:41:12 -0000 1.140 +++ kern/syscalls.master 14 Jan 2003 01:58:03 -0000 @@ -621,6 +621,7 @@ acl_type_t type); } 428 MSTD BSD { int __acl_aclcheck_link(const char *path, \ acl_type_t type, struct acl *aclp); } +429 MSTD BSD { int getsysfd(int type, off_t size); } ; Please copy any additions and changes to the following compatability tables: ; sys/ia64/ia32/syscalls.master (take a best guess) Index: sys/file.h =================================================================== RCS file: /home/ncvs/src/sys/sys/file.h,v retrieving revision 1.59 diff -u -r1.59 file.h --- sys/file.h 13 Jan 2003 00:28:55 -0000 1.59 +++ sys/file.h 14 Jan 2003 02:04:13 -0000 @@ -62,6 +62,7 @@ #define DTYPE_FIFO 4 /* fifo (named pipe) */ #define DTYPE_KQUEUE 5 /* event queue */ #define DTYPE_CRYPTO 6 /* crypto */ +#define DTYPE_MEMFD 7 /* memory descriptor */ #ifdef _KERNEL Index: sys/syscall.h =================================================================== RCS file: /home/ncvs/src/sys/sys/syscall.h,v retrieving revision 1.130 diff -u -r1.130 syscall.h --- sys/syscall.h 8 Jan 2003 04:57:52 -0000 1.130 +++ sys/syscall.h 14 Jan 2003 01:58:05 -0000 @@ -2,7 +2,7 @@ * System call numbers. * * DO NOT EDIT-- this file is automatically generated. 
- * $FreeBSD: src/sys/sys/syscall.h,v 1.130 2003/01/08 04:57:52 davidxu Exp $ + * $FreeBSD$ * created from FreeBSD: src/sys/kern/syscalls.master,v 1.140 2003/01/04 11:41:12 davidxu Exp */ @@ -334,4 +334,5 @@ #define SYS___acl_set_link 426 #define SYS___acl_delete_link 427 #define SYS___acl_aclcheck_link 428 -#define SYS_MAXSYSCALL 429 +#define SYS_getsysfd 429 +#define SYS_MAXSYSCALL 430 Index: sys/syscall.mk =================================================================== RCS file: /home/ncvs/src/sys/sys/syscall.mk,v retrieving revision 1.85 diff -u -r1.85 syscall.mk --- sys/syscall.mk 8 Jan 2003 04:57:52 -0000 1.85 +++ sys/syscall.mk 14 Jan 2003 01:58:05 -0000 @@ -1,6 +1,6 @@ # FreeBSD system call names. # DO NOT EDIT-- this file is automatically generated. -# $FreeBSD: src/sys/sys/syscall.mk,v 1.85 2003/01/08 04:57:52 davidxu Exp $ +# $FreeBSD$ # created from FreeBSD: src/sys/kern/syscalls.master,v 1.140 2003/01/04 11:41:12 davidxu Exp MIASM = \ syscall.o \ @@ -279,4 +279,5 @@ __acl_get_link.o \ __acl_set_link.o \ __acl_delete_link.o \ - __acl_aclcheck_link.o + __acl_aclcheck_link.o \ + getsysfd.o Index: sys/sysfd.h =================================================================== RCS file: sys/sysfd.h diff -N sys/sysfd.h --- /dev/null 1 Jan 1970 00:00:00 -0000 +++ sys/sysfd.h 14 Jan 2003 04:06:19 -0000 @@ -0,0 +1,21 @@ +/* + * $FreeBSD$ + */ + +#ifndef _SYS_SYSFD_H_ +#define _SYS_SYSFD_H_ + +#define SYSFD_MEMORY 1 +#ifdef NOTYET +#define SYSFD_TIMER_SECS 2 +#define SYSFD_TIMER_TENS 3 +#define SYSFD_TIMER_MICRO 4 +#define SYSFD_TIMER_SYS 5 +#define SYSFD_TIMER_REAL 6 +#define SYSFD_TIMER_VIRT 7 +#endif + +#endif /* _SYS_SYSFD_H_ */ + +extern int getsysfd(int type, off_t size); + Index: sys/sysproto.h =================================================================== RCS file: /home/ncvs/src/sys/sys/sysproto.h,v retrieving revision 1.123 diff -u -r1.123 sysproto.h --- sys/sysproto.h 8 Jan 2003 04:57:53 -0000 1.123 +++ sys/sysproto.h 14 Jan 2003 01:58:05 
-0000 @@ -2,7 +2,7 @@ * System call prototypes. * * DO NOT EDIT-- this file is automatically generated. - * $FreeBSD: src/sys/sys/sysproto.h,v 1.123 2003/01/08 04:57:53 davidxu Exp $ + * $FreeBSD$ * created from FreeBSD: src/sys/kern/syscalls.master,v 1.140 2003/01/04 11:41:12 davidxu Exp */ @@ -1223,6 +1223,10 @@ char type_l_[PADL_(acl_type_t)]; acl_type_t type; char type_r_[PADR_(acl_type_t)]; char aclp_l_[PADL_(struct acl *)]; struct acl * aclp; char aclp_r_[PADR_(struct acl *)]; }; +struct getsysfd_args { + char type_l_[PADL_(int)]; int type; char type_r_[PADR_(int)]; + char size_l_[PADL_(off_t)]; off_t size; char size_r_[PADR_(off_t)]; +}; int nosys(struct thread *, struct nosys_args *); void sys_exit(struct thread *, struct sys_exit_args *); int fork(struct thread *, struct fork_args *); @@ -1499,6 +1503,7 @@ int __acl_set_link(struct thread *, struct __acl_set_link_args *); int __acl_delete_link(struct thread *, struct __acl_delete_link_args *); int __acl_aclcheck_link(struct thread *, struct __acl_aclcheck_link_args *); +int getsysfd(struct thread *, struct getsysfd_args *); #ifdef COMPAT_43 Index: vm/vm_extern.h =================================================================== RCS file: /home/ncvs/src/sys/vm/vm_extern.h,v retrieving revision 1.59 diff -u -r1.59 vm_extern.h --- vm/vm_extern.h 24 Jul 2002 19:47:56 -0000 1.59 +++ vm/vm_extern.h 14 Jan 2003 03:12:06 -0000 @@ -80,6 +80,7 @@ void vm_forkproc(struct thread *, struct proc *, struct thread *, int); void vm_waitproc(struct proc *); int vm_mmap(vm_map_t, vm_offset_t *, vm_size_t, vm_prot_t, vm_prot_t, int, void *, vm_ooffset_t); +int vm_mmap_object(vm_map_t, vm_offset_t *, vm_size_t, vm_prot_t, vm_prot_t, int, vm_object_t, vm_ooffset_t); vm_offset_t vm_page_alloc_contig(vm_offset_t, vm_offset_t, vm_offset_t, vm_offset_t); void vm_set_page_size(void); struct vmspace *vmspace_alloc(vm_offset_t, vm_offset_t); Index: vm/vm_mmap.c =================================================================== RCS 
file: /home/ncvs/src/sys/vm/vm_mmap.c,v retrieving revision 1.155 diff -u -r1.155 vm_mmap.c --- vm/vm_mmap.c 13 Jan 2003 00:28:55 -0000 1.155 +++ vm/vm_mmap.c 14 Jan 2003 03:55:15 -0000 @@ -201,7 +201,7 @@ struct thread *td; struct mmap_args *uap; { - struct file *fp = NULL; + struct file *fp; struct vnode *vp; vm_offset_t addr; vm_size_t size, pageoff; @@ -264,49 +264,101 @@ return (EINVAL); if (addr + size < addr) return (EINVAL); - } - /* - * XXX for non-fixed mappings where no hint is provided or - * the hint would fall in the potential heap space, - * place it after the end of the largest possible heap. - * - * There should really be a pmap call to determine a reasonable - * location. - */ - else if (addr == 0 || + } else if (addr == 0 || (addr >= round_page((vm_offset_t)vms->vm_taddr) && - addr < round_page((vm_offset_t)vms->vm_daddr + maxdsiz))) + addr < round_page((vm_offset_t)vms->vm_daddr + maxdsiz))) { + /* + * XXX for non-fixed mappings where no hint is provided or + * the hint would fall in the potential heap space, + * place it after the end of the largest possible heap. + * + * There should really be a pmap call to determine a reasonable + * location. + */ addr = round_page((vm_offset_t)vms->vm_daddr + maxdsiz); + } mtx_lock(&Giant); /* syscall marked mp-safe but isn't */ + + /* + * Do not allow more then a certain number of vm_map_entry structures + * per process. Scale with the number of rforks sharing the map + * to make the limit reasonable for threads. + */ + if (max_proc_mmap && + vms->vm_map.nentries >= max_proc_mmap * vms->vm_refcnt) { + error = ENOMEM; + goto done; + } + + /* + * Extract the file descriptor (if not an anonymous mmap) + */ if (flags & MAP_ANON) { /* * Mapping blank space is trivial. */ - handle = NULL; maxprot = VM_PROT_ALL; pos = 0; } else { /* - * Mapping file, get fp for validation. Obtain vnode and make - * sure it is of appropriate type. 
- * don't let the descriptor disappear on us if we block + * Mapping a file descriptor. Reference the fp so it does + * not go away on us. */ if ((error = fget(td, uap->fd, &fp)) != 0) goto done; - if (fp->f_type != DTYPE_VNODE) { - error = EINVAL; - goto done; - } /* - * POSIX shared-memory objects are defined to have - * kernel persistence, and are not defined to support - * read(2)/write(2) -- or even open(2). Thus, we can - * use MAP_ASYNC to trade on-disk coherence for speed. - * The shm_open(3) library routine turns on the FPOSIXSHM - * flag to request this behavior. + * Ensure that file and memory protections are + * compatible. Note that we only worry about + * writability if mapping is shared; in this case, + * current and max prot are dictated by the open file. + * XXX use the vnode instead? Problem is: what + * credentials do we use for determination? What if + * proc does a setuid? */ + maxprot = VM_PROT_EXECUTE; /* ??? */ + if (fp->f_flag & FREAD) { + maxprot |= VM_PROT_READ; + } else if (prot & PROT_READ) { + error = EACCES; + goto done; + } + } + + /* + * Handle MEMFD descriptors. These reference the VM object directly. + */ + if (fp && fp->f_type == DTYPE_MEMFD && fp->f_data) { + mtx_unlock(&Giant); + obj = fp->f_data; + vm_object_reference(obj); + error = vm_mmap_object(&vms->vm_map, &addr, size, prot, + maxprot, flags, obj, pos); + if (error == 0) + td->td_retval[0] = (register_t) (addr + pageoff); + mtx_lock(&Giant); + vm_object_deallocate(obj); + goto done2; + } + + /* + * Otherwise it must be an anonymous mapping or a VNODE + */ + if (fp != NULL && fp->f_type != DTYPE_VNODE) { + error = EINVAL; + goto done; + } + + /* + * POSIX shared-memory objects are defined to have + * kernel persistence, and are not defined to support + * read(2)/write(2) -- or even open(2). Thus, we can + * use MAP_ASYNC to trade on-disk coherence for speed. + * The shm_open(3) library routine turns on the FPOSIXSHM + * flag to request this behavior. 
+ */ + if (fp) { if (fp->f_flag & FPOSIXSHM) flags |= MAP_NOSYNC; vp = fp->f_data; @@ -363,22 +415,7 @@ error = EINVAL; goto done; } - /* - * Ensure that file and memory protections are - * compatible. Note that we only worry about - * writability if mapping is shared; in this case, - * current and max prot are dictated by the open file. - * XXX use the vnode instead? Problem is: what - * credentials do we use for determination? What if - * proc does a setuid? - */ - maxprot = VM_PROT_EXECUTE; /* ??? */ - if (fp->f_flag & FREAD) { - maxprot |= VM_PROT_READ; - } else if (prot & PROT_READ) { - error = EACCES; - goto done; - } + /* * If we are sharing potential changes (either via * MAP_SHARED or via the implicit sharing of character @@ -414,17 +451,8 @@ handle = (void *)vp; } - } - - /* - * Do not allow more then a certain number of vm_map_entry structures - * per process. Scale with the number of rforks sharing the map - * to make the limit reasonable for threads. - */ - if (max_proc_mmap && - vms->vm_map.nentries >= max_proc_mmap * vms->vm_refcnt) { - error = ENOMEM; - goto done; + } else { + handle = NULL; } mtx_unlock(&Giant); @@ -444,10 +472,10 @@ done: if (vp) vput(vp); +done2: mtx_unlock(&Giant); if (fp) fdrop(fp, td); - return (error); } @@ -1272,3 +1300,102 @@ return (EINVAL); } } + +/* + * vm_mmap_object() + * + * MPSAFE + * + * Internal version of mmap that directly operates on a VM object. + * Currently used by mmap. + */ +int +vm_mmap_object(vm_map_t map, vm_offset_t *addr, vm_size_t size, vm_prot_t prot, + vm_prot_t maxprot, int flags, vm_object_t object, vm_ooffset_t foff) +{ + boolean_t fitit; + int rv = KERN_SUCCESS; + int docow; + struct thread *td = curthread; + + if (size == 0) + return (0); + + size = round_page(size); + + if (td->td_proc->p_vmspace->vm_map.size + size > + td->td_proc->p_rlimit[RLIMIT_VMEM].rlim_cur) { + return(ENOMEM); + } + + /* + * We currently can only deal with page aligned file offsets. 
+ * The check is here rather than in the syscall because the + * kernel calls this function internally for other mmaping + * operations (such as in exec) and non-aligned offsets will + * cause pmap inconsistencies...so we want to be sure to + * disallow this in all cases. + */ + if (foff & PAGE_MASK) + return (EINVAL); + + if ((flags & MAP_FIXED) == 0) { + fitit = TRUE; + *addr = round_page(*addr); + } else { + if (*addr != trunc_page(*addr)) + return (EINVAL); + fitit = FALSE; + (void) vm_map_remove(map, *addr, *addr + size); + } + + docow = MAP_PREFAULT_PARTIAL; + + if ((flags & (MAP_ANON|MAP_SHARED)) == 0) + docow |= MAP_COPY_ON_WRITE; + if (flags & MAP_NOCORE) + docow |= MAP_DISABLE_COREDUMP; + +#if defined(VM_PROT_READ_IS_EXEC) + if (prot & VM_PROT_READ) + prot |= VM_PROT_EXECUTE; + + if (maxprot & VM_PROT_READ) + maxprot |= VM_PROT_EXECUTE; +#endif + + if (fitit) + *addr = pmap_addr_hint(object, *addr, size); + + vm_object_reference(object); + if (flags & MAP_STACK) { + rv = vm_map_stack (map, *addr, size, prot, maxprot, docow); + } else { + rv = vm_map_find(map, object, foff, addr, size, fitit, + prot, maxprot, docow); + } + if (rv != KERN_SUCCESS) + vm_object_deallocate(object); + + switch(rv) { + case KERN_SUCCESS: + if (flags & MAP_SHARED) { + /* + * Shared memory is also shared with children. 
+ */ + rv = vm_map_inherit(map, *addr, *addr + size, + VM_INHERIT_SHARE); + if (rv != KERN_SUCCESS) + (void)vm_map_remove(map, *addr, *addr + size); + } + return(0); + case KERN_INVALID_ADDRESS: + case KERN_NO_SPACE: + return (ENOMEM); + case KERN_PROTECTION_FAILURE: + return (EACCES); + default: + return (EINVAL); + } +} + To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Jan 13 20:14:15 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 50B0137B401 for ; Mon, 13 Jan 2003 20:14:13 -0800 (PST) Received: from canning.wemm.org (canning.wemm.org [192.203.228.65]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0BF9043F13 for ; Mon, 13 Jan 2003 20:14:13 -0800 (PST) (envelope-from peter@wemm.org) Received: from wemm.org (localhost [127.0.0.1]) by canning.wemm.org (Postfix) with ESMTP id DDFC32A89E; Mon, 13 Jan 2003 20:14:07 -0800 (PST) (envelope-from peter@wemm.org) X-Mailer: exmh version 2.5 07/13/2001 with nmh-1.0.4 To: "Alan L. Cox" Cc: Matthew Dillon , arch@FreeBSD.ORG Subject: Re: Virtual memory question In-Reply-To: <3E2381F8.85BB90A0@imimic.com> Date: Mon, 13 Jan 2003 20:14:07 -0800 From: Peter Wemm Message-Id: <20030114041407.DDFC32A89E@canning.wemm.org> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG "Alan L. Cox" wrote: > Peter Wemm wrote: > > > > "Alan L. Cox" wrote: > > > Matthew Dillon wrote: > > > > ... > > > > How about something like: > > > > > > > > getmemfd(). > > > > > > > > > > Roughly speaking, this is shm_open(3), which we currently implement > > > using files. > > > > .. which is expressly what I wanted to avoid. > > > > Your response is ambiguous. 
:-)  It doesn't say whether you want to
> avoid shm_open(3) the interface or rather FreeBSD's implementation of
> it.  Personally, I'm all for changing the implementation in the way Matt
> describes, but I haven't yet heard a rationale for a new interface.
> Specifically, the interface proposed thus far could be emulated by
>
>         fd = shm_open("unique name", ...);
>         shm_unlink("unique name");
>
> The spec also seems to make the creation of unique names easy: "If name
> does not begin with the slash character, the effect is
> implementation-dependent."  So, a per-process name space is allowed for
> names not beginning with slash.
>
> Furthermore, the only operations that I know of on a "path" are
> shm_open() and shm_unlink(), and my reading of those was that a hash
> table keyed on the "path" was a legal implementation.
>
> In summary, a new implementation would be good, but I haven't seen the
> rationale for a new interface, especially given that shm_open(3) is an
> existing standard.

Sorry about the ambiguity.

My problem with the shm_*() calls is that the API is pretty heavily tied to
the file system.  If there is a way to avoid that, then fine.  It looks
like one needs to ftruncate() it to resize a shm_open() object.
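
The emulation discussed above (shm_open() followed at once by
shm_unlink(), then ftruncate() to size the object) might look roughly
like the sketch below.  It uses only POSIX calls; the helper names
anon_shm_fd() and anon_shm_map() and the "/anon-shm-<pid>" name format
are made up for illustration and are not part of any patch in this
thread.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: return a descriptor for a nameless shared-memory
 * object of `size` bytes, or -1 on failure.  The name is only needed for
 * the instant between shm_open() and shm_unlink().
 */
int
anon_shm_fd(size_t size)
{
	char name[64];
	int fd;

	assert(size > 0);
	snprintf(name, sizeof(name), "/anon-shm-%ld", (long)getpid());
	fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0600);
	if (fd == -1)
		return (-1);
	/* Drop the name; the object persists while the descriptor is open. */
	shm_unlink(name);
	/* A freshly created object has size zero, so give it a size. */
	if (ftruncate(fd, (off_t)size) == -1) {
		close(fd);
		return (-1);
	}
	return (fd);
}

/* Hypothetical helper: map the whole object shared; NULL on failure. */
void *
anon_shm_map(int fd, size_t size)
{
	void *p;

	p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	return (p == MAP_FAILED ? NULL : p);
}
```

The descriptor can then be inherited across fork() or passed over a
UNIX-domain socket, and each holder maps it with MAP_SHARED.  Whether
the name ever shows up in the file system is left to the
implementation, which is the point being argued above.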
Cheers, -Peter -- Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com "All of this is for nothing if we don't go to the stars" - JMS/B5 To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Jan 13 20:15:21 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3BC9437B401 for ; Mon, 13 Jan 2003 20:15:20 -0800 (PST) Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id D7A7743EB2 for ; Mon, 13 Jan 2003 20:15:19 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: from apollo.backplane.com (localhost [127.0.0.1]) by apollo.backplane.com (8.12.6/8.12.6) with ESMTP id h0E4FE0i078073; Mon, 13 Jan 2003 20:15:14 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: (from dillon@localhost) by apollo.backplane.com (8.12.6/8.12.6/Submit) id h0E4FEuJ078072; Mon, 13 Jan 2003 20:15:14 -0800 (PST) Date: Mon, 13 Jan 2003 20:15:14 -0800 (PST) From: Matthew Dillon Message-Id: <200301140415.h0E4FEuJ078072@apollo.backplane.com> To: "Alan L. Cox" Cc: Peter Wemm , arch@FreeBSD.ORG Subject: Re: Virtual memory question References: <20030114002831.1C8C12A89E@canning.wemm.org> <3E2381F8.85BB90A0@imimic.com> <200301140339.h0E3dVQa073160@apollo.backplane.com> <3E238DEF.14DFA7E1@imimic.com> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG :I'm happy with the NULL path as an extension to this interface. : :> (2) I don't see how/where one specifies the size of the memory object :> in shm_open(). Does this mean we have to implement ftruncate()? : :I think the size is implied by the mmap()ing. A second, larger :mmap()ing would have to grow the object. An object should never shrink. 
:
:Regards,
:Alan

"Ick".  But it would be extremely easy to implement that sort of
auto-grow.  I'll read up on the shm_open() spec.  Personally speaking
I'd like an explicit size to be specified during the open/creation phase.

It occurs to me that we could use this to implement far better
MFS / MD support than we have now.  Hrmmm.

-Matt
Matthew Dillon

From owner-freebsd-arch Mon Jan 13 20:26: 2 2003
Message-ID: <3E239150.9FC363DD@imimic.com>
Date: Mon, 13 Jan 2003 22:25:52 -0600
From: "Alan L. Cox"
Organization: iMimic Networking, Inc.
To: Peter Wemm
Cc: Matthew Dillon, arch@FreeBSD.ORG
Subject: Re: Virtual memory question
References: <20030114041407.DDFC32A89E@canning.wemm.org>

Peter Wemm wrote:
> ...
> Sorry about the ambiguity.

No problem. :-)

> My problem with the shm_*() calls is that the API is pretty heavily tied to
> the file system.  If there is a way to avoid that, then fine.
It looks > like one needs to ftruncate() it to resize a shm_open() object. From the spec (on the web page that I mentioned): "The name argument points to a string naming a shared memory object. It is unspecified whether the name appears in the file system and is visible to other functions that take pathnames as arguments." I don't think ftruncate() is necessary. The underlying shm object can be grown implicitly according to its mmap()ings. I do not, however, know of a way to shrink an shm object. Alan To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Jan 13 20:33:49 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5DBDF37B53A for ; Mon, 13 Jan 2003 20:33:48 -0800 (PST) Received: from eumenes.hosting.swbell.net (eumenes.hosting.swbell.net [216.100.98.7]) by mx1.FreeBSD.org (Postfix) with ESMTP id 98B0D43F13 for ; Mon, 13 Jan 2003 20:33:47 -0800 (PST) (envelope-from alc@imimic.com) Received: from imimic.com (adsl-216-63-78-18.dsl.hstntx.swbell.net [216.63.78.18]) by eumenes.hosting.swbell.net id XAA08863; Mon, 13 Jan 2003 23:33:45 -0500 (EST) [ConcentricHost SMTP Relay 1.14] Message-ID: <3E239327.9E4F6E84@imimic.com> Date: Mon, 13 Jan 2003 22:33:43 -0600 From: "Alan L. Cox" Organization: iMimic Networking, Inc. 
X-Mailer: Mozilla 4.8 [en] (X11; U; Linux 2.4.2 i386) X-Accept-Language: en MIME-Version: 1.0 To: Matthew Dillon Cc: Peter Wemm , arch@FreeBSD.ORG Subject: Re: Virtual memory question References: <20030114002831.1C8C12A89E@canning.wemm.org> <3E2381F8.85BB90A0@imimic.com> <200301140339.h0E3dVQa073160@apollo.backplane.com> <3E238DEF.14DFA7E1@imimic.com> <200301140415.h0E4FEuJ078072@apollo.backplane.com> Content-Type: text/plain; charset=x-user-defined Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Matthew Dillon wrote: > ... > "Ick". But it would be extremely easy to implement that sort of > auto-grow. I'll read up on the shm_open() spec. Personally speaking > I'd like an explicit size to be specified during the open/creation phase. > Ick!?! We already and routinely grow vm objects. :-) Alan To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Jan 13 22: 6:22 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 444B237B401 for ; Mon, 13 Jan 2003 22:06:21 -0800 (PST) Received: from soulshock.mail.pas.earthlink.net (soulshock.mail.pas.earthlink.net [207.217.120.130]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9FD6043EB2 for ; Mon, 13 Jan 2003 22:06:20 -0800 (PST) (envelope-from tlambert2@mindspring.com) Received: from heron (heron.mail.pas.earthlink.net [207.217.120.189]) by soulshock.mail.pas.earthlink.net (8.11.6+Sun/8.11.6) with ESMTP id h0E5WcH23113; Mon, 13 Jan 2003 21:32:38 -0800 (PST) Received: from pool0481.cvx21-bradley.dialup.earthlink.net ([209.179.193.226] helo=mindspring.com) by heron with asmtp (SSLv3:RC4-MD5:128) (Exim 3.33 #1) id 18YJgJ-0005tE-00; Mon, 13 Jan 2003 21:32:12 -0800 Message-ID: 
<3E23A086.FC511354@mindspring.com> Date: Mon, 13 Jan 2003 21:30:46 -0800 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Matthew Dillon Cc: "Alan L. Cox" , Peter Wemm , arch@FreeBSD.ORG Subject: Re: getsysfd() patch #1 (Re: Virtual memory question) References: <20030114002831.1C8C12A89E@canning.wemm.org> <3E2381F8.85BB90A0@imimic.com> <200301140411.h0E4BgpN078032@apollo.backplane.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a4eff00ac4d4cdc9d02d2897bf21b0334ea8438e0f32a48e08350badd9bab72f9c350badd9bab72f9c Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Matthew Dillon wrote: > > This is a first-attempt workup of getsysfd(). See? I told ya it was > trivial! [ ... ] What does this uniquely do, which can be done no other way, again? 
-- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Jan 14 0:51:15 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id CAA8F37B401 for ; Tue, 14 Jan 2003 00:51:13 -0800 (PST) Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6867643ED8 for ; Tue, 14 Jan 2003 00:51:13 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: from apollo.backplane.com (localhost [127.0.0.1]) by apollo.backplane.com (8.12.6/8.12.6) with ESMTP id h0E8p70i078883; Tue, 14 Jan 2003 00:51:08 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: (from dillon@localhost) by apollo.backplane.com (8.12.6/8.12.6/Submit) id h0E8p78U078882; Tue, 14 Jan 2003 00:51:07 -0800 (PST) Date: Tue, 14 Jan 2003 00:51:07 -0800 (PST) From: Matthew Dillon Message-Id: <200301140851.h0E8p78U078882@apollo.backplane.com> To: Terry Lambert Cc: "Alan L. Cox" , Peter Wemm , arch@FreeBSD.ORG Subject: Re: getsysfd() patch #1 (Re: Virtual memory question) References: <20030114002831.1C8C12A89E@canning.wemm.org> <3E2381F8.85BB90A0@imimic.com> <200301140411.h0E4BgpN078032@apollo.backplane.com> <3E23A086.FC511354@mindspring.com> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG : :Matthew Dillon wrote: :> :> This is a first-attempt workup of getsysfd(). See? I told ya it was :> trivial! : :[ ... ] : :What does this uniquely do, which can be done no other way, again? : :-- Terry What Peter asked for (and what this does) is give you a descriptor that is associated with just a VM Object. You can then mmap() the descriptor, pass it to other processes and they can mmap() it too. 
It's a way of doing swap-backed shared memory without having to deal with the filesystem. The only other solutions are: * SysV shared memory, which is not fun to manage and may have weird size limitations, and has a race condition where a process dying at just the wrong time may leave a shmem segment lying around in the kernel. * A file, which uses the filesystem as backing store. Even with MAP_NOSYNC a shared file still uses the filesystem as backing store. This is typically not what is desired. Another thing I would like to do is descriptor-based timers. So instead of being limited to just the stupid itimers, or interfering with other threads/libraries use of [i]timers, you can simply allocate your own by getting a timer descriptor and then doing cool things with it, like having it generate a custom signal or selecting on it or kqueue'ing on it etc... it's something UNIX has needed for a long time actually. -Matt Matthew Dillon To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Jan 14 1:47:52 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3BCBD37B401 for ; Tue, 14 Jan 2003 01:47:51 -0800 (PST) Received: from park.rambler.ru (park.rambler.ru [81.19.64.101]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0C75D43F5B for ; Tue, 14 Jan 2003 01:47:49 -0800 (PST) (envelope-from is@rambler-co.ru) Received: from is.park.rambler.ru (is.park.rambler.ru [81.19.64.102]) by park.rambler.ru (8.12.6/8.12.6) with ESMTP id h0E9lZQB095344; Tue, 14 Jan 2003 12:47:35 +0300 (MSK) Date: Tue, 14 Jan 2003 12:47:35 +0300 (MSK) From: Igor Sysoev X-Sender: is@is To: Matthew Dillon Cc: arch@FreeBSD.ORG Subject: Re: getsysfd() patch #1 (Re: Virtual memory question) In-Reply-To: <200301140851.h0E8p78U078882@apollo.backplane.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII 

On Tue, 14 Jan 2003, Matthew Dillon wrote:

> Another thing I would like to do is descriptor-based timers.  So instead
> of being limited to just the stupid itimers, or interfering with other
> threads/libraries use of [i]timers, you can simply allocate your own by
> getting a timer descriptor and then doing cool things with it, like
> having it generate a custom signal or selecting on it or kqueue'ing on
> it etc... it's something UNIX has needed for a long time actually.

kqueue already has EVFILT_TIMER in __FreeBSD_version >= 440001 and
__FreeBSD_version >= 500023.  Descriptor-based timers would be a
non-standard feature, and if you use non-standard features then you
should use kqueue instead of poll or select.

Nevertheless, it seems to me that using many kernel timers is not a good
thing if you need to set or delete them frequently (e.g. in web servers).
It's much better to use a user-level timer queue and call
kqueue/poll/select with the timeout value from the head of this queue.
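
A user-level timer queue of the kind suggested above could be sketched
as follows.  This is only an illustration under assumed names (struct
utimer and the utimer_*() functions are hypothetical): timers sit in a
list sorted by absolute expiry, and the only thing handed to the
kernel is the poll() timeout computed from the head of the list.

```c
#include <assert.h>
#include <limits.h>
#include <poll.h>
#include <stddef.h>

struct utimer {
	long		 expire_ms;	/* absolute expiry on the caller's clock */
	void		(*fn)(void *);	/* callback run on expiry */
	void		*arg;
	struct utimer	*next;
};

/* Example callback: bump the int counter that arg points to. */
void
utimer_count_cb(void *arg)
{
	++*(int *)arg;
}

/* Insert t keeping the list sorted by expiry (O(n), no system call). */
void
utimer_add(struct utimer **head, struct utimer *t)
{
	assert(t->fn != NULL);
	while (*head != NULL && (*head)->expire_ms <= t->expire_ms)
		head = &(*head)->next;
	t->next = *head;
	*head = t;
}

/* Remove t if it is queued (again no system call). */
void
utimer_del(struct utimer **head, struct utimer *t)
{
	for (; *head != NULL; head = &(*head)->next) {
		if (*head == t) {
			*head = t->next;
			t->next = NULL;
			return;
		}
	}
}

/* poll() timeout: -1 (block) if no timers, else ms until the head expires. */
int
utimer_timeout(const struct utimer *head, long now_ms)
{
	long d;

	if (head == NULL)
		return (-1);
	d = head->expire_ms - now_ms;
	if (d < 0)
		d = 0;
	if (d > INT_MAX)
		d = INT_MAX;
	return ((int)d);
}

/* Run and unlink every timer whose expiry has passed; return the count. */
int
utimer_run(struct utimer **head, long now_ms)
{
	int n = 0;

	while (*head != NULL && (*head)->expire_ms <= now_ms) {
		struct utimer *t = *head;

		*head = t->next;
		t->next = NULL;
		t->fn(t->arg);
		n++;
	}
	return (n);
}

/* One event-loop wait: the kernel only ever sees the earliest timeout. */
int
utimer_wait(struct pollfd *fds, nfds_t nfds, const struct utimer *head,
    long now_ms)
{
	return (poll(fds, nfds, utimer_timeout(head, now_ms)));
}
```

An event loop would call utimer_wait(), handle any ready descriptors,
re-read the clock and then call utimer_run().  A sorted list keeps the
sketch short; a heap would serve the same purpose with cheaper
insertion.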
Igor Sysoev http://sysoev.ru/en/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Jan 14 2: 0:23 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8348837B405 for ; Tue, 14 Jan 2003 02:00:22 -0800 (PST) Received: from ns1.xcllnt.net (209-128-86-226.BAYAREA.NET [209.128.86.226]) by mx1.FreeBSD.org (Postfix) with ESMTP id B130943F43 for ; Tue, 14 Jan 2003 02:00:21 -0800 (PST) (envelope-from marcel@xcllnt.net) Received: from dhcp01.pn.xcllnt.net (dhcp01.pn.xcllnt.net [192.168.4.201]) by ns1.xcllnt.net (8.12.6/8.12.6) with ESMTP id h0EA0F2G046464; Tue, 14 Jan 2003 02:00:15 -0800 (PST) (envelope-from marcel@piii.pn.xcllnt.net) Received: from dhcp01.pn.xcllnt.net (localhost [127.0.0.1]) by dhcp01.pn.xcllnt.net (8.12.6/8.12.6) with ESMTP id h0EA0NmY017860; Tue, 14 Jan 2003 02:00:23 -0800 (PST) (envelope-from marcel@dhcp01.pn.xcllnt.net) Received: (from marcel@localhost) by dhcp01.pn.xcllnt.net (8.12.6/8.12.6/Submit) id h0EA0Nfc017859; Tue, 14 Jan 2003 02:00:23 -0800 (PST) (envelope-from marcel) Date: Tue, 14 Jan 2003 02:00:22 -0800 From: Marcel Moolenaar To: "Alan L. Cox" Cc: Peter Wemm , Matthew Dillon , arch@FreeBSD.ORG Subject: Re: Virtual memory question Message-ID: <20030114100022.GA17799@dhcp01.pn.xcllnt.net> References: <20030114041407.DDFC32A89E@canning.wemm.org> <3E239150.9FC363DD@imimic.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3E239150.9FC363DD@imimic.com> User-Agent: Mutt/1.5.1i Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Mon, Jan 13, 2003 at 10:25:52PM -0600, Alan L. 
Cox wrote: > > >From the spec (on the web page that I mentioned): > > "The name argument points to a string naming a shared memory object. It > is unspecified whether the name appears in the file system and is > visible to other functions that take pathnames as arguments." > > I don't think ftruncate() is necessary. The underlying shm object can > be grown implicitly according to its mmap()ings. I do not, however, > know of a way to shrink an shm object. shm_open("key", O_TRUNC|..., ...) ? -- Marcel Moolenaar USPA: A-39004 marcel@xcllnt.net To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Jan 14 3: 0:39 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 639B437B401 for ; Tue, 14 Jan 2003 03:00:38 -0800 (PST) Received: from bluejay.mail.pas.earthlink.net (bluejay.mail.pas.earthlink.net [207.217.120.218]) by mx1.FreeBSD.org (Postfix) with ESMTP id E6CAF43F18 for ; Tue, 14 Jan 2003 03:00:37 -0800 (PST) (envelope-from tlambert2@mindspring.com) Received: from pool0016.cvx40-bradley.dialup.earthlink.net ([216.244.42.16] helo=mindspring.com) by bluejay.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128) (Exim 3.33 #1) id 18YOo3-0004tb-00; Tue, 14 Jan 2003 03:00:32 -0800 Message-ID: <3E23ED80.5C5832BC@mindspring.com> Date: Tue, 14 Jan 2003 02:59:12 -0800 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Igor Sysoev Cc: Matthew Dillon , arch@FreeBSD.ORG Subject: Re: getsysfd() patch #1 (Re: Virtual memory question) References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a42230319ac6a1255e509fc33c97650005667c3043c0873f7e350badd9bab72f9c350badd9bab72f9c Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) 

Igor Sysoev wrote:
> kqueue already has EVFILT_TIMER in __FreeBSD_version >= 440001 and
> __FreeBSD_version >= 500023.  Descriptor-based timers would be non-standard
> feature and if you use non-standard features then you should use kqueue
> instead of poll or select.

Most of the blue-sky ideas in this thread are already implemented,
like the kevents that are sent up after registering an interest
in file deletion/creation or directory deletion/creation on an fd
open to a particular directory, without needing to add yet another
interface to support it...

> Nevertheless it seems to me that using many kernel timers is not good
> thing if you need frequently to set or delete them (i.e. in web-servers).
> It's much better to use user-level timer queue and call
> kqueue/poll/select with timeout value from the head of this queue.

This is actually a bogus argument against it.  It turns out that
for timers, they work better if they are cancelled before they
ever fire, because cancellation is by reference, whereas firing
is by traversal.

The current implementation of timers is actually not very good, in
general, for a large number of timers, because the lists in the
callout wheel are not sorted.  If they were, even a very long list
would only need to be traversed up to the first non-expired event;
instead, each slot in the callout wheel has to have all its entries
traversed to see whether they have expired.

We were much better off with fixed interval timers, back in BSD 4.2
and 4.3, and the change to a callout wheel is a recent thing (there
are actually some idiots who believe they have invented the idea of
fixed interval timers to get around this problem, when there's
actually over 22 years of prior art).
In any case, it's much more bogus to argue against timers that never fire, than it is to argue against timers that do fire -- the 2MSL timers used everywhere in the TCP stack are actually timers that, in the common case, never actually fire. If you want to argue against non-firing timers, you'd need to revert the change to the TCP stack that moved it to the callout wheel based timers, back in the mid/late 1990's. -- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Jan 14 3:10:23 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id A3EF237B401 for ; Tue, 14 Jan 2003 03:10:21 -0800 (PST) Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1414743F13 for ; Tue, 14 Jan 2003 03:10:21 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: from apollo.backplane.com (localhost [127.0.0.1]) by apollo.backplane.com (8.12.6/8.12.6) with ESMTP id h0EBAK0i079776; Tue, 14 Jan 2003 03:10:20 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: (from dillon@localhost) by apollo.backplane.com (8.12.6/8.12.6/Submit) id h0EBAKT0079775; Tue, 14 Jan 2003 03:10:20 -0800 (PST) Date: Tue, 14 Jan 2003 03:10:20 -0800 (PST) From: Matthew Dillon Message-Id: <200301141110.h0EBAKT0079775@apollo.backplane.com> To: Igor Sysoev Cc: arch@FreeBSD.ORG Subject: Re: getsysfd() patch #1 (Re: Virtual memory question) References: Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG :kqueue already has EVFILT_TIMER in __FreeBSD_version >= 440001 and :__FreeBSD_version >= 500023. 
Descriptor-based timers would be non-standard
:feature and if you use non-standard features then you should use kqueue
:instead of poll or select.
:
:Nevertheless it seems to me that using many kernel timers is not good
:thing if you need frequently to set or delete them (i.e. in web-servers).
:It's much better to use user-level timer queue and call
:kqueue/poll/select with timeout value from the head of this queue.
:
:
:Igor Sysoev
:http://sysoev.ru/en/

There's really nothing wrong with a large number of kernel supported
timers.  One of my telemetry systems probably has a thousand
kernel supported timers operating on a 20MHz 68000.  i.e. not an
issue if done right.

That said, a per-descriptor timer implementation would have no more or
less overhead than the kqueue implementation.  The software designer
isn't being forced to use one descriptor based timer for each soft
timer he wants, after all.

The kqueue timer is rather ad-hoc.  It's not nearly sophisticated
enough.  The absolute minimum timer and timing support I throw into
my embedded systems is:

* seconds, 1/10 seconds, realtime_seconds, ticks, fine-grained-ticks
  (typically the best hardware resolution available).  (I'd also
  add microseconds for a UNIX implementation)

* one-shot, periodic

* on-the-fly adjust of current count (forwards or backwards),
  double-buffer (set reload value without affecting current
  countdown), start, stop, reset, reload-absolute, reload-relative.

* Software interrupt (could be thought of as an upcall I suppose)
  on completion, or signal, or some other completion mechanism
  (kqueue).

When I'm talking about timers, I mean the real deal.  Not the pipsqueak
little timers implemented by kqueue.

-Matt
Matthew Dillon

From owner-freebsd-arch Tue Jan 14 3:30:20 2003
Date: Tue, 14 Jan 2003 14:30:15 +0300 (MSK)
From: Igor Sysoev
To: Terry Lambert
Cc: arch@FreeBSD.ORG
Subject: Re: getsysfd() patch #1 (Re: Virtual memory question)
In-Reply-To: <3E23ED80.5C5832BC@mindspring.com>

On Tue, 14 Jan 2003, Terry Lambert wrote:

> Igor Sysoev wrote:
> > Nevertheless it seems to me that using many kernel timers is not good
> > thing if you need frequently to set or delete them (i.e. in web-servers).
> > It's much better to use user-level timer queue and call
> > kqueue/poll/select with timeout value from the head of this queue.
>
> This is actually a bogus argument against it.  It turns out that

I do not want to say that kernel-based timers are useless or too expensive.
I only want to say that if an application needs to set and delete timers
very often, then it's much better to set and delete them in user space,
because most of them will never fire and calling into the kernel to set or
cancel them is expensive.

> for timers, they work better if they are cancelled before they
> ever fire, because cancellation is by reference, whereas firing
> is by traversal.

In the server that I'm developing I use delta values between timeouts, so
firing is as cheap as cancellation.  Firing timers are at the head of the
queue.

> In any case, it's much more bogus to argue against timers that
> never fire, than it is to argue against timers that do fire --
> the 2MSL timers used everywhere in the TCP stack are actually
> timers that, in the common case, never actually fire.  If you
> want to argue against non-firing timers, you'd need to revert
> the change to the TCP stack that moved it to the callout wheel
> based timers, back in the mid/late 1990's.

I did not argue against the previous, current or future implementation of
timers in the kernel.  I mean that if a user-level application needs to
set thousands of timers (e.g. a web server with thousands of connections),
it's better to manage them in user space and set only the single earliest
timer in the kernel via kqueue/select/poll.
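
The message above does not show its delta scheme, so the following is
only a guess at the general technique, with hypothetical names (struct
dtimer, struct dqueue, dq_*()).  Each queued timer stores its timeout
relative to the timer in front of it; advancing time touches only the
head, and cancelling is a constant-time unlink that hands the cancelled
delta to the successor.

```c
#include <assert.h>
#include <stddef.h>

struct dtimer {
	unsigned	 delta;		/* ticks after the previous timer */
	int		 queued;	/* non-zero while linked in a queue */
	int		 fired;		/* set when the timer expires */
	struct dtimer	*prev, *next;
};

struct dqueue {
	struct dtimer	*head;		/* earliest timer; its delta is from "now" */
};

/* Queue t to expire `timeout` ticks from now. */
void
dq_insert(struct dqueue *q, struct dtimer *t, unsigned timeout)
{
	struct dtimer *prev = NULL, *cur = q->head;

	assert(!t->queued);
	/* Walk forward, converting the timeout into a delta as we go. */
	while (cur != NULL && cur->delta <= timeout) {
		timeout -= cur->delta;
		prev = cur;
		cur = cur->next;
	}
	t->delta = timeout;
	t->queued = 1;
	t->fired = 0;
	t->prev = prev;
	t->next = cur;
	if (cur != NULL) {
		cur->delta -= timeout;	/* successor is now relative to t */
		cur->prev = t;
	}
	if (prev != NULL)
		prev->next = t;
	else
		q->head = t;
}

/* Cancel by reference: O(1), the successor absorbs t's delta. */
void
dq_cancel(struct dqueue *q, struct dtimer *t)
{
	if (!t->queued)
		return;
	if (t->next != NULL) {
		t->next->delta += t->delta;
		t->next->prev = t->prev;
	}
	if (t->prev != NULL)
		t->prev->next = t->next;
	else
		q->head = t->next;
	t->prev = t->next = NULL;
	t->queued = 0;
}

/* Advance time by `elapsed` ticks; expired timers come off the head. */
int
dq_advance(struct dqueue *q, unsigned elapsed)
{
	struct dtimer *t;
	int n = 0;

	while ((t = q->head) != NULL && t->delta <= elapsed) {
		elapsed -= t->delta;
		q->head = t->next;
		if (q->head != NULL)
			q->head->prev = NULL;
		t->prev = t->next = NULL;
		t->queued = 0;
		t->fired = 1;
		n++;
	}
	if (q->head != NULL)
		q->head->delta -= elapsed;
	return (n);
}
```

With this layout the head's delta is always the value to pass as the
kqueue/select/poll timeout, so the kernel still only sees one timer.
Insertion remains a linear walk in this sketch.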
Igor Sysoev http://sysoev.ru/en/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Jan 14 3:42: 0 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id DC6E637B401 for ; Tue, 14 Jan 2003 03:41:57 -0800 (PST) Received: from flavatown.mail.pas.earthlink.net (flavatown.mail.pas.earthlink.net [207.217.120.148]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4086643F79 for ; Tue, 14 Jan 2003 03:41:57 -0800 (PST) (envelope-from tlambert2@mindspring.com) Received: from stork (stork.mail.pas.earthlink.net [207.217.120.188]) by flavatown.mail.pas.earthlink.net (8.11.6+Sun/8.11.6) with ESMTP id h0EBXUV08049; Tue, 14 Jan 2003 03:33:30 -0800 (PST) Received: from pool0016.cvx40-bradley.dialup.earthlink.net ([216.244.42.16] helo=mindspring.com) by stork with asmtp (SSLv3:RC4-MD5:128) (Exim 3.33 #1) id 18YPJf-0005ZQ-00; Tue, 14 Jan 2003 03:33:12 -0800 Message-ID: <3E23F51D.4AD21462@mindspring.com> Date: Tue, 14 Jan 2003 03:31:41 -0800 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Matthew Dillon Cc: Igor Sysoev , arch@FreeBSD.ORG Subject: Re: getsysfd() patch #1 (Re: Virtual memory question) References: <200301141110.h0EBAKT0079775@apollo.backplane.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a46552e1c35a7293c201adaf08822528c6666fa475841a1c7a350badd9bab72f9c350badd9bab72f9c Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Matthew Dillon wrote: > There's really nothing wrong with a large number of kernel supported > timers. 
> One of my telemetry systems probably has a thousand
> kernel supported timers operating on a 20MHz 68000. i.e. not an
> issue if done right.

Actually, the callout queue is a significant amount of overhead,
when you move to a large number of TCP/IP connections (e.g. 50,000;
you don't even need to go to the 1.6 million I pushed it to, at
one point).

The problem is that, even with a very large callout wheel, the
number of elements in each bucket's list becomes prohibitively
large. If you start adding a bunch of other elements into buckets,
then you exacerbate the problem.

The argument against it on the basis of timers that get deleted
before they fire is bogus though (see previous argument).

> That said, a per-descriptor timer implementation would have no more or
> less overhead than the kqueue implementation. The software designer
> isn't being forced to use one descriptor based timer for each soft
> timer he wants, after all.

This is a bogus argument, too: why implement them, if you do not
expect them to be used?

The idea of timer descriptors is somewhat bogus; you aren't going
to do reads or writes on them, so getitimer is just as good as
open, for getting a handle to refer to the things. 8-).

> The kqueue timer is rather ad-hoc. It's not nearly sophisticated
> enough.

Yes. The single timer timeout facility, ala select(2), sucks...
ala select(2). It was never really intended to be a heavy-duty
implementation, though you can use it to build one, as long as
you don't care about strict intervals. Linux is better about this,
with their bogofied select(2) that does what the man page has been
threatening it might do, for a long time, which is to adjust the
remainder in the timer.

In any case, you are still skewed by the processing interval of
whatever intermediate task you happen to do as a result of an
event that causes the select/kqueue to come true instead of the
timer firing. So real interval timers are useful.
I'm just not sure *how* useful, past getitimer/setitimer. > The absolute minimum timer and timing support I throw into > my embedded systems is: > > * seconds, 1/10 seconds, realtime_seconds, ticks, fine-grained-ticks > (typically the best hardware resolution available). (I'd also > add microseconds for a UNIX implementation) On FreeBSD, this will never be better than "some time after the minimum interval has expired", because FreeBSD is not a hard RT system. Some people argue that you can't implement a hard RT system on PC hardware, but the people who argue that are mostly idiots (i.e. my granularity requirements might be 1 second, which is easy to achieve on a 2.8 GHz CPU, don't you think?). > * one-shot, periodic Natch. Supporting that's one of the problems with itimers. > * on-the-fly adjust of current count (forwards or backwards), > double-buffer (set reload value without effecting current > countdown), start, stop reset, reload-absolute, reload-relative. Don't know how atomically this can be done... > * Software interrupt (could be thought of as an upcall I suppose) > on completion, or signal, or some other completion mechanism > (kqueue). Signals are stupid, but setitimer supports them. Kqueue support for pretty much anything is trivial. But it doesn't need to be an fd for that, only a handle on which kevent's can be registered, so long as the handle is unique in the registration domain (e.g. my patches that added support for System V message queues). > When I'm talking about timers, I mean the real deal. Not the pipsqueak > little timers implemented by kqueue. You mean implemented as the timeout parameter in kqueue; there's no reason that the kqueue can't be used as the event communication mechanism for itimers (for example). I believe that supporting one-shots would be within the realm of the standard, in terms of implementation defined flags on itimers... 
-- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Jan 14 3:47:58 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D8DF037B405 for ; Tue, 14 Jan 2003 03:47:56 -0800 (PST) Received: from soulshock.mail.pas.earthlink.net (soulshock.mail.pas.earthlink.net [207.217.120.130]) by mx1.FreeBSD.org (Postfix) with ESMTP id BDE3A43F43 for ; Tue, 14 Jan 2003 03:47:55 -0800 (PST) (envelope-from tlambert2@mindspring.com) Received: from stork (stork.mail.pas.earthlink.net [207.217.120.188]) by soulshock.mail.pas.earthlink.net (8.11.6+Sun/8.11.6) with ESMTP id h0EBiWH18470 for ; Tue, 14 Jan 2003 03:44:32 -0800 (PST) Received: from pool0016.cvx40-bradley.dialup.earthlink.net ([216.244.42.16] helo=mindspring.com) by stork with asmtp (SSLv3:RC4-MD5:128) (Exim 3.33 #1) id 18YPUR-0006Ry-00; Tue, 14 Jan 2003 03:44:20 -0800 Message-ID: <3E23F7B9.3FA2B37@mindspring.com> Date: Tue, 14 Jan 2003 03:42:49 -0800 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Igor Sysoev Cc: arch@FreeBSD.ORG Subject: Re: getsysfd() patch #1 (Re: Virtual memory question) References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a449560a48fbfb5e8aeb154a94ba214e6f667c3043c0873f7e350badd9bab72f9c350badd9bab72f9c Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Igor Sysoev wrote: > I do not want to say that kernel-based timers are useless or too expensive. 
>
> I only want to say that if an application needs to set and delete timers
> too often then it's much better to set and delete them in user space
> because most of them will never fire and calling the kernel to set or
> cancel them is expensive.

I've personally implemented a lot -- scores -- of call-conversion
schedulers in my career, including timer support. The points that
Matt makes about the kqueue/select style timeout as a building
block are really very valid, if you need as close to fixed intervals
as possible, at least within the non-RT scheduling constraints
forced on you if you are using FreeBSD (the main reason I never
joined John Dyson in his "rewrite the FreeBSD kernel" crusade was
that he had no intention of dealing with RT issues, only SMP issues).

> > for timers, they work better if they are cancelled before they
> > ever fire, because cancellation is by reference, whereas firing
> > is by traversal.
>
> In the server that I'm developing I use delta values between timeouts,
> so firing is as cheap as cancellation. Firing timers are at the head
> of the queue.

This only works for fixed interval timers, or variable interval
timers with very high insertion costs -- effectively, you must
implement in terms of an absolute monotonically increasing tick
count, and then traverse the outstanding list to perform an ordered
insertion. The best case you can get is hashed vector insertion,
where you must hit the hash and then traverse to the end of the
list or the next hash bucket... effectively, a skiplist.

This works OK for timers that are inserted, and left there, but it
sucks for one-shots, it sucks for timers that are going to be
deleted (you pay your cost up front), and it sucks for repeating
timers (you have to pay the insertion cost each time). So basically,
it doesn't answer two of the three types of timers that Matt
described.

Note that just going ahead and implementing this way sucks, too.
Better to implement fixed interval event list heads that were
inserted into a btree or a TST for the first event, and then
tail-inserted for each timer request, so you can traverse from the
head and get a 100% hit rate, minus 1, for an arbitrary number of
timer events. Matt's suggested facilities have this same overhead
problem, FWIW.

> I did not argue against the previous, current or future implementation
> of timers in the kernel. I mean that if a user-level application
> needs to set thousands of timers (e.g. a web server with thousands of
> connections) it's better to manage them in user space and set
> only the single earliest timer in the kernel via kqueue/select/poll.

Batch them across kqueue calls, and amortize the costs. Pushing down
100 timers with 1 call costs only 1/100th of a system call per timer.

-- Terry

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message

From owner-freebsd-arch Tue Jan 14 6:24:53 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1163837B401 for ; Tue, 14 Jan 2003 06:24:51 -0800 (PST) Received: from web41215.mail.yahoo.com (web41215.mail.yahoo.com [66.218.93.48]) by mx1.FreeBSD.org (Postfix) with SMTP id 80ECB43F75 for ; Tue, 14 Jan 2003 06:24:50 -0800 (PST) (envelope-from gathorpe79@yahoo.com) Message-ID: <20030114142450.57943.qmail@web41215.mail.yahoo.com> Received: from [24.114.70.137] by web41215.mail.yahoo.com via HTTP; Tue, 14 Jan 2003 09:24:50 EST Date: Tue, 14 Jan 2003 09:24:50 -0500 (EST) From: Gary Thorpe Subject: Re: getsysfd() patch #1 (Re: Virtual memory question) To: Matthew Dillon , freebsd-arch@freebsd.org In-Reply-To: <200301140851.h0E8p78U078882@apollo.backplane.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe:
List-Unsubscribe: X-Loop: FreeBSD.ORG

--- Matthew Dillon wrote:
>
> :
> :Matthew Dillon wrote:
> :>
> :> This is a first-attempt workup of getsysfd(). See? I told ya it was
> :> trivial!
> :
> :[ ... ]
> :
> :What does this uniquely do, which can be done no other way, again?
> :
> :-- Terry
>
> What Peter asked for (and what this does) is give you a descriptor
> that is associated with just a VM Object. You can then mmap() the
> descriptor, pass it to other processes and they can mmap() it too.
>
> It's a way of doing swap-backed shared memory without having to deal
> with the filesystem. The only other solutions are:
>
> * SysV shared memory, which is not fun to manage and may have weird
> size limitations, and has a race condition where a process dying
> at just the wrong time may leave a shmem segment lying around in
> the kernel.

A quick note: I don't think it's a "race condition" necessarily, but it
is purposefully designed that way: processes have to remove shared
memory segments themselves and they may die before doing so. If your
program exits normally after creating a SysV shared memory segment
without removing it, it will stay around (I suppose because it is
globally accessible by processes having the right key). Perhaps a sort
of garbage collection scheme for it would be useful (i.e. if reference
count reaches zero [all the mapping processes have exited], delete it),
but then suppose you want data to persist in it? Same for SysV
semaphores and message boxes I think.

>
> * A file, which uses the filesystem as backing store. Even with
> MAP_NOSYNC a shared file still uses the filesystem as backing store.
> This is typically not what is desired.

The third solution is anonymous mappings via mmap(), but I think
that can only be shared by parent and children after fork(). Is there a
way to share this with non-related (in terms of fork() hierarchy)
processes?

>
>
> Another thing I would like to do is descriptor-based timers.
So > instead > of being limited to just the stupid itimers, or interfering with > other > threads/libraries use of [i]timers, you can simply allocate your > own by > getting a timer descriptor and then doing cool things with it, > like > having it generate a custom signal or selecting on it or > kqueue'ing on > it etc... it's something UNIX has needed for a long time > actually. > > -Matt > Matthew Dillon > > > To Unsubscribe: send mail to majordomo@FreeBSD.org > with "unsubscribe freebsd-arch" in the body of the message To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Jan 14 10:34:44 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1264D37B401 for ; Tue, 14 Jan 2003 10:34:43 -0800 (PST) Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id 919B843F65 for ; Tue, 14 Jan 2003 10:34:42 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: from apollo.backplane.com (localhost [127.0.0.1]) by apollo.backplane.com (8.12.6/8.12.6) with ESMTP id h0EIYg0i081142; Tue, 14 Jan 2003 10:34:42 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: (from dillon@localhost) by apollo.backplane.com (8.12.6/8.12.6/Submit) id h0EIYgD4081141; Tue, 14 Jan 2003 10:34:42 -0800 (PST) Date: Tue, 14 Jan 2003 10:34:42 -0800 (PST) From: Matthew Dillon Message-Id: <200301141834.h0EIYgD4081141@apollo.backplane.com> To: Gary Thorpe Cc: freebsd-arch@FreeBSD.ORG Subject: Re: getsysfd() patch #1 (Re: Virtual memory question) References: <20030114142450.57943.qmail@web41215.mail.yahoo.com> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List
Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG

:A quick note: I don't think it's a "race condition" necessarily, but it
:is purposefully designed that way: processes have to remove shared
:memory segments themselves and they may die before doing so. If your
:program exits normally after creating a SysV shared memory segment
:without removing it, it will stay around (I suppose because it is
:globally accessible by processes having the right key). Perhaps a sort
:of garbage collection scheme for it would be useful (i.e. if reference
:count reaches zero [all the mapping processes have exited], delete it),
:but then suppose you want data to persist in it? Same for SysV
:semaphores and message boxes I think.

The most common SysV shared memory operation is to create a private
shared memory segment for a program while it is running. You normally
do this by creating the segment, mapping it, and then removing the
segment. Deleting the segment does not remove the mapping, it simply
causes the kernel to physically remove the segment once the last
reference to it has gone away.

There is a race during this create/map/delete process. If the process
is killed after the create but before it manages to delete the segment,
the segment remains in the shm tables (ipcs -a).

A far better solution to this common scenario is to have a filesystem
rendezvous, like a FIFO (see mkfifo), where the kernel simply blows
everything away after the last reference has disappeared, or to have
a descriptor-based rendezvous which has the same effect when the
descriptor is closed (and all mappings have gone away) for the last
time.

:The third solution is anonymous mappings via mmap(), but I think
:that can only be shared by parent and children after fork(). Is there a
:way to share this with non-related (in terms of fork() hierarchy)
:processes?

No. Well, there might be a trick that could be played with exec(),
but basically no. Even depending on fork() is not reliable.
FreeBSD supports shared memory on fork(), but it isn't guaranteed
across all available UNIX architectures (i.e. on some UNIX systems you
can't have MAP_SHARED memory across a fork and the space becomes
copy-on-write).

-Matt
Matthew Dillon

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message

From owner-freebsd-arch Tue Jan 14 11:31:45 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6593C37B406 for ; Tue, 14 Jan 2003 11:31:44 -0800 (PST) Received: from park.rambler.ru (park.rambler.ru [81.19.64.101]) by mx1.FreeBSD.org (Postfix) with ESMTP id 926CD43E4A for ; Tue, 14 Jan 2003 11:31:42 -0800 (PST) (envelope-from is@rambler-co.ru) Received: from is.park.rambler.ru (is.park.rambler.ru [81.19.64.102]) by park.rambler.ru (8.12.6/8.12.6) with ESMTP id h0EJVZQB023811; Tue, 14 Jan 2003 22:31:35 +0300 (MSK) Date: Tue, 14 Jan 2003 22:31:34 +0300 (MSK) From: Igor Sysoev X-Sender: is@is To: Terry Lambert Cc: arch@FreeBSD.ORG Subject: Re: getsysfd() patch #1 (Re: Virtual memory question) In-Reply-To: <3E23F51D.4AD21462@mindspring.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG

On Tue, 14 Jan 2003, Terry Lambert wrote:

> Matthew Dillon wrote:
> > When I'm talking about timers, I mean the real deal. Not the pipsqueak
> > little timers implemented by kqueue.
>
> You mean implemented as the timeout parameter in kqueue; there's
> no reason that the kqueue can't be used as the event communication
> mechanism for itimers (for example). I believe that supporting
> one-shots would be within the realm of the standard, in terms of
> implementation defined flags on itimers...
I think Matt means not kqueue timeout parameter but EVFILT_TIMER filter that has millisecond resolution and allows periodic (by default) or oneshot (with EV_ONESHOT flag) timers. Igor Sysoev http://sysoev.ru/en/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Jan 14 15:17:40 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id ABBF737B401 for ; Tue, 14 Jan 2003 15:17:37 -0800 (PST) Received: from canning.wemm.org (canning.wemm.org [192.203.228.65]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0816E43F18 for ; Tue, 14 Jan 2003 15:17:37 -0800 (PST) (envelope-from peter@wemm.org) Received: from wemm.org (localhost [127.0.0.1]) by canning.wemm.org (Postfix) with ESMTP id DF9442A89E; Tue, 14 Jan 2003 15:17:36 -0800 (PST) (envelope-from peter@wemm.org) X-Mailer: exmh version 2.5 07/13/2001 with nmh-1.0.4 To: Matthew Dillon Cc: Terry Lambert , "Alan L. Cox" , arch@FreeBSD.ORG Subject: Re: getsysfd() patch #1 (Re: Virtual memory question) In-Reply-To: <200301140851.h0E8p78U078882@apollo.backplane.com> Date: Tue, 14 Jan 2003 15:17:36 -0800 From: Peter Wemm Message-Id: <20030114231736.DF9442A89E@canning.wemm.org> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Matthew Dillon wrote: > > : > :Matthew Dillon wrote: > :> > :> This is a first-attempt workup of getsysfd(). See? I told ya it was > :> trivial! > : > :[ ... ] > : > :What does this uniquely do, which can be done no other way, again? > : > :-- Terry > > What Peter asked for (and what this does) is give you a descriptor > that is associated with just a VM Object. You can then mmap() the > descriptor, pass it to other processes and they can mmap() it too. 
>
> It's a way of doing swap-backed shared memory without having to deal
> with the filesystem. The only other solutions are:
>
> * SysV shared memory, which is not fun to manage and may have weird
> size limitations, and has a race condition where a process dying
> at just the wrong time may leave a shmem segment lying around in
> the kernel.
>
> * A file, which uses the filesystem as backing store. Even with
> MAP_NOSYNC a shared file still uses the filesystem as backing store.
> This is typically not what is desired.

Also, it gives you a handle to hold data while temporarily unmapped. e.g.
you can implement a small movable mapped window into a larger object. With
MAP_ANON and /dev/zero, when you unmap the pages they are gone.

Also, we could use one of these beasties as a backing store for malloc().
Since the offset is persistent and has a sequence of page offsets it should
avoid the map fragmentation.

> Another thing I would like to do is descriptor-based timers. So instead
> of being limited to just the stupid itimers, or interfering with other
> threads/libraries use of [i]timers, you can simply allocate your own by
> getting a timer descriptor and then doing cool things with it, like
> having it generate a custom signal or selecting on it or kqueue'ing on
> it etc... it's something UNIX has needed for a long time actually.

We use kqueue timers at work FWIW. Trying to use the system timers in some
cases is painful, and the kqueue EVFILT_TIMER stuff (in both periodic and
oneshot mode) is just damn convenient. Sure, it doesn't have the bells and
whistles that you have commented out in sysfd.h, but what kqueue provides
is exactly what we need. Although the features could be added to the
already existing kqueue interface, I don't see much use for them - sys
and virtual timebases are rather specialized and somewhat overkill for
regular applications IMHO.
Cheers, -Peter -- Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com "All of this is for nothing if we don't go to the stars" - JMS/B5 To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Jan 14 19:16:21 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E648537B401 for ; Tue, 14 Jan 2003 19:16:19 -0800 (PST) Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6E64543F13 for ; Tue, 14 Jan 2003 19:16:19 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: from apollo.backplane.com (localhost [127.0.0.1]) by apollo.backplane.com (8.12.6/8.12.6) with ESMTP id h0F3GJ0i005443; Tue, 14 Jan 2003 19:16:19 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: (from dillon@localhost) by apollo.backplane.com (8.12.6/8.12.6/Submit) id h0F3GIe8005442; Tue, 14 Jan 2003 19:16:18 -0800 (PST) Date: Tue, 14 Jan 2003 19:16:18 -0800 (PST) From: Matthew Dillon Message-Id: <200301150316.h0F3GIe8005442@apollo.backplane.com> To: Peter Wemm Cc: Terry Lambert , "Alan L. Cox" , arch@FreeBSD.ORG Subject: Re: getsysfd() patch #1 (Re: Virtual memory question) References: <20030114231736.DF9442A89E@canning.wemm.org> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG :Also, it gives you a handle to hold data while temporarily unmapped. eg: :you can implement a small movable mapped window into a larger object. With :MAP_ANON and /dev/zero, when you unmap the pages they are gone. : :Also, we could use one of these beasties as a backing store for malloc(). :Since the offset is persistent and has a sequence of page offsets it should :avoid the map fragmentation. 
That is a very interesting idea. fork() would be a problem though. Still, I see a lot of possible uses for this sort of thing. Communication between kernel and userland could use a VM Object like this... consider kqueue and AIO operation that does not require copying to and from userspace. Instead you implement a message queue with a shared VM object and tell the kernel to go. This would allow a bunch of I/O and/or kqueue requests to be collected together and then initiated with a single system call, and events could be reported back on a different VM Object. (just brainstorming). :> Another thing I would like to do is descriptor-based timers. So instead :> of being limited to just the stupid itimers, or interfering with other :> threads/libraries use of [i]timers, you can simply allocate your own by :> getting a timer descriptor and then doing cool things with it, like :> having it generate a custom signal or selecting on it or kqueue'ing on :> it etc... it's something UNIX has needed for a long time actually. : :We use kqueue timers at work FWIW. Trying to use the system timers in some :cases is painful, and the kqueue EVFILT_TIMER stuff (both on periodic and :oneshot mode) is just damn convenient. Sure, it doesn't have the bells and :whistles that you have commented out in sysfd.h, but what kqueue provides :is exactly what we need. Allthough the features could be added to the :already existing kqueue interface, I dont see much use for them - sys :and virtual timebases are rather specialized and somewhat overkill for :regular applications IMHO. : :Cheers, :-Peter :-- :Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com Oh, don't get me wrong... the kqueue timers definitely fill a need. They aren't complete, though. 
-Matt Matthew Dillon To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Jan 14 19:42:36 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0D47137B401 for ; Tue, 14 Jan 2003 19:42:35 -0800 (PST) Received: from canning.wemm.org (canning.wemm.org [192.203.228.65]) by mx1.FreeBSD.org (Postfix) with ESMTP id BE93B43F43 for ; Tue, 14 Jan 2003 19:42:34 -0800 (PST) (envelope-from peter@wemm.org) Received: from wemm.org (localhost [127.0.0.1]) by canning.wemm.org (Postfix) with ESMTP id A3A742A89E; Tue, 14 Jan 2003 19:42:34 -0800 (PST) (envelope-from peter@wemm.org) X-Mailer: exmh version 2.5 07/13/2001 with nmh-1.0.4 To: Matthew Dillon Cc: Terry Lambert , "Alan L. Cox" , arch@FreeBSD.ORG Subject: Re: getsysfd() patch #1 (Re: Virtual memory question) In-Reply-To: <200301150316.h0F3GIe8005442@apollo.backplane.com> Date: Tue, 14 Jan 2003 19:42:34 -0800 From: Peter Wemm Message-Id: <20030115034234.A3A742A89E@canning.wemm.org> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Matthew Dillon wrote: > > :Also, it gives you a handle to hold data while temporarily unmapped. eg: > :you can implement a small movable mapped window into a larger object. With > :MAP_ANON and /dev/zero, when you unmap the pages they are gone. > : > :Also, we could use one of these beasties as a backing store for malloc(). > :Since the offset is persistent and has a sequence of page offsets it should > :avoid the map fragmentation. > > That is a very interesting idea. fork() would be a problem though. Yes, oops. 
:-) Cheers, -Peter -- Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com "All of this is for nothing if we don't go to the stars" - JMS/B5 To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Jan 14 20:34: 7 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6E7E437B405 for ; Tue, 14 Jan 2003 20:34:06 -0800 (PST) Received: from philotas.hosting.swbell.net (philotas.hosting.swbell.net [216.100.99.7]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4E27243EB2 for ; Tue, 14 Jan 2003 20:34:05 -0800 (PST) (envelope-from alc@imimic.com) Received: from imimic.com (adsl-216-63-78-18.dsl.hstntx.swbell.net [216.63.78.18]) by philotas.hosting.swbell.net id XAA06409; Tue, 14 Jan 2003 23:33:51 -0500 (EST) [ConcentricHost SMTP Relay 1.14] Message-ID: <3E24E4AB.5B50EBD6@imimic.com> Date: Tue, 14 Jan 2003 22:33:47 -0600 From: "Alan L. Cox" Organization: iMimic Networking, Inc. X-Mailer: Mozilla 4.8 [en] (X11; U; Linux 2.4.2 i386) X-Accept-Language: en MIME-Version: 1.0 To: Marcel Moolenaar Cc: Peter Wemm , Matthew Dillon , arch@FreeBSD.ORG Subject: Re: Virtual memory question References: <20030114041407.DDFC32A89E@canning.wemm.org> <3E239150.9FC363DD@imimic.com> <20030114100022.GA17799@dhcp01.pn.xcllnt.net> Content-Type: text/plain; charset=x-user-defined Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Marcel Moolenaar wrote: > > On Mon, Jan 13, 2003 at 10:25:52PM -0600, Alan L. Cox wrote: > > > > >From the spec (on the web page that I mentioned): > > > > "The name argument points to a string naming a shared memory object. 
It > > is unspecified whether the name appears in the file system and is > > visible to other functions that take pathnames as arguments." > > > > I don't think ftruncate() is necessary. The underlying shm object can > > be grown implicitly according to its mmap()ings. I do not, however, > > know of a way to shrink an shm object. > > shm_open("key", O_TRUNC|..., ...) ? > Yes, that is defined to reset the object to zero length. I was thinking about something that would only eliminate a portion of the object. Alan To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 15 2: 6:29 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C8BF337B401 for ; Wed, 15 Jan 2003 02:06:28 -0800 (PST) Received: from park.rambler.ru (park.rambler.ru [81.19.64.101]) by mx1.FreeBSD.org (Postfix) with ESMTP id 182CB43F13 for ; Wed, 15 Jan 2003 02:06:27 -0800 (PST) (envelope-from is@rambler-co.ru) Received: from is.park.rambler.ru (is.park.rambler.ru [81.19.64.102]) by park.rambler.ru (8.12.6/8.12.6) with ESMTP id h0FA6FQB053434; Wed, 15 Jan 2003 13:06:15 +0300 (MSK) Date: Wed, 15 Jan 2003 13:06:15 +0300 (MSK) From: Igor Sysoev X-Sender: is@is To: Matthew Dillon Cc: arch@FreeBSD.ORG Subject: Re: getsysfd() patch #1 (Re: Virtual memory question) In-Reply-To: <200301150316.h0F3GIe8005442@apollo.backplane.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Tue, 14 Jan 2003, Matthew Dillon wrote: > Communication between kernel and userland could use a VM Object like > this... consider kqueue and AIO operation that does not require copying > to and from userspace. 
Instead you implement a message queue with a > shared VM object and tell the kernel to go. This would allow a bunch of > I/O and/or kqueue requests to be collected together and then initiated > with a single system call, and events could be reported back on a > different VM Object. > > (just brainstorming). Last Linux kernels introduced epoll (small subset of kqueue with EV_CLEAR flag): http://www.xmailserver.org/linux-patches/nio-improve.html It uses mmap()ed area to get event array. Igor Sysoev http://sysoev.ru/en/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 15 3:12:32 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 10AFE37B401 for ; Wed, 15 Jan 2003 03:12:30 -0800 (PST) Received: from flood.ping.uio.no (flood.ping.uio.no [129.240.78.31]) by mx1.FreeBSD.org (Postfix) with ESMTP id D8C8B43F65 for ; Wed, 15 Jan 2003 03:12:28 -0800 (PST) (envelope-from des@ofug.org) Received: by flood.ping.uio.no (Postfix, from userid 2602) id 893BE5374; Wed, 15 Jan 2003 12:12:25 +0100 (CET) X-URL: http://www.ofug.org/~des/ X-Disclaimer: The views expressed in this message do not necessarily coincide with those of any organisation or company with which I am or have been affiliated. To: arch@freebsd.org Subject: Buglet in disklabel From: Dag-Erling Smorgrav Date: Wed, 15 Jan 2003 12:12:24 +0100 Message-ID: Lines: 45 User-Agent: Gnus/5.090007 (Oort Gnus v0.07) Emacs/21.2 (i386--freebsd) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG --=-=-= There's a bug in disklabel -e which manifests itself when you make a mistake and re-edit the label. 
Under certain circumstances, it will remember portions of an earlier version of the label, and reject a valid label because of conflicts with the earlier version. To demonstrate this problem, run 'disklabel -e' on a scratch disk and do the following: - add a 'd' partition with size * and an offset well within the size of the disk - add an 'e' partition with size * and a different offset well within the size of the disk. - save and exit; disklabel will complain: Warning, Too many '*' partitions (d and e) Warning, partition e: size 0, but offset 7843184 partitions d and e overlap! re-edit the label? [y]: - agree to re-edit the label - remove the 'e' partition - save and exit; disklabel will complain again: Warning, Too many '*' partitions (d and e) This happens because it remembers the 'e' partition from the previous round, even though you deleted it. In this case, nothing truly bad happens, but it is conceivable that it will actually write a label to disk which differs from the one you specified (if for instance you change the * in the specification for the 'd' partition to a fixed value which does not cause an overlap with the ghost 'e' partition) I believe the attached patch fixes this problem, but I'd like a review from someone more familiar with disklabel before I commit it. 
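The failure mode described above can be sketched in a few lines of C. This is only a minimal illustration under stated assumptions: `toy_label` and `toy_parse` are hypothetical names and do not come from disklabel.c, and the input format is simplified to "<letter> <size>" pairs. The point is that a parser which only writes the partitions it sees will keep a "ghost" entry from an earlier round unless the structure is cleared first.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAXPARTS 8

/* Hypothetical stand-in for struct disklabel: one size slot per partition. */
struct toy_label {
	long size[MAXPARTS];	/* 0 means "partition not present" */
};

/*
 * Parse "<letter> <size>" pairs from text into lp and return the number
 * of partitions present afterwards.  Only partitions named in the input
 * are written; `clear` selects whether lp is zeroed before parsing.
 */
static int
toy_parse(const char *text, struct toy_label *lp, int clear)
{
	char letter;
	long size;
	int used, count = 0;

	assert(text != NULL && lp != NULL);
	if (clear)
		memset(lp, 0, sizeof(*lp));	/* forget any earlier round */
	while (sscanf(text, " %c %ld%n", &letter, &size, &used) == 2) {
		if (letter >= 'a' && letter < 'a' + MAXPARTS)
			lp->size[letter - 'a'] = size;
		text += used;
	}
	for (int i = 0; i < MAXPARTS; i++)
		if (lp->size[i] != 0)
			count++;
	return (count);
}
```

Parsing a label with 'd' and 'e', then re-parsing an edited label that contains only 'd', still reports two partitions when `clear` is 0; with `clear` set to 1 the deleted 'e' partition is gone.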
DES -- Dag-Erling Smorgrav - des@ofug.org --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=disklabel.diff Index: disklabel.c =================================================================== RCS file: /home/ncvs/src/sbin/disklabel/disklabel.c,v retrieving revision 1.65 diff -u -r1.65 disklabel.c --- disklabel.c 4 Jan 2003 08:50:47 -0000 1.65 +++ disklabel.c 15 Jan 2003 10:57:56 -0000 @@ -957,6 +957,7 @@ int lineno = 0, errors = 0; int i; + bzero(lp, sizeof *lp); lp->d_bbsize = BBSIZE; /* XXX */ lp->d_sbsize = 0; /* XXX */ while (fgets(line, sizeof(line) - 1, f)) { --=-=-=-- To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 15 4:38: 0 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id F004A37B405 for ; Wed, 15 Jan 2003 04:37:57 -0800 (PST) Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id B37B543F81 for ; Wed, 15 Jan 2003 04:37:55 -0800 (PST) (envelope-from phk@freebsd.org) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.6/8.12.6) with ESMTP id h0FCbX9O014716 for ; Wed, 15 Jan 2003 13:37:33 +0100 (CET) (envelope-from phk@freebsd.org) To: arch@freebsd.org Subject: HEADSUP: DEVFS and GEOM mandatorification timeline. From: Poul-Henning Kamp Date: Wed, 15 Jan 2003 13:37:33 +0100 Message-ID: <14715.1042634253@critter.freebsd.dk> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG I think we are now at a cross-road where it makes sense to nail down the path for making DEVFS and GEOM non-optional components. The three steps in this process will be: Remove option "NODEVFS" in sys/conf and sys/*/conf. 
Remove option "NO_GEOM" in sys/conf and sys/*/conf. Remove MAKEDEV and all references to it from the tree. There are _no_ sweeps over all drivers to change APIs etc. I plan to do this when 5.0-RELEASE has been out there for some weeks and I am convinced that there are no show-stopping bugs; we are probably talking March 1st or thereabouts. This is simply a matter of drawing a line, after which DEVFS and GEOM are standard parts and driver writers no longer will have to deal with having two different "modes" for FreeBSD device management. Subsequently, I will start to harvest the benefits that have been sown with DEVFS and GEOM, and remove the unneeded "old-style" code, but this will be without any user-impact and neutral relative to the 5-stable branch. By removing these options before 5-stable, we make life easier for the device driver writers for the whole 5-stable cycle (2-3 years) because they will not have to deal with dual-mode APIs in their drivers. We are already at a point where very very few device drivers are tested with NODEVFS, we have a couple of platforms which only have support for their disk layouts in GEOM, we have I/O subsystems which do not work without DEVFS at all: Turning DEVFS or GEOM off at this point in time is not very viable. Keeping the negative options around for another entire release cycle will not gain us anything: People already do not test with those options and the code to implement them will just rot away. If we do not do this before the 5-stable branch, we will put ourselves in a situation where a lot of the changes we will be doing to central areas like the buf/VM system will not be possible to MFC from 6-current to 5-stable because of the requirement that non-devfs and non-geom operations be maintained working in that branch. This in turn means that fixing bugs in these areas in 5-stable will be uphill work for the developers involved and we will likely see a reduced branch lifetime like we saw on 3.x.
Finally, I would like to request that anybody who intend to argue that these options should survive for 5-stable try to run their system without DEVFS and GEOM for a week, just to make sure they know what they are talking about. Poul-Henning PS: I know that Bruce is against GEOM and DEVFS on principal grounds. All the central developers I have talked to agree that this is the direction we are going. Unless I am Terrybly mistaken, only the speed of adoption is up for discussion at this point, the direction is not. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 15 5:34: 6 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5FDCF37B401 for ; Wed, 15 Jan 2003 05:34:05 -0800 (PST) Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6310443F5F for ; Wed, 15 Jan 2003 05:34:04 -0800 (PST) (envelope-from bde@zeta.org.au) Received: from katana.zip.com.au (katana.zip.com.au [61.8.7.246]) by mailman.zeta.org.au (8.9.3/8.8.7) with ESMTP id AAA05496; Thu, 16 Jan 2003 00:33:56 +1100 Date: Thu, 16 Jan 2003 00:35:00 +1100 (EST) From: Bruce Evans X-X-Sender: bde@gamplex.bde.org To: Dag-Erling Smorgrav Cc: arch@FreeBSD.ORG Subject: Re: Buglet in disklabel In-Reply-To: Message-ID: <20030116003206.C23422-100000@gamplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Wed, 15 Jan 2003, Dag-Erling Smorgrav wrote: > There's a bug in 
disklabel -e which manifests itself when you make a > mistake and re-edit the label. Under certain circumstances, it will > remember portions of an earlier version of the label, and reject a > valid label because of conflicts with the earlier version. > ... > I believe the attached patch fixes this problem, but I'd like a review > from someone more familiar with disklabel before I commit it. [Patch lost to attachment.] Looks good. You could also remove the bzero() before one of the calls to getasciilabel() (the one that passes a local variable that would more obviously be unitialized if it weren't bzero()ed). Bruce To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 15 5:42:40 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E667F37B405 for ; Wed, 15 Jan 2003 05:42:36 -0800 (PST) Received: from flood.ping.uio.no (flood.ping.uio.no [129.240.78.31]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6D5B843F43 for ; Wed, 15 Jan 2003 05:42:31 -0800 (PST) (envelope-from des@ofug.org) Received: by flood.ping.uio.no (Postfix, from userid 2602) id DF4715378; Wed, 15 Jan 2003 14:42:29 +0100 (CET) X-URL: http://www.ofug.org/~des/ X-Disclaimer: The views expressed in this message do not necessarily coincide with those of any organisation or company with which I am or have been affiliated. 
To: Bruce Evans Cc: arch@FreeBSD.ORG Subject: Re: Buglet in disklabel References: <20030116003206.C23422-100000@gamplex.bde.org> From: Dag-Erling Smorgrav Date: Wed, 15 Jan 2003 14:42:29 +0100 In-Reply-To: <20030116003206.C23422-100000@gamplex.bde.org> (Bruce Evans's message of "Thu, 16 Jan 2003 00:35:00 +1100 (EST)") Message-ID: Lines: 19 User-Agent: Gnus/5.090007 (Oort Gnus v0.07) Emacs/21.2 (i386--freebsd) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Bruce Evans writes: > Looks good. You could also remove the bzero() before one of the calls > to getasciilabel() (the one that passes a local variable that would > more obviously be unitialized if it weren't bzero()ed). Like this? @@ -858,7 +858,6 @@ warnx("can't reopen %s for reading", tmpfil); break; } - bzero((char *)&label, sizeof(label)); if (getasciilabel(fp, &label)) { *lp = label; if (writelabel(f, bootarea, lp) == 0) { DES -- Dag-Erling Smorgrav - des@ofug.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 15 8:52:18 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 45C1B37B401; Wed, 15 Jan 2003 08:52:17 -0800 (PST) Received: from duke.cs.duke.edu (duke.cs.duke.edu [152.3.140.1]) by mx1.FreeBSD.org (Postfix) with ESMTP id 85D4843EB2; Wed, 15 Jan 2003 08:52:16 -0800 (PST) (envelope-from gallatin@cs.duke.edu) Received: from grasshopper.cs.duke.edu (grasshopper.cs.duke.edu [152.3.145.30]) by duke.cs.duke.edu (8.12.6/8.12.6) with ESMTP id h0FGqFro001374 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Wed, 15 Jan 2003 11:52:15 -0500 (EST) Received: (from gallatin@localhost) by grasshopper.cs.duke.edu 
(8.11.6/8.9.1) id h0FGqA875906; Wed, 15 Jan 2003 11:52:10 -0500 (EST) (envelope-from gallatin@cs.duke.edu) From: Andrew Gallatin MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <15909.37306.656490.486061@grasshopper.cs.duke.edu> Date: Wed, 15 Jan 2003 11:52:10 -0500 (EST) To: Poul-Henning Kamp Cc: arch@freebsd.org Subject: Re: HEADSUP: DEVFS and GEOM mandatorification timeline. In-Reply-To: <14715.1042634253@critter.freebsd.dk> References: <14715.1042634253@critter.freebsd.dk> X-Mailer: VM 6.75 under 21.1 (patch 12) "Channel Islands" XEmacs Lucid Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Speaking of /dev, driver writers, and API/ABI decisions to be made before the 5.0-stable brach, I've got a minor axe to grind. Factory devices. Weren't you talking about changing the driver interface in such a way as to make factory devices easier to implement on FreeBSD? I would *love* to see this in 5.0-stable so that I don't have to support the clunky old way I came up with to handle it (conjuring a vnode out of thin air..) Or am I all wet, and its easy to do now? What I'm after is passing the struct file all the way down to open,close,ioctl,mmap,etc, and having a void * field in struct file that a driver can hang a softc pointer off of. That way an application can always open /dev/foo0 and not have to hunt around in the /dev namespace, looking for an unused minor /dev/foo299. The driver just looks at the struct file pointer it gets in ioctl for example, and isn't limited to the major/minor number of the underlying dev_t. This would be a real boon to people porting linux drivers (aka, vmware). 
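The per-open state idea above can be modelled in plain userland C. This is a rough sketch under loud assumptions: `toy_file`, `toy_softc`, `f_drvdata` and the `foo_*` functions are all hypothetical names invented for illustration, and nothing here reflects the kernel's actual struct file or any existing driver entry points. It only shows the shape of the request: each open of the same node gets its own state, found through the file pointer rather than through a minor number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of an open file; NOT the kernel's struct file. */
struct toy_file {
	void *f_drvdata;	/* per-open driver state (the "void * field") */
};

/* Hypothetical per-open driver state. */
struct toy_softc {
	int unit;		/* logical instance handed to this open */
};

static int next_unit;

/* open: every open of the single node gets a fresh softc hung off fp. */
static int
foo_open(struct toy_file *fp)
{
	struct toy_softc *sc;

	assert(fp != NULL);
	sc = malloc(sizeof(*sc));
	if (sc == NULL)
		return (-1);
	sc->unit = next_unit++;
	fp->f_drvdata = sc;
	return (0);
}

/* ioctl-like entry point: the state is found via fp, not via a dev_t. */
static int
foo_getunit(struct toy_file *fp)
{
	struct toy_softc *sc;

	assert(fp != NULL && fp->f_drvdata != NULL);
	sc = fp->f_drvdata;
	return (sc->unit);
}

/* close: release the per-open state. */
static void
foo_close(struct toy_file *fp)
{
	assert(fp != NULL);
	free(fp->f_drvdata);
	fp->f_drvdata = NULL;
}
```

Two callers that each call `foo_open()` on their own `toy_file` see different units from `foo_getunit()`, which is the behaviour an application would otherwise have to get by hunting for an unused minor device.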
Drew To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 15 9: 9:23 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 50A5637B401 for ; Wed, 15 Jan 2003 09:09:22 -0800 (PST) Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id E784043F13 for ; Wed, 15 Jan 2003 09:09:20 -0800 (PST) (envelope-from phk@freebsd.org) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.6/8.12.6) with ESMTP id h0FH8v9O017321; Wed, 15 Jan 2003 18:08:57 +0100 (CET) (envelope-from phk@freebsd.org) To: Andrew Gallatin Cc: arch@freebsd.org Subject: Re: HEADSUP: DEVFS and GEOM mandatorification timeline. From: phk@freebsd.org In-Reply-To: Your message of "Wed, 15 Jan 2003 11:52:10 EST." <15909.37306.656490.486061@grasshopper.cs.duke.edu> Date: Wed, 15 Jan 2003 18:08:57 +0100 Message-ID: <17320.1042650537@critter.freebsd.dk> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG In message <15909.37306.656490.486061@grasshopper.cs.duke.edu>, Andrew Gallatin writes: > > >Speaking of /dev, driver writers, and API/ABI decisions to be made >before the 5.0-stable brach, I've got a minor axe to grind. Factory >devices. > >Weren't you talking about changing the driver interface in such a way >as to make factory devices easier to implement on FreeBSD? I would >*love* to see this in 5.0-stable so that I don't have to support the >clunky old way I came up with to handle it (conjuring a vnode out of >thin air..) Or am I all wet, and its easy to do now? There are a number of ways to do this, none easy (IMO). 
I understand what you want, but I don't think we can credibly claim to get this into any working shape for 5-stable. But I will promise you significant "action" in this area in 6-current. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 15 10: 8:20 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 7B1AE37B401; Wed, 15 Jan 2003 10:08:19 -0800 (PST) Received: from duke.cs.duke.edu (duke.cs.duke.edu [152.3.140.1]) by mx1.FreeBSD.org (Postfix) with ESMTP id D160743F5F; Wed, 15 Jan 2003 10:08:18 -0800 (PST) (envelope-from gallatin@cs.duke.edu) Received: from grasshopper.cs.duke.edu (grasshopper.cs.duke.edu [152.3.145.30]) by duke.cs.duke.edu (8.12.6/8.12.6) with ESMTP id h0FI8Iro008785 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Wed, 15 Jan 2003 13:08:18 -0500 (EST) Received: (from gallatin@localhost) by grasshopper.cs.duke.edu (8.11.6/8.9.1) id h0FI8DO75972; Wed, 15 Jan 2003 13:08:13 -0500 (EST) (envelope-from gallatin@cs.duke.edu) From: Andrew Gallatin MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <15909.41869.176059.969484@grasshopper.cs.duke.edu> Date: Wed, 15 Jan 2003 13:08:13 -0500 (EST) To: phk@freebsd.org Cc: arch@freebsd.org Subject: Re: HEADSUP: DEVFS and GEOM mandatorification timeline. 
In-Reply-To: <17320.1042650537@critter.freebsd.dk> References: <15909.37306.656490.486061@grasshopper.cs.duke.edu> <17320.1042650537@critter.freebsd.dk> X-Mailer: VM 6.75 under 21.1 (patch 12) "Channel Islands" XEmacs Lucid Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG phk@freebsd.org writes: > In message <15909.37306.656490.486061@grasshopper.cs.duke.edu>, Andrew Gallatin > writes: > > > > > >Speaking of /dev, driver writers, and API/ABI decisions to be made > >before the 5.0-stable brach, I've got a minor axe to grind. Factory > >devices. > > > >Weren't you talking about changing the driver interface in such a way > >as to make factory devices easier to implement on FreeBSD? I would > >*love* to see this in 5.0-stable so that I don't have to support the > >clunky old way I came up with to handle it (conjuring a vnode out of > >thin air..) Or am I all wet, and its easy to do now? > > There are a number of ways to do this, none easy (IMO). > > I understand what you want, but I don't think we can credibly claim > to get this into any working shape for 5-stable. I obviously don't know this code as well as you do, but I'd think that adding a 'struct file *fp' pointer to the list of args that the various vops take would be all thats needed. What am I missing? 
Thanks, Drew To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 15 10:14:27 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5EF7C37B401 for ; Wed, 15 Jan 2003 10:14:26 -0800 (PST) Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4013143F6B for ; Wed, 15 Jan 2003 10:14:23 -0800 (PST) (envelope-from phk@freebsd.org) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.6/8.12.6) with ESMTP id h0FIE09O017953; Wed, 15 Jan 2003 19:14:00 +0100 (CET) (envelope-from phk@freebsd.org) To: Andrew Gallatin Cc: arch@freebsd.org Subject: Re: HEADSUP: DEVFS and GEOM mandatorification timeline. From: phk@freebsd.org In-Reply-To: Your message of "Wed, 15 Jan 2003 13:08:13 EST." <15909.41869.176059.969484@grasshopper.cs.duke.edu> Date: Wed, 15 Jan 2003 19:14:00 +0100 Message-ID: <17952.1042654440@critter.freebsd.dk> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG In message <15909.41869.176059.969484@grasshopper.cs.duke.edu>, Andrew Gallatin writes: > > >Weren't you talking about changing the driver interface in such a way > > >as to make factory devices easier to implement on FreeBSD? I would > > >*love* to see this in 5.0-stable so that I don't have to support the > > >clunky old way I came up with to handle it (conjuring a vnode out of > > >thin air..) Or am I all wet, and its easy to do now? > > > > There are a number of ways to do this, none easy (IMO). > > > > I understand what you want, but I don't think we can credibly claim > > to get this into any working shape for 5-stable. 
> >I obviously don't know this code as well as you do, but I'd think >that adding a 'struct file *fp' pointer to the list of args >that the various vops take would be all thats needed. What am I >missing? "all the details" ? :-) There are a fair number of issues in this area that needs addressed, this is just one of them. Doing things right here will take more than a couple of months time (incl. testing), And without address to anybody in particular: it should not be done in a few hours time by some "HeldenHacker" who think this is trivial, there is a whole host of locking issues related to semi-magic devices like /dev/fd/* and similar which needs careful thought. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 15 10:43:50 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6968837B401; Wed, 15 Jan 2003 10:43:48 -0800 (PST) Received: from duke.cs.duke.edu (duke.cs.duke.edu [152.3.140.1]) by mx1.FreeBSD.org (Postfix) with ESMTP id A104143EB2; Wed, 15 Jan 2003 10:43:47 -0800 (PST) (envelope-from gallatin@cs.duke.edu) Received: from grasshopper.cs.duke.edu (grasshopper.cs.duke.edu [152.3.145.30]) by duke.cs.duke.edu (8.12.6/8.12.6) with ESMTP id h0FIhlro011285 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Wed, 15 Jan 2003 13:43:47 -0500 (EST) Received: (from gallatin@localhost) by grasshopper.cs.duke.edu (8.11.6/8.9.1) id h0FIhga75999; Wed, 15 Jan 2003 13:43:42 -0500 (EST) (envelope-from gallatin@cs.duke.edu) From: Andrew Gallatin MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: 
<15909.43997.980937.352619@grasshopper.cs.duke.edu> Date: Wed, 15 Jan 2003 13:43:41 -0500 (EST) To: phk@freebsd.org Cc: arch@freebsd.org Subject: Re: HEADSUP: DEVFS and GEOM mandatorification timeline. In-Reply-To: <17952.1042654440@critter.freebsd.dk> References: <15909.41869.176059.969484@grasshopper.cs.duke.edu> <17952.1042654440@critter.freebsd.dk> X-Mailer: VM 6.75 under 21.1 (patch 12) "Channel Islands" XEmacs Lucid Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG phk@freebsd.org writes: > In message <15909.41869.176059.969484@grasshopper.cs.duke.edu>, Andrew Gallatin > writes: > > > > >Weren't you talking about changing the driver interface in such a way > > > >as to make factory devices easier to implement on FreeBSD? I would > > > >*love* to see this in 5.0-stable so that I don't have to support the > > > >clunky old way I came up with to handle it (conjuring a vnode out of > > > >thin air..) Or am I all wet, and its easy to do now? > > > > > > There are a number of ways to do this, none easy (IMO). > > > > > > I understand what you want, but I don't think we can credibly claim > > > to get this into any working shape for 5-stable. > > > >I obviously don't know this code as well as you do, but I'd think > >that adding a 'struct file *fp' pointer to the list of args > >that the various vops take would be all thats needed. What am I > >missing? > > "all the details" ? :-) ;) > There are a fair number of issues in this area that needs addressed, > this is just one of them. Doing things right here will take more > than a couple of months time (incl. testing), > > And without address to anybody in particular: it should not be done > in a few hours time by some "HeldenHacker" who think this is trivial, > there is a whole host of locking issues related to semi-magic devices > like /dev/fd/* and similar which needs careful thought. Hmm.. 
I don't propose actually converting existing drivers to use the new interface, I agree that would be very tricky and time consuming, and would require extensive testing. What I was proposing is to just push the struct file * down to open/close/ioctl/* for new drivers, and to do the conversion of existing drivers lazily. Old drivers get to add another arg they don't use to their open/close/ioctl/* functions. (mmap is a little different -- you need to make the opaque handle into a file pointer..) Drew To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 15 10:46: 4 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 08CE437B401 for ; Wed, 15 Jan 2003 10:46:04 -0800 (PST) Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1ED3843E4A for ; Wed, 15 Jan 2003 10:46:03 -0800 (PST) (envelope-from phk@freebsd.org) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.6/8.12.6) with ESMTP id h0FIje9O018388; Wed, 15 Jan 2003 19:45:40 +0100 (CET) (envelope-from phk@freebsd.org) To: Andrew Gallatin Cc: arch@freebsd.org Subject: Re: HEADSUP: DEVFS and GEOM mandatorification timeline. From: phk@freebsd.org In-Reply-To: Your message of "Wed, 15 Jan 2003 13:43:41 EST." <15909.43997.980937.352619@grasshopper.cs.duke.edu> Date: Wed, 15 Jan 2003 19:45:40 +0100 Message-ID: <18387.1042656340@critter.freebsd.dk> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG In message <15909.43997.980937.352619@grasshopper.cs.duke.edu>, Andrew Gallatin writes: >Hmm.. 
I don't propose actually converting existing drivers to use the >new interface, I agree that would be very tricky and time consuming, >and would require extensive testing. > >What I was proposing is to just push the struct file * down to >open/close/ioctl/* for new drivers, and to do the conversion of >existing drivers lazily. Old drivers get to add another arg they >don't use to their open/close/ioctl/* functions. I'd prefer we do it once only, and do it right from the start. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 15 11:22: 3 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id DD54837B401; Wed, 15 Jan 2003 11:22:02 -0800 (PST) Received: from rwcrmhc53.attbi.com (rwcrmhc53.attbi.com [204.127.198.39]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7EDFF43F3F; Wed, 15 Jan 2003 11:22:02 -0800 (PST) (envelope-from julian@elischer.org) Received: from InterJet.elischer.org (12-232-168-4.client.attbi.com[12.232.168.4]) by rwcrmhc53.attbi.com (rwcrmhc53) with ESMTP id <2003011519220105300lg47pe>; Wed, 15 Jan 2003 19:22:02 +0000 Received: from localhost (localhost.elischer.org [127.0.0.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id LAA91575; Wed, 15 Jan 2003 11:22:01 -0800 (PST) Date: Wed, 15 Jan 2003 11:22:00 -0800 (PST) From: Julian Elischer To: Poul-Henning Kamp Cc: arch@freebsd.org Subject: Re: HEADSUP: DEVFS and GEOM mandatorification timeline. 
In-Reply-To: <14715.1042634253@critter.freebsd.dk> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Wed, 15 Jan 2003, Poul-Henning Kamp wrote: > > I think we are now at a cross-road where it makes sense to nail > down the path for making DEVFS and GEOM non-optional components. > > The three steps in this process will be: > > Remove option "NODEVFS" in sys/conf and sys/*/conf. > > Remove option "NO_GEOM" in sys/conf and sys/*/conf. > > Remove MAKEDEV and all references to it from the tree. fix ports (e.g. emulation/rtc) that don't handle devfs correctly. > > There are _no_ sweeps over all drivers to change APIs etc. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 15 11:30:51 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1818D37B401 for ; Wed, 15 Jan 2003 11:30:50 -0800 (PST) Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4355043F3F for ; Wed, 15 Jan 2003 11:30:49 -0800 (PST) (envelope-from phk@freebsd.org) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.6/8.12.6) with ESMTP id h0FJUN9O018844; Wed, 15 Jan 2003 20:30:24 +0100 (CET) (envelope-from phk@freebsd.org) To: Julian Elischer Cc: arch@freebsd.org Subject: Re: HEADSUP: DEVFS and GEOM mandatorification timeline. From: phk@freebsd.org In-Reply-To: Your message of "Wed, 15 Jan 2003 11:22:00 PST." 
Date: Wed, 15 Jan 2003 20:30:23 +0100 Message-ID: <18843.1042659023@critter.freebsd.dk> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG In message , Ju lian Elischer writes: > > >On Wed, 15 Jan 2003, Poul-Henning Kamp wrote: > >> >> I think we are now at a cross-road where it makes sense to nail >> down the path for making DEVFS and GEOM non-optional components. >> >> The three steps in this process will be: >> >> Remove option "NODEVFS" in sys/conf and sys/*/conf. >> >> Remove option "NO_GEOM" in sys/conf and sys/*/conf. >> >> Remove MAKEDEV and all references to it from the tree. > >fix ports (e.g. emulation/rtc) that don't handle devfs correctly. These ports are already broken relative to 5.0. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. 
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 15 11:48:51 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C170137B401; Wed, 15 Jan 2003 11:48:50 -0800 (PST) Received: from ebb.errno.com (ebb.errno.com [66.127.85.87]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0F1AF43F3F; Wed, 15 Jan 2003 11:48:50 -0800 (PST) (envelope-from sam@errno.com) Received: from melange (melange.errno.com [66.127.85.82]) (authenticated bits=0) by ebb.errno.com (8.12.5/8.12.1) with ESMTP id h0FJmnnN037408 (version=TLSv1/SSLv3 cipher=RC4-MD5 bits=128 verify=NO); Wed, 15 Jan 2003 11:48:49 -0800 (PST)?g (envelope-from sam@errno.com)œ X-Authentication-Warning: ebb.errno.com: Host melange.errno.com [66.127.85.82] claimed to be melange Message-ID: <19a601c2bccf$1fdf3850$5a557f42@errno.com> From: "Sam Leffler" To: , "Poul-Henning Kamp" References: <14715.1042634253@critter.freebsd.dk> Subject: Re: HEADSUP: DEVFS and GEOM mandatorification timeline. Date: Wed, 15 Jan 2003 11:48:49 -0800 Organization: Errno Consulting MIME-Version: 1.0 Content-Type: text/plain; charset="Windows-1252" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.50.4807.1700 X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4807.1700 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG > All the central developers I have talked to agree that this is the > direction we are going. Unless I am Terrybly mistaken, only the > speed of adoption is up for discussion at this point, the direction > is not. You and I talked about this briefly so I'll just voice my opinion publicly. I believe changes of this sort should wait until _after_ 5.1 is cut. 
This assumes that 5.1 is the "performance and stability" release that compels people to move production machines to a 5.x code base. If 5.1 is this kind of release then I'd want developers to focus their energy on performance and stability issues and not on changes of this sort. My concern is that yanking this code may expose problems that destabilize the system. While this certainly needs to be done I would like to see 5.1 come out quickly; so anything that might cause a slip should be considered carefully. It''s a hard call. I'm conservative when it comes to release engineering. Sam To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 15 11:56:16 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E6BD737B401 for ; Wed, 15 Jan 2003 11:56:14 -0800 (PST) Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1B76043F13 for ; Wed, 15 Jan 2003 11:56:14 -0800 (PST) (envelope-from phk@freebsd.org) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.6/8.12.6) with ESMTP id h0FJtm9O019069; Wed, 15 Jan 2003 20:55:49 +0100 (CET) (envelope-from phk@freebsd.org) To: "Sam Leffler" Cc: arch@freebsd.org Subject: Re: HEADSUP: DEVFS and GEOM mandatorification timeline. From: phk@freebsd.org In-Reply-To: Your message of "Wed, 15 Jan 2003 11:48:49 PST." 
<19a601c2bccf$1fdf3850$5a557f42@errno.com> Date: Wed, 15 Jan 2003 20:55:48 +0100 Message-ID: <19068.1042660548@critter.freebsd.dk> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG In message <19a601c2bccf$1fdf3850$5a557f42@errno.com>, "Sam Leffler" writes: >> All the central developers I have talked to agree that this is the >> direction we are going. Unless I am Terrybly mistaken, only the >> speed of adoption is up for discussion at this point, the direction >> is not. > >You and I talked about this briefly so I'll just voice my opinion publicly. I very much appreciate your input, privately as well as publicly. It was partly because of our discussion that I decided to take this public so we can get the issue resolved. >I believe changes of this sort should wait until _after_ 5.1 is cut. This >assumes that 5.1 is the "performance and stability" release that compels >people to move production machines to a 5.x code base. If 5.1 is this kind >of release then I'd want developers to focus their energy on performance and >stability issues and not on changes of this sort. My concern is that >yanking this code may expose problems that destabilize the system. While >this certainly needs to be done I would like to see 5.1 come out quickly; so >anything that might cause a slip should be considered carefully. I don't really see how this can jeopardize 5.1: All we do is remove a couple of badly supported functions at the administrative level: Not one single .c or .h file needs to be touched, only sys/conf and sys/i386/conf/LINT will be affected. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence.
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 15 12:51:46 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5D13137B401 for ; Wed, 15 Jan 2003 12:51:44 -0800 (PST) Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id D38BC43E4A for ; Wed, 15 Jan 2003 12:51:43 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: from apollo.backplane.com (localhost [127.0.0.1]) by apollo.backplane.com (8.12.6/8.12.6) with ESMTP id h0FKph0i009516; Wed, 15 Jan 2003 12:51:43 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: (from dillon@localhost) by apollo.backplane.com (8.12.6/8.12.6/Submit) id h0FKpgxA009515; Wed, 15 Jan 2003 12:51:42 -0800 (PST) Date: Wed, 15 Jan 2003 12:51:42 -0800 (PST) From: Matthew Dillon Message-Id: <200301152051.h0FKpgxA009515@apollo.backplane.com> To: Peter Wemm Cc: Terry Lambert , "Alan L. Cox" , arch@FreeBSD.ORG Subject: Re: getsysfd() patch #1 (Re: Virtual memory question) References: <20030115034234.A3A742A89E@canning.wemm.org> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG I think we've established the usefulness of a VM Object descriptor. Now what about the system call API? At the moment I have: fd = getsysfd(int type, off_t size); I'm thinking of making it a more generic filesystem-optional rendezvous. But it would not work like a FIFO. Instead you would simply create a normal file and specify it. The normal file would only be used as a placemarker for the system descriptor to allow rendezvous (which is probably how FIFO should have been done in the first place, since adding type bits to the stat/inode is not extensible). 
    int getsysfd(int type, int xfd, int64_t data)

The file would only be used as a placemarker, i.e. for SYSFD_MEMORY the storage is in the VM Object, which is NOT associated with the file in any way other than to serve as a rendezvous.

The filesystem rendezvous would be optional: you could specify -1 for xfd. If you specify xfd you would be able to close() it immediately after the getsysfd() call. Example usage:

    xfd = open("/tmp/fubar", O_CREAT, 0666);
    memfd = getsysfd(SYSFD_MEMORY, xfd, 1024*1024);

Or:

    memfd = getsysfd(SYSFD_MEMORY, -1, 1024*1024);

This covers the basic SYSFD types that I would like to implement... SYSFD_MEMORY, SYSFD_MSGQ (a properly implemented multi-target message queue, something we sorely need), and SYSFD_TIMER (sophisticated timer support).

    memfd = getsysfd(SYSFD_MEMORY, opt_fd, object_size)
    timerfd = getsysfd(SYSFD_TIMER, opt_fd, timer_type_and_initial_count)
    msgqfd = getsysfd(SYSFD_MSGQ, opt_fd, my_target_identifier)

I considered adding a path argument to getsysfd() but I think that would be overkill. We might want to add a separate flags argument, though:

    int getsysfd(int type, int xfd, int flags, int64_t data)

instead of:

    int getsysfd(int type, int xfd, int64_t data)

So, remember, this particular debate should be about the system call API to be sure it covers expected needs. Do we have enough arguments? Do we need more? Do we need less?
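For anyone who wants to try the call shape before a kernel patch exists, here is a minimal userland sketch of the four-argument form. Everything in it is an assumption for illustration: getsysfd_sketch(), sysfd_map() and the SYSFD_* values are hypothetical names, only the anonymous SYSFD_MEMORY case (xfd == -1, flags == 0) is handled, and the storage is faked with an unlinked temporary file rather than a bare VM Object, which is exactly the file backing the real proposal is meant to avoid.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical type codes; the thread does not assign values. */
#define SYSFD_MEMORY	1
#define SYSFD_MSGQ	2
#define SYSFD_TIMER	3

/*
 * Userland stand-in for the proposed getsysfd(type, xfd, flags, data).
 * Only anonymous memory objects are emulated; anything else fails with
 * EINVAL.  Returns a descriptor, or -1 with errno set.
 */
int
getsysfd_sketch(int type, int xfd, int flags, int64_t data)
{
	char path[] = "/tmp/sysfd.XXXXXX";
	int fd;

	if (type != SYSFD_MEMORY || xfd != -1 || flags != 0 || data <= 0) {
		errno = EINVAL;
		return (-1);
	}
	fd = mkstemp(path);		/* stand-in for a bare VM Object */
	if (fd == -1)
		return (-1);
	(void)unlink(path);		/* no name left in the filesystem */
	if (ftruncate(fd, (off_t)data) == -1) {
		(void)close(fd);
		return (-1);
	}
	return (fd);
}

/* Map the whole object shared and read/write; MAP_FAILED on error. */
void *
sysfd_map(int fd, size_t size)
{
	return (mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0));
}
```

A caller would do memfd = getsysfd_sketch(SYSFD_MEMORY, -1, 0, 1024*1024) followed by sysfd_map(memfd, 1024*1024). Two mappings of the same descriptor see the same pages, which is the property the rendezvous is meant to provide between processes.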
-Matt Matthew Dillon To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 15 21:51: 3 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 106A537B401; Wed, 15 Jan 2003 21:51:02 -0800 (PST) Received: from ebb.errno.com (ebb.errno.com [66.127.85.87]) by mx1.FreeBSD.org (Postfix) with ESMTP id 826F343EB2; Wed, 15 Jan 2003 21:51:01 -0800 (PST) (envelope-from sam@errno.com) Received: from melange (melange.errno.com [66.127.85.82]) (authenticated bits=0) by ebb.errno.com (8.12.5/8.12.1) with ESMTP id h0G5p0nN039974 (version=TLSv1/SSLv3 cipher=RC4-MD5 bits=128 verify=NO); Wed, 15 Jan 2003 21:51:00 -0800 (PST) (envelope-from sam@errno.com) X-Authentication-Warning: ebb.errno.com: Host melange.errno.com [66.127.85.82] claimed to be melange Message-ID: <1d8501c2bd23$3f9af4a0$5a557f42@errno.com> From: "Sam Leffler" To: Cc: References: <19068.1042660548@critter.freebsd.dk> Subject: Re: HEADSUP: DEVFS and GEOM mandatorification timeline. Date: Wed, 15 Jan 2003 21:51:00 -0800 Organization: Errno Consulting MIME-Version: 1.0 Content-Type: text/plain; charset="Windows-1252" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.50.4807.1700 X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4807.1700 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG > >I believe changes of this sort should wait until _after_ 5.1 is cut. This > >assumes that 5.1 is the "performance and stability" release that compels > >people to move production machines to a 5.x code base. If 5.1 is this kind > >of release then I'd want developers to focus their energy on performance and > >stability issues and not on changes of this sort.
My concern is that > >yanking this code may expose problems that destabilize the system. While > >this certainly needs to be done I would like to see 5.1 come out quickly; so > >anything that might cause a slip should be considered carefully. > > I don't really see how this can jeoparidize 5.1: All we do is remove > a couple of badly supported functions at the administrative level: > Not one single .c or .h file needs to be touched, only sys/conf > and sys/i386/conf/LINT will be affected. The way you described it there was more to it than just removing config glue. Why don't you create a patch for "removing each" so we can have something concrete to look at? My concern about all this is that we remove/stuff stuff only to find that the code has been in use and there's nothing to replace it. You've done this before and stated that the resulting pain is worthwhile because it improves the overall quality of the system. That may be true--and this may not happen this time--but, like I said, I think there's enough to do for 5.1 to not waste effort on something that can wait. Sam To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 15 22:36:54 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id BA0B937B401 for ; Wed, 15 Jan 2003 22:36:51 -0800 (PST) Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id E82C643F79 for ; Wed, 15 Jan 2003 22:36:50 -0800 (PST) (envelope-from phk@freebsd.org) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.6/8.12.6) with ESMTP id h0G6aO9O024900; Thu, 16 Jan 2003 07:36:25 +0100 (CET) (envelope-from phk@freebsd.org) To: "Sam Leffler" Cc: arch@freebsd.org Subject: Re: HEADSUP: DEVFS and GEOM mandatorification timeline. 
From: phk@freebsd.org In-Reply-To: Your message of "Wed, 15 Jan 2003 21:51:00 PST." <1d8501c2bd23$3f9af4a0$5a557f42@errno.com> Date: Thu, 16 Jan 2003 07:36:24 +0100 Message-ID: <24899.1042698984@critter.freebsd.dk> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG In message <1d8501c2bd23$3f9af4a0$5a557f42@errno.com>, "Sam Leffler" writes: >The way you described it there was more to it than just removing config >glue. Why don't you create a patch for "removing each" so we can have >something concrete to look at? For 5-stable/5.1-RELEASE I will be content simply to remove the config glue, since that achieves the goal of restricting the "modes" drivers have to support. For -current after the branchpoint I will be removing the code too. 5-stable relevant patch: Poul-Henning Index: alpha/conf/SIMOS =================================================================== RCS file: /home/ncvs/src/sys/alpha/conf/SIMOS,v retrieving revision 1.23 diff -u -r1.23 SIMOS --- alpha/conf/SIMOS 5 Oct 2002 16:35:21 -0000 1.23 +++ alpha/conf/SIMOS 16 Jan 2003 06:32:27 -0000 @@ -18,8 +18,6 @@ ident SIMOS maxusers 10 -options NO_GEOM - options DEC_KN8AE options SIMOS options INET #InterNETworking Index: conf/NOTES =================================================================== RCS file: /home/ncvs/src/sys/conf/NOTES,v retrieving revision 1.1119 diff -u -r1.1119 NOTES --- conf/NOTES 1 Jan 2003 18:48:48 -0000 1.1119 +++ conf/NOTES 16 Jan 2003 06:33:56 -0000 @@ -642,7 +642,6 @@ options UDF #Universal Disk Format options UMAPFS #UID map filesystem options UNIONFS #Union filesystem -# options NODEVFS #disable devices filesystem # The xFS_ROOT options REQUIRE the associated ``options xFS'' options NFS_ROOT #NFS usable as root device Index: conf/options =================================================================== RCS file: /home/ncvs/src/sys/conf/options,v
retrieving revision 1.365 diff -u -r1.365 options --- conf/options 26 Nov 2002 17:32:39 -0000 1.365 +++ conf/options 16 Jan 2003 06:32:11 -0000 @@ -87,7 +87,6 @@ DDB_UNATTENDED GDB_REMOTE_CHAT opt_ddb.h GDBSPEED opt_ddb.h -NO_GEOM opt_geom.h GEOM_AES opt_geom.h GEOM_BDE opt_geom.h GEOM_BSD opt_geom.h @@ -104,7 +103,6 @@ MD_ROOT opt_md.h MD_ROOT_SIZE opt_md.h NDGBPORTS opt_dgb.h -NODEVFS opt_devfs.h NTIMECOUNTER opt_ntp.h NSWAPDEV opt_swap.h PANIC_REBOOT_WAIT_TIME opt_panic.h Index: ia64/conf/SKI =================================================================== RCS file: /home/ncvs/src/sys/ia64/conf/SKI,v retrieving revision 1.8 diff -u -r1.8 SKI --- ia64/conf/SKI 13 Oct 2002 16:29:15 -0000 1.8 +++ ia64/conf/SKI 16 Jan 2003 06:32:42 -0000 @@ -33,8 +33,6 @@ makeoptions DEBUG=-g #Build kernel with gdb(1) debug symbols makeoptions NO_CPU_COPTFLAGS=true #Ignore any x86 CPUTYPE -options NO_GEOM - options SKI #Support for HP simulator options INET #InterNETworking #options INET6 #IPv6 communications protocols Index: powerpc/conf/GENERIC =================================================================== RCS file: /home/ncvs/src/sys/powerpc/conf/GENERIC,v retrieving revision 1.19 diff -u -r1.19 GENERIC --- powerpc/conf/GENERIC 19 Oct 2002 16:54:07 -0000 1.19 +++ powerpc/conf/GENERIC 16 Jan 2003 06:32:54 -0000 @@ -35,7 +35,6 @@ options INET #InterNETworking options INET6 #IPv6 communications protocols -options GEOM #GEOMetry subsystem options FFS #Berkeley Fast Filesystem options SOFTUPDATES #Enable FFS soft updates support options UFS_ACL #Support for access control lists -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. 
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 16 7:11:35 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id DB35F37B401 for ; Thu, 16 Jan 2003 07:11:33 -0800 (PST) Received: from fledge.watson.org (fledge.watson.org [204.156.12.50]) by mx1.FreeBSD.org (Postfix) with ESMTP id 16E1343F18 for ; Thu, 16 Jan 2003 07:11:33 -0800 (PST) (envelope-from robert@fledge.watson.org) Received: from fledge.watson.org (fledge.pr.watson.org [192.0.2.3]) by fledge.watson.org (8.12.6/8.12.5) with SMTP id h0GFBPP4080203; Thu, 16 Jan 2003 10:11:26 -0500 (EST) (envelope-from robert@fledge.watson.org) Date: Thu, 16 Jan 2003 10:11:25 -0500 (EST) From: Robert Watson X-Sender: robert@fledge.watson.org To: Matthew Dillon Cc: Peter Wemm , Terry Lambert , "Alan L. Cox" , arch@FreeBSD.ORG Subject: Re: getsysfd() patch #1 (Re: Virtual memory question) In-Reply-To: <200301152051.h0FKpgxA009515@apollo.backplane.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Wed, 15 Jan 2003, Matthew Dillon wrote: > I think we've established the usefulness of a VM Object descriptor. Now > what about the system call API? At the moment I have: Since we appear to be building a new API and IPC semantic from the ground up, I think it would be worthwhile to look at some of the other weird, wonderful, or perhaps just bizarre, things that others have done in the past. I know that NetBSD is in the throes of implementing Mach IPC, for example. Using an existing API (if we can find one that applies) would let us avoid the traditional pitfalls in IPC API design, of which there are many. 
For example, allowing us to pass VM object references using a Mach port primitive, or even ancillary data on a UNIX domain socket, would provide at least some of the features you're suggesting, but with a more familiar face. An additional advantage to looking at the Mach direction would be that we could pool resources with NetBSD to get Mach IPC up and running, and then when our PPC port is more off the ground, do Darwin emulation :-). At one point, speaking of ancillary data, I had some local hacks to allow kernel modules to register internalization and externalization calls for ancillary data transfer. I then hooked up modules that allowed processes to pass security tokens and capabilities to one another; providing similar support for passing around memory references would provide similar semantics to what you're describing, since the object could easily be an anonymous map from /dev/zero. I.e., (very, very pseudocode): fd = open(/dev/zero, O_RDWR); p = mmap(fd); uds = socket(UNIX); connect(uds, /var/run/app_rendezvous); sendmsg(uds, "memory attached", p); So I'm not saying a new API would be the wrong thing to do, I just want us to explore the options and see which has the lowest impact vs biggest bang. One concern I have with introducing entirely new primitives is how to fit them into the MAC Framework (i.e., are there new objects that require labels that didn't have labels before, how to document and instrument the important operations). Another concern is application portability -- we've actually had a lot of luck with other OS's picking up kqueue(), but IPC is likely to be more controversial, especially if it overlaps existing functionality provided by other IPC primitives. And, I'd like to avoid any further System V IPC debacles, where the semantics are such a poor match for UNIX that it's almost impossible to do useful security things with them. 
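The ancillary-data idea in the pseudocode above can be sketched with the existing SCM_RIGHTS mechanism. Note the assumption: what travels over the UNIX domain socket is the open descriptor, not the mapped pointer p, and the receiver would then mmap() the descriptor itself. The helper names send_fd() and recv_fd() are illustrative only and not part of any API.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Send the open descriptor fd over the connected UNIX domain socket
 * sock, along with one byte of ordinary data.  Returns 0 or -1.
 */
int
send_fd(int sock, int fd)
{
	struct msghdr msg;
	struct iovec iov;
	struct cmsghdr *cmsg;
	char byte = 'F';
	union {
		struct cmsghdr hdr;	/* forces cmsghdr alignment */
		char buf[CMSG_SPACE(sizeof(int))];
	} ctl;

	memset(&msg, 0, sizeof(msg));
	memset(&ctl, 0, sizeof(ctl));
	iov.iov_base = &byte;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));

	return (sendmsg(sock, &msg, 0) == 1 ? 0 : -1);
}

/*
 * Receive one descriptor sent by send_fd().  Returns the new
 * descriptor in this process, or -1 on error.
 */
int
recv_fd(int sock)
{
	struct msghdr msg;
	struct iovec iov;
	struct cmsghdr *cmsg;
	char byte;
	int fd;
	union {
		struct cmsghdr hdr;
		char buf[CMSG_SPACE(sizeof(int))];
	} ctl;

	memset(&msg, 0, sizeof(msg));
	iov.iov_base = &byte;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	if (recvmsg(sock, &msg, 0) != 1)
		return (-1);
	cmsg = CMSG_FIRSTHDR(&msg);
	if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return (-1);
	memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
	return (fd);
}
```

The received descriptor refers to the same open file as the sender's, so if it came from open("/dev/zero", O_RDWR) or a similar memory-backed object, both sides can mmap() it. Whether that gives the semantics wanted here is the open question in this thread.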
:-) Robert N M Watson FreeBSD Core Team, TrustedBSD Projects robert@fledge.watson.org Network Associates Laboratories To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 16 7:24:39 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D31C037B401; Thu, 16 Jan 2003 07:24:38 -0800 (PST) Received: from critter.freebsd.dk (esplanaden.cybercity.dk [212.242.40.114]) by mx1.FreeBSD.org (Postfix) with ESMTP id ACAE143E4A; Thu, 16 Jan 2003 07:24:37 -0800 (PST) (envelope-from phk@freebsd.org) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.6/8.12.6) with ESMTP id h0GFO04u003078; Thu, 16 Jan 2003 16:24:00 +0100 (CET) (envelope-from phk@freebsd.org) To: Robert Watson Cc: Matthew Dillon , Peter Wemm , Terry Lambert , "Alan L. Cox" , arch@freebsd.org Subject: Re: getsysfd() patch #1 (Re: Virtual memory question) From: phk@freebsd.org In-Reply-To: Your message of "Thu, 16 Jan 2003 10:11:25 EST." Date: Thu, 16 Jan 2003 16:24:00 +0100 Message-ID: <3077.1042730640@critter.freebsd.dk> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG In message , Robert Watson writes: > >On Wed, 15 Jan 2003, Matthew Dillon wrote: > >> I think we've established the usefulness of a VM Object descriptor. Now >> what about the system call API? At the moment I have: > >Since we appear to be building a new API and IPC semantic from the ground >up, I think it would be worthwhile to look at some of the other weird, >wonderful, or perhaps just bizarre, things that others have done in the >past. I would like to see it being developed, tested, benchmarked and find at least one application before we add YAIPC to FreeBSD's sources...
-- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 16 9:17:42 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id EE53737B401; Thu, 16 Jan 2003 09:17:40 -0800 (PST) Received: from ebb.errno.com (ebb.errno.com [66.127.85.87]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7F3AE43F13; Thu, 16 Jan 2003 09:17:40 -0800 (PST) (envelope-from sam@errno.com) Received: from melange (melange.errno.com [66.127.85.82]) (authenticated bits=0) by ebb.errno.com (8.12.5/8.12.1) with ESMTP id h0GHHdnN042449 (version=TLSv1/SSLv3 cipher=RC4-MD5 bits=128 verify=NO); Thu, 16 Jan 2003 09:17:40 -0800 (PST) (envelope-from sam@errno.com) X-Authentication-Warning: ebb.errno.com: Host melange.errno.com [66.127.85.82] claimed to be melange Message-ID: <1eaf01c2bd83$2c5e2cd0$5a557f42@errno.com> From: "Sam Leffler" To: Cc: References: <24899.1042698984@critter.freebsd.dk> Subject: Re: HEADSUP: DEVFS and GEOM mandatorification timeline. Date: Thu, 16 Jan 2003 09:17:39 -0800 Organization: Errno Consulting MIME-Version: 1.0 Content-Type: text/plain; charset="Windows-1252" Content-Transfer-Encoding: 7bit X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.50.4807.1700 X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4807.1700 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG > In message <1d8501c2bd23$3f9af4a0$5a557f42@errno.com>, "Sam Leffler" writes: > > >The way you described it there was more to it than just removing config > >glue.
Why don't you create a patch for "removing each" so we can have > >something concrete to look at? > > For 5-stable/5.1-RELEASE I will be content simply to remove the config > glue, since that achieves the goal of restricting the "modes" drivers > have to support. > > For -current after the branchpoint I will be removing the code too. > > 5-stable relevant patch: > <...patch deleted...> This approach is fine with me. Sam To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 16 11: 1:47 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B0B6837B401; Thu, 16 Jan 2003 11:01:45 -0800 (PST) Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3C37143EB2; Thu, 16 Jan 2003 11:01:45 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: from apollo.backplane.com (localhost [127.0.0.1]) by apollo.backplane.com (8.12.6/8.12.6) with ESMTP id h0GJ1i0i023582; Thu, 16 Jan 2003 11:01:44 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: (from dillon@localhost) by apollo.backplane.com (8.12.6/8.12.6/Submit) id h0GJ1htn023581; Thu, 16 Jan 2003 11:01:43 -0800 (PST) Date: Thu, 16 Jan 2003 11:01:43 -0800 (PST) From: Matthew Dillon Message-Id: <200301161901.h0GJ1htn023581@apollo.backplane.com> To: Robert Watson Cc: Peter Wemm , Terry Lambert , "Alan L. Cox" , arch@FreeBSD.ORG Subject: Re: getsysfd() patch #1 (Re: Virtual memory question) References: Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Hmm. Well, as an owner of one of the original NeXT boxes I am quite familiar with the Mach auxiliary data mechanism.
It has rather serious issues not the least of which being that it is difficult (or impossible) to cache mappings to make the mechanism efficient. This is because the userland and kernel do not agree on the mapping prior to the message being sent. This has not changed since the NeXT days and forcing the kernel to repeatedly reinterpret and remap userland pointers on a per message basis is a major problem. To do it right we would need a way to extend the interface to support pre-registered data areas. For example, instead of constructing a random mach message and calling mach_msg() on it what we really need is a system call to allocate a mach_msg() which can then be managed both in kernel and user space, removing the mapping overhead for the message header (mach_msg_header_t) when mach_msg() is called. Similarly, registering send and receive data areas would be needed to solve this endemic problem with Mach. This would result in an order of magnitude faster processing of the Mach message. The idea of using the mach port primitive is not a bad idea, though. Mach ports are very similar to Amiga message ports and messages and I really liked the Amiga mechanism. I think I could implement the Mach port primitives quite easily (at least the core support for it), and it would certainly apply to the SYSFD_TIMER and SYSFD_MSGQ brainstorm. I'm not sure it applies to SYSFD_MEMORY, however, because you still need a handle (file descriptor) and you still need the flexibility to mmap() ports of the VM object however and wherever you want, and the mach messaging interface is not suited to that at all. The mach messaging interface is designed for discrete data sets. So we still have the problem of allocating the file descriptor to represent the VM Object for SYSFD_MEMORY. Right offhand I do not recall a Mach equivalent for something like that. We could do it with a system call (ala getsysfd()), or we could do it with a device. 
I have to say that I don't see how using /dev/zero is any more portable than creating a new system call. Extending an existing mechanism does not in any way make an implementation more portable. In fact, in my view, extending an existing mechanism far beyond its original intention can result in more confusion and more difficulty because there may not be any clear way to determine whether the new API is actually supported or not by the target architecture. /dev/zero is seriously overused and that creates a hassle for anyone trying to use its extended mechanisms in a portable program. The Diablo news system I wrote has four different ways of managing shared memory and mapped memory due to all the weird extensions different operating systems have done with mmap() and /dev/zero, and it wasn't fun making it all work. -Matt Matthew Dillon :... :So I'm not saying a new API would be the wrong thing to do, I just want us :to explore the options and see which has the lowest impact vs biggest :bang. One concern I have with introducing entirely new primitives is how :to fit them into the MAC Framework (i.e., are there new objects that :require labels that didn't have labels before, how to document and :instrument the important operations). Another concern is application :portability -- we've actually had a lot of luck with other OS's picking up :kqueue(), but IPC is likely to be more controversial, especially if it :overlaps existing functionality provided by other IPC primitives. : :And, I'd like to avoid any further System V IPC debacles, where the :semantics are such a poor match for UNIX that it's almost impossible to do :useful security things with them.
:-) : :Robert N M Watson FreeBSD Core Team, TrustedBSD Projects :robert@fledge.watson.org Network Associates Laboratories To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 16 11:26:54 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8559E37B401 for ; Thu, 16 Jan 2003 11:26:53 -0800 (PST) Received: from fledge.watson.org (fledge.watson.org [204.156.12.50]) by mx1.FreeBSD.org (Postfix) with ESMTP id C978A43F5B for ; Thu, 16 Jan 2003 11:26:52 -0800 (PST) (envelope-from robert@fledge.watson.org) Received: from fledge.watson.org (fledge.pr.watson.org [192.0.2.3]) by fledge.watson.org (8.12.6/8.12.5) with SMTP id h0GJQkP4055278; Thu, 16 Jan 2003 14:26:46 -0500 (EST) (envelope-from robert@fledge.watson.org) Date: Thu, 16 Jan 2003 14:26:46 -0500 (EST) From: Robert Watson X-Sender: robert@fledge.watson.org To: Matthew Dillon Cc: Peter Wemm , Terry Lambert , "Alan L. Cox" , arch@FreeBSD.ORG Subject: Re: getsysfd() patch #1 (Re: Virtual memory question) In-Reply-To: <200301161901.h0GJ1htn023581@apollo.backplane.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Thu, 16 Jan 2003, Matthew Dillon wrote: > The idea of using the mach port primitive is not a bad idea, though. You might want to take a look at the NetBSD src/sys/compat/mach and src/sys/compat/darwin trees to see if anything there is useful. When I last looked closely, it was mostly prototypes, but glancing at it now it looks like they've made a lot of progress. 
This URL seems to be useful: http://www.FreeBSD.org/cgi/cvsweb.cgi/src/sys/compat/ Robert N M Watson FreeBSD Core Team, TrustedBSD Projects robert@fledge.watson.org Network Associates Laboratories To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Jan 17 17:16:32 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 24A2837B405 for ; Fri, 17 Jan 2003 17:16:31 -0800 (PST) Received: from khavrinen.lcs.mit.edu (khavrinen.lcs.mit.edu [18.24.4.193]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3AF1743E4A for ; Fri, 17 Jan 2003 17:16:30 -0800 (PST) (envelope-from wollman@khavrinen.lcs.mit.edu) Received: from khavrinen.lcs.mit.edu (localhost [IPv6:::1]) by khavrinen.lcs.mit.edu (8.12.6/8.12.6) with ESMTP id h0I1GRCJ082718 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=OK); Fri, 17 Jan 2003 20:16:28 -0500 (EST) (envelope-from wollman@khavrinen.lcs.mit.edu) Received: (from wollman@localhost) by khavrinen.lcs.mit.edu (8.12.6/8.12.6/Submit) id h0I1GRHr082717; Fri, 17 Jan 2003 20:16:27 -0500 (EST) (envelope-from wollman) Date: Fri, 17 Jan 2003 20:16:27 -0500 (EST) From: Garrett Wollman Message-Id: <200301180116.h0I1GRHr082717@khavrinen.lcs.mit.edu> To: dillon@apollo.backplane.com Subject: Re: Virtual memory question X-Newsgroups: mit.lcs.mail.freebsd-arch In-Reply-To: <200301140339.h0E3dVQa073160@apollo.backplane.com> References: <20030114002831.1C8C12A89E@canning.wemm.org> <3E2381F8.85BB90A0@imimic.com> Organization: MIT Laboratory for Computer Science Cc: arch@FreeBSD.org Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 In article <200301140339.h0E3dVQa073160@apollo.backplane.com> you write: > (2) I 
don't see how/where one specifies the size of the memory object > in shm_open(). Does this mean we have to implement ftruncate()? Yes. The Stevens book (UNP volume 2 IIRC) describes the POSIX IPC interfaces quite well. - -GAWollman -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.7 (FreeBSD) iD8DBQE+KKrpI+eG6b7tlG4RAgL2AJ9gsuo/DA3SXkA2ijrMSQWUVIYrawCffWRb QNiLUz6J2lAeoHU6Tyxb8gc= =FtSu -----END PGP SIGNATURE----- To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Jan 17 20:34:33 2003 Delivered-To: freebsd-arch@freebsd.org Received: by hub.freebsd.org (Postfix, from userid 931) id 6CE8837B401; Fri, 17 Jan 2003 20:34:27 -0800 (PST) Date: Fri, 17 Jan 2003 20:34:27 -0800 From: Juli Mallett To: arch@FreeBSD.org Cc: standards@FreeBSD.org Subject: ps(1) printing the same variable a few times. Message-ID: <20030117203427.A18843@FreeBSD.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i Organisation: The FreeBSD Project X-Alternate-Addresses: , , , , X-Towel: Yes X-LiveJournal: flata, jmallett X-Negacore: Yes Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG The following prevents non-user-added-via-[oO] options from repeating themselves. This follows both POLA and BSD ps(1) mode of GNU ps in the behaviour. I'd just like a quick once-over to assure I'm not on the crack. Thanx, juli. 
%%%
Index: extern.h
===================================================================
RCS file: /home/ncvs/src/bin/ps/extern.h,v
retrieving revision 1.29
diff -d -u -r1.29 extern.h
--- extern.h 1 Nov 2002 23:26:20 -0000 1.29
+++ extern.h 18 Jan 2003 04:31:32 -0000
@@ -52,6 +52,7 @@
 void cputime(KINFO *, VARENT *);
 int donlist(void);
 void elapsed(KINFO *, VARENT *);
+VARENT *find_varentry(VAR *);
 const char *fmt_argv(char **, char *, size_t);
 double getpcpu(const KINFO *);
 void kvar(KINFO *, VARENT *);
@@ -63,7 +64,7 @@
 void lockname(KINFO *, VARENT *);
 void mwchan(KINFO *, VARENT *);
 void pagein(KINFO *, VARENT *);
-void parsefmt(const char *);
+void parsefmt(const char *, int);
 void pcpu(KINFO *, VARENT *);
 void pmem(KINFO *, VARENT *);
 void pri(KINFO *, VARENT *);
Index: keyword.c
===================================================================
RCS file: /home/ncvs/src/bin/ps/keyword.c,v
retrieving revision 1.58
diff -d -u -r1.58 keyword.c
--- keyword.c 24 Oct 2002 00:00:57 -0000 1.58
+++ keyword.c 18 Jan 2003 04:31:32 -0000
@@ -54,7 +54,7 @@

 #include "ps.h"

-static VAR *findvar(char *);
+static VAR *findvar(char *, int);
 static int vcmp(const void *, const void *);

 /* Compute offset in common structures. */
@@ -223,7 +223,7 @@
 }

 void
-parsefmt(const char *p)
+parsefmt(const char *p, int user)
 {
 	static struct varent *vtail;
 	char *tempstr, *tempstr1;
@@ -234,7 +234,7 @@
 		char *cp;
 		VAR *v;
 		struct varent *vent;
-
+again:
 		/*
 		 * If an item contains an equals sign, it specifies a column
 		 * header, may contain embedded separator characters and
@@ -248,8 +248,18 @@
 			cp = tempstr;
 			tempstr = NULL;
 		}
-		if (cp == NULL || !(v = findvar(cp)))
+		if (cp == NULL || !(v = findvar(cp, user)))
 			continue;
+		if (!user) {
+			/*
+			 * If the user is NOT adding this field manually,
+			 * get on with our lives if this VAR is already
+			 * represented in the list.
+			 */
+			vent = find_varentry(v);
+			if (vent != NULL)
+				continue;
+		}
 		if ((vent = malloc(sizeof(struct varent))) == NULL)
 			errx(1, "malloc failed");
 		vent->var = malloc(sizeof(*vent->var));
@@ -273,7 +283,7 @@
 }

 static VAR *
-findvar(char *p)
+findvar(char *p, int user)
 {
 	VAR *v, key;
 	char *hp;
@@ -290,7 +300,7 @@
 			warnx("%s: illegal keyword specification", p);
 			eval = 1;
 		}
-		parsefmt(v->alias);
+		parsefmt(v->alias, user);
 		return ((VAR *)NULL);
 	}
 	if (!v) {
Index: ps.c
===================================================================
RCS file: /home/ncvs/src/bin/ps/ps.c,v
retrieving revision 1.59
diff -d -u -r1.59 ps.c
--- ps.c 24 Oct 2002 00:00:57 -0000 1.59
+++ ps.c 18 Jan 2003 04:31:32 -0000
@@ -193,7 +193,7 @@
 			prtheader = ws.ws_row > 5 ? ws.ws_row : 22;
 			break;
 		case 'j':
-			parsefmt(jfmt);
+			parsefmt(jfmt, 0);
 			_fmt = 1;
 			jfmt[0] = '\0';
 			break;
@@ -201,7 +201,7 @@
 			showkey();
 			exit(0);
 		case 'l':
-			parsefmt(lfmt);
+			parsefmt(lfmt, 0);
 			_fmt = 1;
 			lfmt[0] = '\0';
 			break;
@@ -217,14 +217,14 @@
 			dropgid = 1;
 			break;
 		case 'O':
-			parsefmt(o1);
-			parsefmt(optarg);
-			parsefmt(o2);
+			parsefmt(o1, 1);
+			parsefmt(optarg, 1);
+			parsefmt(o2, 1);
 			o1[0] = o2[0] = '\0';
 			_fmt = 1;
 			break;
 		case 'o':
-			parsefmt(optarg);
+			parsefmt(optarg, 1);
 			_fmt = 1;
 			break;
 #if defined(LAZY_PS)
@@ -270,13 +270,13 @@
 			xflg++; /* XXX: intuitive? */
 			break;
 		case 'u':
-			parsefmt(ufmt);
+			parsefmt(ufmt, 0);
 			sortby = SORTCPU;
 			_fmt = 1;
 			ufmt[0] = '\0';
 			break;
 		case 'v':
-			parsefmt(vfmt);
+			parsefmt(vfmt, 0);
 			sortby = SORTMEM;
 			_fmt = 1;
 			vfmt[0] = '\0';
@@ -292,7 +292,7 @@
 			xflg = 1;
 			break;
 		case 'Z':
-			parsefmt(Zfmt);
+			parsefmt(Zfmt, 0);
 			Zfmt[0] = '\0';
 			break;
 		case '?':
@@ -325,7 +325,7 @@
 		errx(1, "%s", errbuf);

 	if (!_fmt)
-		parsefmt(dfmt);
+		parsefmt(dfmt, 0);

 	/* XXX - should be cleaner */
 	if (!all && ttydev == NODEV && pid == -1 && !nuids) {
@@ -454,6 +454,18 @@
 		errx(1, "No users specified");

 	return uids;
+}
+
+VARENT *
+find_varentry(VAR *v)
+{
+	struct varent *vent;
+
+	for (vent = vhead; vent; vent = vent->next) {
+		if (strcmp(vent->var->name, v->name) == 0)
+			return vent;
+	}
+	return NULL;
 }

 static void
%%%
--
Juli Mallett
AIM: BSDFlata -- IRC: juli on EFnet.
OpenDarwin, Mono, FreeBSD Developer.
ircd-hybrid Developer, EFnet addict.
FreeBSD on MIPS-Anything on FreeBSD.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message

From owner-freebsd-arch Sat Jan 18 13: 7: 0 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 27D0537B401
	for ; Sat, 18 Jan 2003 13:06:59 -0800 (PST)
Received: from eagle.sharma-home.net (cpe-66-1-147-119.ca.sprintbbd.net [66.1.147.119])
	by mx1.FreeBSD.org (Postfix) with ESMTP id B3C3943F13
	for ; Sat, 18 Jan 2003 13:06:58 -0800 (PST)
	(envelope-from adsharma@eagle.sharma-home.net)
Received: by eagle.sharma-home.net (Postfix, from userid 500)
	id B2A5A8115; Sat, 18 Jan 2003 13:09:40 -0800 (PST)
Date: Sat, 18 Jan 2003 13:09:40 -0800
From: Arun Sharma
To: Matthew Dillon
Cc: arch@FreeBSD.ORG
Subject: Re: Virtual memory question
Message-ID: <20030118210940.GA22024@sharma-home.net>
References: <20030111224444.94D102A89E@canning.wemm.org> <200301112342.h0BNgj9a048596@apollo.backplane.com>
Mime-Version: 1.0
Content-Type: text/plain;
	charset=us-ascii
Content-Disposition: inline
In-Reply-To: <200301112342.h0BNgj9a048596@apollo.backplane.com>
User-Agent: Mutt/1.4i
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

On Sat, Jan 11, 2003 at 03:42:45PM -0800, Matthew Dillon wrote:
>
> This is basically how shared memory works, except that shared memory
> is managed outside the file descriptor framework. I would love to
> see a shared memory object that is managed inside the file
> descriptor framework, sort of like 'pipe()'. I do not see any need
> to use /dev/zero to implement the feature, though, because it will
> not improve portability.
>
> How about something like:
>
> getmemfd().

Hugetlbfs that was recently introduced in Linux uses a similar fd based
mechanism to implement shared memory.

-Arun

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message

From owner-freebsd-arch Sat Jan 18 14:14:27 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 48FEC37B401
	for ; Sat, 18 Jan 2003 14:14:26 -0800 (PST)
Received: from dragon.nuxi.com (trang.nuxi.com [66.93.134.19])
	by mx1.FreeBSD.org (Postfix) with ESMTP id B88CB43ED8
	for ; Sat, 18 Jan 2003 14:14:25 -0800 (PST)
	(envelope-from obrien@NUXI.com)
Received: from dragon.nuxi.com (obrien@localhost [127.0.0.1])
	by dragon.nuxi.com (8.12.6/8.12.2) with ESMTP id h0IMEOIx077006;
	Sat, 18 Jan 2003 14:14:24 -0800 (PST)
	(envelope-from obrien@dragon.nuxi.com)
Received: (from obrien@localhost)
	by dragon.nuxi.com (8.12.6/8.12.6/Submit) id h0IMD8nY077003;
	Sat, 18 Jan 2003 14:13:08 -0800 (PST)
Date: Sat, 18 Jan 2003 14:13:08 -0800
From: "David O'Brien"
To: Sam Leffler
Cc: arch@FreeBSD.ORG
Subject: Re: HEADSUP: DEVFS and GEOM mandatorification timeline.
Message-ID: <20030118221308.GH70151@dragon.nuxi.com>
Reply-To: arch@FreeBSD.ORG
Mail-Followup-To: David O'Brien , Sam Leffler , arch@FreeBSD.ORG
References: <14715.1042634253@critter.freebsd.dk> <19a601c2bccf$1fdf3850$5a557f42@errno.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <19a601c2bccf$1fdf3850$5a557f42@errno.com>
User-Agent: Mutt/1.4i
X-Operating-System: FreeBSD 5.0-CURRENT
Organization: The NUXI BSD Group
X-Pgp-Rsa-Fingerprint: B7 4D 3E E9 11 39 5F A3 90 76 5D 69 58 D9 98 7A
X-Pgp-Rsa-Keyid: 1024/34F9F9D5
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

On Wed, Jan 15, 2003 at 11:48:49AM -0800, Sam Leffler wrote:
> You and I talked about this briefly so I'll just voice my opinion publicly.
> I believe changes of this sort should wait until _after_ 5.1 is cut. This
> assumes that 5.1 is the "performance and stability" release that compels
> people to move production machines to a 5.x code base.

Relative to your view, where would the RELENG_5 branch (ie, 5-STABLE) be
cut?
To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message

From owner-freebsd-arch Sat Jan 18 14:28:34 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 38FA137B401
	for ; Sat, 18 Jan 2003 14:28:33 -0800 (PST)
Received: from dragon.nuxi.com (trang.nuxi.com [66.93.134.19])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 5B75443F65
	for ; Sat, 18 Jan 2003 14:28:32 -0800 (PST)
	(envelope-from obrien@NUXI.com)
Received: from dragon.nuxi.com (obrien@localhost [127.0.0.1])
	by dragon.nuxi.com (8.12.6/8.12.2) with ESMTP id h0IMSTIx077144;
	Sat, 18 Jan 2003 14:28:29 -0800 (PST)
	(envelope-from obrien@dragon.nuxi.com)
Received: (from obrien@localhost)
	by dragon.nuxi.com (8.12.6/8.12.6/Submit) id h0IMREQ8077121;
	Sat, 18 Jan 2003 14:27:14 -0800 (PST)
Date: Sat, 18 Jan 2003 14:27:13 -0800
From: "David O'Brien"
To: Sam Leffler
Cc: arch@FreeBSD.ORG
Subject: Re: HEADSUP: DEVFS and GEOM mandatorification timeline.
Message-ID: <20030118222713.GI70151@dragon.nuxi.com>
Reply-To: arch@FreeBSD.ORG
Mail-Followup-To: David O'Brien , Sam Leffler , arch@FreeBSD.ORG
References: <14715.1042634253@critter.freebsd.dk> <19a601c2bccf$1fdf3850$5a557f42@errno.com> <20030118221308.GH70151@dragon.nuxi.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20030118221308.GH70151@dragon.nuxi.com>
User-Agent: Mutt/1.4i
X-Operating-System: FreeBSD 5.0-CURRENT
Organization: The NUXI BSD Group
X-Pgp-Rsa-Fingerprint: B7 4D 3E E9 11 39 5F A3 90 76 5D 69 58 D9 98 7A
X-Pgp-Rsa-Keyid: 1024/34F9F9D5
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

On Sat, Jan 18, 2003 at 02:13:08PM -0800, David O'Brien wrote:
> On Wed, Jan 15, 2003 at 11:48:49AM -0800, Sam Leffler wrote:
> > You and I talked about this briefly so I'll just voice my opinion publicly.
> > I believe changes of this sort should wait until _after_ 5.1 is cut. This
> > assumes that 5.1 is the "performance and stability" release that compels
> > people to move production machines to a 5.x code base.
>
> Relative to your view, where would the RELENG_5 branch (ie, 5-STABLE) be
> cut?

To possibly make the conversation go faster, I'll assume the answer is
we branch RELENG_5 at 5.1-RELEASE.

Let me preface this by saying I highly value and respect your opinions.
The problem is making only the minimal change before the RELENG_5 branch
point will really make MFC'ing harder. We had a disaster with 4-CURRENT
and RELENG_3 in which we could not MFC critical kernel fixes. The
Project (as we operate) learned a hard lesson, and I would just like to
remind people of that.

I would like to see a patch from PHK that implements his preference.
Some of us could run that to gain some insight into this issue.
To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message

From owner-freebsd-arch Sat Jan 18 15: 4:48 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id BD58037B401
	for ; Sat, 18 Jan 2003 15:04:46 -0800 (PST)
Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163])
	by mx1.FreeBSD.org (Postfix) with ESMTP id E540D43F13
	for ; Sat, 18 Jan 2003 15:04:45 -0800 (PST)
	(envelope-from phk@freebsd.org)
Received: from critter.freebsd.dk (localhost [127.0.0.1])
	by critter.freebsd.dk (8.12.6/8.12.6) with ESMTP id h0IN4FK1036672;
	Sun, 19 Jan 2003 00:04:15 +0100 (CET)
	(envelope-from phk@freebsd.org)
To: arch@freebsd.org
Cc: Sam Leffler
Subject: Re: HEADSUP: DEVFS and GEOM mandatorification timeline.
From: phk@freebsd.org
In-Reply-To: Your message of "Sat, 18 Jan 2003 14:27:13 PST." <20030118222713.GI70151@dragon.nuxi.com>
Date: Sun, 19 Jan 2003 00:04:15 +0100
Message-ID: <36671.1042931055@critter.freebsd.dk>
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

In message <20030118222713.GI70151@dragon.nuxi.com>, "David O'Brien" writes:

>Let me preface this by saying I highly value and respect your opinions.
>The problem is making only the minimal change before the RELENG_5 branch
>point will really make MFC'ing harder. We had a disaster with 4-CURRENT
>and RELENG_3 in which we could not MFC critical kernel fixes. The
>Project (as we operate) learned a hard lesson, and I would just like to
>remind people of that.

I agree on this. We need to decide which ABI/API's we want to support
in that branch before we branch it.

>I would like to see a patch from PHK that implements his preference.
>Some of us could run that to gain some insight into this issue.
Well, you are probably running that already: as long as you don't have
the NODEVFS and NO_GEOM options in your kernel, you are running the code
base which I want to see in all our releases after 5.0-RELEASE,
irrespective of when the RELENG_5 branchpoint is.

The removal of the NODEVFS and NO_GEOM options means that about 700-1000
lines can be mechanically unifdef(1)'ed, but it is not important to me
to do this before the next release, because this can be painlessly
MFC'ed later.

Subsequently, after the RELENG_5 branch, the non-trivial fallout of
removing NODEVFS and NO_GEOM can be worked on: there are some
simplifications in the vnode layer and some device clone routines etc.,
but these need to be shaken out in -current before they will be MFC'ed.

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message