From owner-freebsd-arch Sun Jun 24 9:31:40 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from peace.mahoroba.org (peace.calm.imasy.or.jp [202.227.26.34])
    by hub.freebsd.org (Postfix) with ESMTP
    id B5F8737B406; Sun, 24 Jun 2001 09:31:25 -0700 (PDT)
    (envelope-from ume@mahoroba.org)
Received: from localhost (IDENT:OH2hwKyzlYgzEtWJ5iprGG7pdQxUFRRL2H3OOkMuttbmbv7BXTD1U8iQX2Qag7pz@localhost [::1])
    (authenticated as ume with CRAM-MD5)
    by peace.mahoroba.org (8.11.4/8.11.4/peace) with ESMTP/inet6
    id f5OGV9R76497; Mon, 25 Jun 2001 01:31:09 +0900 (JST)
    (envelope-from ume@mahoroba.org)
Date: Mon, 25 Jun 2001 01:31:06 +0900 (JST)
Message-Id: <20010625.013106.78752396.ume@mahoroba.org>
To: brooks@one-eyed-alien.net
Cc: hackers@FreeBSD.ORG, brian@Awfulhak.org, phk@critter.freebsd.dk,
    arch@FreeBSD.ORG
Subject: Re: cloning network interfaces
From: Hajimu UMEMOTO
In-Reply-To: <20010622125113.A30459@Odin.AC.HMC.Edu>
References: <20010611142030.A15283@Odin.AC.HMC.Edu>
    <20010613.040716.115941864.ume@mahoroba.org>
    <20010622125113.A30459@Odin.AC.HMC.Edu>
X-Mailer: xcite1.38> Mew version 1.95b119 on Emacs 20.7 / Mule 4.0 =?iso-2022-jp?B?KBskQjJWMWMbKEIp?=
X-PGP-Public-Key: http://www.imasy.org/~ume/publickey.asc
X-PGP-Fingerprint: 6B 0C 53 FC 5D D0 37 91 05 D0 B3 EF 36 9B 6A BC
X-URL: http://www.imasy.org/~ume/
X-Operating-System: FreeBSD 5.0-CURRENT
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

>>>>> On Fri, 22 Jun 2001 12:51:13 -0700
>>>>> Brooks Davis said:

brooks> Ok, after a week and a half of doing other things, I've got a patch
brooks> together which adds interface cloning based on NetBSD's code.  The
brooks> difference is that you may pass an interface of the form gif# if you
brooks> don't need a specific number.  The ioctl now returns a potentially
brooks> modified ifreq which contains the new interface name.  This changes the
brooks> way drivers implement cloning in that they may return a different unit
brooks> than they were passed and they must do their own resource management
brooks> rather than relying on the clone functionality in sys/net/if.c to do it
brooks> for them.

brooks> The patch is at:

brooks> http://people.freebsd.org/~brooks/patches/gif.diff

brooks> The patch can be applied as follows (you need to make the directories):

brooks> cd /usr/src
brooks> mkdir sys/modules/if_gif sys/modules/if_stf
brooks> patch < /tmp/gif.diff

brooks> The patch does the following:
brooks>  - adds interface cloning support to the kernel
brooks>  - adds interface cloning support to ifconfig
brooks>  - makes gif clonable
brooks>  - makes gif usable as a module
brooks>  - removes the need for NGIF and gif.h
brooks>  - removes va_args usage in in_gif_input to remove a warning
brooks>  - removes gif dependencies from stf
brooks>  - makes stf usable as a module

It seems fine to me.
I just tried it on my box.  You forgot to include the prototype change of
in_gif_input() in sys/net/if_gif.h.

BTW, why did you change gif_ioctl() to gif_ifioctl()?  gif related
modules are shared among *BSDs and maintained in the KAME CVS repository.
Could you please keep local changes as small as possible?

--
Hajimu UMEMOTO @ Internet Mutual Aid Society Yokohama, Japan
ume@mahoroba.org  ume@bisd.hitachi.co.jp  ume@{,jp.}FreeBSD.org
http://www.imasy.org/~ume/

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message

From owner-freebsd-arch Sun Jun 24 11:37:37 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from saturn.bsdhome.com (unknown [24.25.2.193])
    by hub.freebsd.org (Postfix) with ESMTP
    id 772D337B405; Sun, 24 Jun 2001 11:37:32 -0700 (PDT)
    (envelope-from bsd@bsdhome.com)
Received: from neutrino.bsdhome.com (jupiter [192.168.220.13])
    by saturn.bsdhome.com (8.11.3/8.11.3) with ESMTP id f5OIbVC06079;
    Sun, 24 Jun 2001 14:37:31 -0400 (EDT)
Received: (from bsd@localhost)
    by neutrino.bsdhome.com (8.11.4/8.11.4) id f5OIbQS42821;
    Sun, 24 Jun 2001 14:37:26 -0400 (EDT)
    (envelope-from bsd)
Date: Sun, 24 Jun 2001 14:37:26 -0400
From: Brian Dean
To: freebsd-arch@freebsd.org
Cc: freebsd-current@freebsd.org
Subject: patch for using hardware debug registers for kernel debugging
Message-ID: <20010624143726.B41098@neutrino.bsdhome.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Hi,

I've worked up a simple patch to allow the use of the hardware debug
registers within the kernel debugger.  Support is very rudimentary at
this stage: you have to set the actual register values yourself.  I've
included a handy little program to help with that, though.

This patch will allow you to set a hardware watchpoint to watch up to
16 bytes of data (up to 4 watchpoints of up to 4 bytes each) and
generate a debug trap when that data is read or written (depending on
the type of watchpoint specified, "wo" or "rw").

If you suspect a memory overwrite bug and know the address being
overwritten, using these registers can find it for you fast.  While
the watch is in effect, unlike with a software watchpoint, the CPU
runs at full speed.  This is the primary benefit of the hardware debug
support: debug sessions that would otherwise literally take days can
take only minutes.

An execution breakpoint may also be specified (type "ex"), but this is
probably only useful if you are debugging code in ROM.

Please see:

    http://people.freebsd.org/~bsd/ddb/

Please review and comment.  This support, while very low level at this
point, can be really handy.

-Brian
--
Brian Dean
bsd@FreeBSD.org
bsd@bsdhome.com

From owner-freebsd-arch Mon Jun 25 0:47:13 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from InterJet.elischer.org (c421509-a.pinol1.sfba.home.com [24.7.86.9])
    by hub.freebsd.org (Postfix) with ESMTP id 179DF37B405
    for ; Mon, 25 Jun 2001 00:47:11 -0700 (PDT)
    (envelope-from julian@elischer.org)
Received: from elischer.org (InterJet.elischer.org [192.168.1.1])
    by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id CAA78080;
    Mon, 25 Jun 2001 02:13:02 -0700 (PDT)
Message-ID: <3B36EA40.9061C6CD@elischer.org>
Date: Mon, 25 Jun 2001 00:37:36 -0700
From: Julian Elischer
X-Mailer: Mozilla 4.7 [en] (X11; U; FreeBSD 5.0-CURRENT i386)
X-Accept-Language: en, hu
MIME-Version: 1.0
To: Jason Evans
Cc: arch@freebsd.org
Subject: Re: Updated KSEs paper
References: <20010622184626.B47186@canonware.com>
Content-Type: text/plain; charset=iso-8859-2
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Jason Evans wrote:
>
> A number of people are going to discuss various subjects at a meeting
> during USENIX, one of them being KSEs.
> I won't be able to attend (a sister
> is getting married that day), but wanted to make an updated version of the
> paper available to avoid others having to fix the same design problems as
> I've already fixed.
>
> The paper still is not by any means perfect, but it addresses most of the
> issues that people brought up in previous discussions on this mailing list.
> Feedback and suggestions are welcome.
>
> http://people.freebsd.org/~jasone/refs/freebsd_kse/freebsd_kse.html
> http://people.freebsd.org/~jasone/refs/freebsd_kse.ps

I have some suggestions as to how this can be achieved efficiently,
which may require some changes to the suggested API, but I can't get
it all written down before I leave for USENIX on Wed.
I will be able to explain it on the whiteboard there and hopefully
get it written down after (or during).

>
> Jason

--
      +------------------------------------+       ______ _  __
      |   __--_|\  Julian Elischer         |       \     U \/ / hard at work in
      |  /       \ julian@elischer.org     +------>x   USA    \ a very strange
      | (   OZ    )                                \___   ___ | country !
      +- X_.---._/    presently in San Francisco       \_/   \\
            v

From owner-freebsd-arch Mon Jun 25 2: 7:14 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from InterJet.elischer.org (c421509-a.pinol1.sfba.home.com [24.7.86.9])
    by hub.freebsd.org (Postfix) with ESMTP id 9039D37B407
    for ; Mon, 25 Jun 2001 02:07:05 -0700 (PDT)
    (envelope-from julian@elischer.org)
Received: from elischer.org (InterJet.elischer.org [192.168.1.1])
    by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id DAA78351;
    Mon, 25 Jun 2001 03:36:05 -0700 (PDT)
Message-ID: <3B36FDB4.74C96ACB@elischer.org>
Date: Mon, 25 Jun 2001 02:00:36 -0700
From: Julian Elischer
X-Mailer: Mozilla 4.7 [en] (X11; U; FreeBSD 5.0-CURRENT i386)
X-Accept-Language: en, hu
MIME-Version: 1.0
To: Jason Evans
Cc: arch@freebsd.org
Subject: Re: Updated KSEs paper
References: <20010622184626.B47186@canonware.com>
Content-Type: text/plain; charset=iso-8859-2
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Jason Evans wrote:
>
> A number of people are going to discuss various subjects at a meeting
> during USENIX, one of them being KSEs.  I won't be able to attend (a sister
> is getting married that day), but wanted to make an updated version of the
> paper available to avoid others having to fix the same design problems as
> I've already fixed.
>
> The paper still is not by any means perfect, but it addresses most of the
> issues that people brought up in previous discussions on this mailing list.
> Feedback and suggestions are welcome.
>
> http://people.freebsd.org/~jasone/refs/freebsd_kse/freebsd_kse.html
> http://people.freebsd.org/~jasone/refs/freebsd_kse.ps
>
> Jason

Here are some comments on the KSE API as I see it....

ksec_new(ksec_id, cpu_id, kseg_id):
     Run a thread on this KSEC (whose ID is ksec_id), on the CPU with ID
     cpu_id, as part of the KSEG with ID kseg_id.

[julian's comment]
The UTS knows it's running. It has a pointer to a mailbox which will
magically be correct for this KSE. It knows what threads are runnable
and can just jump into one of them. If none are runnable, it needs some
type of system call that 'yields' the processor. Maybe a variant of the
usleep call?
It got here because either something blocked, or a new KSE was created.
These two cases are effectively identical: we have a place to schedule
a thread. It doesn't matter if it's new or recycled...

ksec_preempt(ksec_id, kse_state):
     The KSEC with ID ksec_id was preempted, with userland execution
     state ksec_state.

[..]
The action of pre-empting this might write the state into the userland
context storage (with a copyout()). Then the UTS wouldn't need
notification right now, just the next time it might make a difference,
i.e. when it next goes to schedule something. If we do it right, it
will find this thread on the runnable queue at that time without us
needing to do an explicit notification.

ksec_block(ksec_id):
     The KSEC with ID ksec_id has blocked in the kernel.

[..]
This is treated exactly like the first case. I don't think it needs a
separate upcall.

ksec_unblock(ksec_id, kse_state):
     The KSEC with ID ksec_id has completed in the kernel, with userland
     execution state ksec_state.

[..]
I don't think a separate upcall is always needed.
On completion, the state of the returning syscall is written into the
userland context, and the context is written to the "completed and
runnable queue". The next time the UTS runs it adds the contents of
this queue to its runnable queue, and schedules as per normal.

signal(sig_t signum):
     The process received a signal numbered signum.

[..]
We haven't decided exactly what this means. This will do as well as
anything else I've seen mentioned and better than most.

The following system calls are necessary:

void kse_init(struct kseu *context):
     Start using KSEs. context contains the necessary data for the
     kernel to make upcalls. This function appears to return every time
     an upcall is made. Initially, there is only one KSEG (ID 0), which
     has a concurrency level of 1.

[..]
Whenever a concurrency is added the caller must supply a different
stack (or dummy stack) for the system call to return on. An added
concurrency is in effect an added KSE. Each KSE needs a different
stack to upcall on (though the stack may be small as it will have
bounded use). "context" includes pointers to the mailbox that will be
used by that KSE. The multiple returns of this call will all magically
have that mailbox in their hand, so you can preload it with anything
the UTS will need on an upcall.

int kseg_create(void):
     Create a KSEG and return its KSEG ID (unique within this process),
     or -1 if there is an error (resource limit exceeded).

[..]
I see this as basically an extension of the next call. You
automatically get a KSE with that KSEG, so it does everything that
creating a new KSE does, and needs the 'context' variable that a KSE
would need.

int kseg_concurrency(int kseg_id, int adjust):
     Adjust the concurrency of the KSEG with ID kseg_id. Decrementing
     the concurrency to 0 destroys the KSEG, as soon as there are no
     more active KSECs in the KSEG. If adjust is 0, the KSEG is not
     modified, but the concurrency is still returned. This system call
     returns the KSEG's instantaneous concurrency level after adjusting
     it.

[..]
If you increase the concurrency, you have created new KSEs. They need
their own separate upcall stacks (maybe only dummy stacks, but...).
In any case you need to allocate them one by one. Just setting a
concurrency to "what it is now + 2" is not going to work because the
new KSEs don't know where to return to.

int kseg_bind(int kseg_id, int cpu_id):
     Bind the KSEG with ID kseg_id to the CPU with ID cpu_id. This
     system call returns the CPU ID that the KSEG is bound to, or -1 if
     there is an error (invalid CPU ID, or the KSEG's concurrency is
     greater than 1).

[..]
I think the KSEG can bind itself. Same for priority: no need to
specify the KSEG. It's implicit.

[..]
We also need a 'yield' version of the usleep call.
Note that a completing syscall that is already sleeping may reawaken
the yielded KSE in order to complete, after which it will upcall again
in order to let the UTS schedule the satisfied thread.

We also need a KSE_EXIT() for when we know we don't need it any more.

I also argue with the following assertion:

"Additionally, soft processor affinity for KSEs is important to
performance. KSEs are not generally bound to CPUs, so KSEs that belong
to the same KSEG can potentially compete with each other for the same
processor; soft processor affinity tends to reduce such competition,
in addition to well-known benefits of processor affinity."

I would argue that limiting (HARD LIMIT) one KSE per KSEG per
processor has no ill effects and simplifies some housekeeping. KSECs
can move between KSEs in the same KSEG in a soft-affinity manner to
achieve the same thing, and being able to guarantee that the KSEs of a
KSEG are never competing for the same processor ensures that they will
never pre-empt each other, which in turn simplifies some other locking
assumptions that must be made both in the kernel and in the UTS. (Not
proven, but my gut feeling.)

Thus on a uniprocessor, there will only ever be as many KSEs as there
are KSEGs. Since blocking syscalls return, this has no effect on the
threading picture. There are still multiple KSECs available.

In 3.6.1 you prove that we can have enough storage to store thread
state of KSECs.

I would like to suggest that it can be proven as follows:
Every user thread includes a thread control block that includes
enough storage for thread context. Since every system call is made by
a thread, and the 'context' information for the KSE on which the
syscall is being made includes a pointer to that storage, the blocked
and resuming syscalls have that storage available to store their
state. The context structures can be of a fixed known format and
include a pointer to be used in linking them together in the
'completed and runnable' queue pointed to by the KSEU structure that
is handed to the UTS by the upcall. Therefore, there is guaranteed to
be enough storage.

3.6.2 Per-upcall event ordering

Since in my scheme there is only one kind of upcall (well, I think
signals can also be made to look the same), there is no ordering
problem. All information is presented to the UTS at the same time and
it can decide which it wants to handle first.

In the section:

"3.7 Upcall parallelism
This section is not yet adequately fleshed out. Issues to consider:"
[various issues shown]

Using my scheme this is not an issue.

"What is your scheme?" I hear you ask.

Basically an implementation of the above scheme with a few twists.

1/ Starting a KSE (as above) gives it its mailbox.

2/ The KSE is only runnable on a processor on which there is no KSE
from that KSEG already running. It tries really hard not to shift
CPUs. No other KSE will be using that mailbox, thus no other processor
in that KSEG.

3/ The mailbox includes a location that the kernel will look at to
find a pointer to the (in userspace) thread context block (KSEU?).
When the UTS schedules a thread, it fills in this location. Until then
it is NULL, meaning that the UTS itself is running.
All the time the thread is running this pointer is valid, so even if
the thread is pre-empted without warning by the kernel, the pointer
can be used to store its state.

4/ When a process is blocked and an upcall happens, the kernel zeroes
out that location, and takes a copy of it in the KSEC that stores the
syscall state.

5/ When a syscall is continued, and completes, the location given
above (which was stored along with the sleeping syscall state) is used
to store the state of the returning syscall, just as if it had
returned and then done a yield(). It is then linked onto a list of
'completed syscalls' held by the kernel.

6/ When the next upcall into that KSEG is performed, it first reaps
all the completed syscall blocks, and hangs them off the mailbox for
the upcalling KSE in a known location. The UTS, when it runs from the
upcall, discovers all the completed syscalls, which to it look like a
whole list of yield()'d threads, and puts them onto its run-queue
according to the priority of each, then schedules the next highest
priority thread.

Enough for now.. more on the whiteboard at USENIX..
(What, you're not going? We'll take notes, ok?)

--
      +------------------------------------+       ______ _  __
      |   __--_|\  Julian Elischer         |       \     U \/ / hard at work in
      |  /       \ julian@elischer.org     +------>x   USA    \ a very strange
      | (   OZ    )                                \___   ___ | country !
      +- X_.---._/    presently in San Francisco       \_/   \\
            v

From owner-freebsd-arch Mon Jun 25 13:16: 2 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from odin.ac.hmc.edu (Odin.AC.HMC.Edu [134.173.32.75])
    by hub.freebsd.org (Postfix) with ESMTP
    id 9B26837B406; Mon, 25 Jun 2001 13:15:55 -0700 (PDT)
    (envelope-from brdavis@odin.ac.hmc.edu)
Received: (from brdavis@localhost)
    by odin.ac.hmc.edu (8.11.0/8.11.0) id f5PKFkH24121;
    Mon, 25 Jun 2001 13:15:46 -0700
Date: Mon, 25 Jun 2001 13:15:46 -0700
From: Brooks Davis
To: Hajimu UMEMOTO
Cc: hackers@FreeBSD.ORG, brian@Awfulhak.org, phk@critter.freebsd.dk,
    arch@FreeBSD.ORG
Subject: Re: cloning network interfaces
Message-ID: <20010625131546.C30423@Odin.AC.HMC.Edu>
References: <20010611142030.A15283@Odin.AC.HMC.Edu>
    <20010613.040716.115941864.ume@mahoroba.org>
    <20010622125113.A30459@Odin.AC.HMC.Edu>
    <20010625.013106.78752396.ume@mahoroba.org>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-md5;
    protocol="application/pgp-signature"; boundary="DIOMP1UsTsWJauNi"
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <20010625.013106.78752396.ume@mahoroba.org>; from ume@mahoroba.org on Mon, Jun 25, 2001 at 01:31:06AM +0900
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

--DIOMP1UsTsWJauNi
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Mon, Jun 25, 2001 at 01:31:06AM +0900, Hajimu UMEMOTO wrote:
> It seems fine to me.
> I just tried it on my box.  You forgot to include the prototype change of
> in_gif_input() in sys/net/if_gif.h.

It's defined in sys/netinet/in_gif.h and I forgot to include it in my
diff.  Sorry about that.

> BTW, why did you change gif_ioctl() to gif_ifioctl()?  gif related
> modules are shared among *BSDs and maintained in the KAME CVS repository.
> Could you please keep local changes as small as possible?

I had renamed it when I introduced the /dev/gif device and an ioctl
for that.  I just forgot to rename it.  Sorry about that.

-- Brooks

--=20
Any statement of the form "X is the one, true Y" is FALSE.
PGP fingerprint 655D 519C 26A7 82E7 2529  9BF0 5D8E 8BE9 F238 1AD4

--DIOMP1UsTsWJauNi
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.4 (GNU/Linux)
Comment: For info see http://www.gnupg.org

iD8DBQE7N5vxXY6L6fI4GtQRAgnCAKCaI8FDrAFA9CCRLxnD9ZvWtpTrvwCfTYVP
FJVFlR2wEwoUYvhX9oRt5I4=
=ziAJ
-----END PGP SIGNATURE-----

--DIOMP1UsTsWJauNi--

From owner-freebsd-arch Mon Jun 25 19:44:57 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from tasogare.imasy.or.jp (tasogare.imasy.or.jp [202.227.24.5])
    by hub.freebsd.org (Postfix) with ESMTP
    id 063E937B401; Mon, 25 Jun 2001 19:44:52 -0700 (PDT)
    (envelope-from iwasaki@jp.FreeBSD.org)
Received: from localhost (iwasaki.imasy.or.jp [202.227.24.92])
    by tasogare.imasy.or.jp (8.11.3+3.4W/8.11.3/tasogare) with ESMTP/inet
    id f5Q2inI59462; Tue, 26 Jun 2001 11:44:49 +0900 (JST)
    (envelope-from iwasaki@jp.FreeBSD.org)
To: arch@freebsd.org
Cc: audit@freebsd.org, athlete@kta.att.ne.jp, iwasaki@jp.freebsd.org
Subject: CFR: Crusoe LongRun Support
X-Mailer: Mew version 1.94.1 on Emacs 19.34 / Mule 2.3 (SUETSUMUHANA)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-Id: <20010626114448O.iwasaki@jp.FreeBSD.org>
Date: Tue, 26 Jun 2001 11:44:48 +0900
From: Mitsuru IWASAKI
X-Dispatcher: imput version 20000228(IM140)
Lines: 23
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Hi, I'm going to commit the patches for Transmeta Crusoe LongRun
Support.  This was originally created by HATTORI-san
http://home.att.ne.jp/delta/athlete/longrun/longrun.html
as a device driver; I then made cleanups and added sysctl interface
support, like this:

hw.crusoe.longrun: 2
hw.crusoe.frequency: 600
hw.crusoe.voltage: 1600
hw.crusoe.percentage: 100

Only hw.crusoe.longrun is changeable; valid values are 0, 1, 2 and 3.

The latest patches against sys/sys/i386/i386/identcpu.c are at
http://people.freebsd.org/~iwasaki/apm/sys-longrun-20010626.diff

I'd like to have the patches reviewed in terms of sysctl namespace,
security issues and other problems.

Thanks

From owner-freebsd-arch Tue Jun 26 3:45:15 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from whale.sunbay.crimea.ua (whale.sunbay.crimea.ua [212.110.138.65])
    by hub.freebsd.org (Postfix) with ESMTP
    id 1212037B405; Tue, 26 Jun 2001 03:45:03 -0700 (PDT)
    (envelope-from ru@whale.sunbay.crimea.ua)
Received: (from ru@localhost)
    by whale.sunbay.crimea.ua (8.11.2/8.11.2) id f5QAiuV87552;
    Tue, 26 Jun 2001 13:44:56 +0300 (EEST)
    (envelope-from ru)
Date: Tue, 26 Jun 2001 13:44:56 +0300
From: Ruslan Ermilov
To: arch@FreeBSD.org, current@FreeBSD.org
Subject: [CFR] ucred.cr_gid
Message-ID: <20010626134456.B86114@sunbay.com>
Mail-Followup-To: arch@FreeBSD.org, current@FreeBSD.org
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Could someone please take a look at it before I commit this?

----- Forwarded message from Ruslan Ermilov -----

Date: Fri, 22 Jun 2001 18:05:09 +0300
From: Ruslan Ermilov
To: arch@FreeBSD.org, current@FreeBSD.org
Subject: ucred.cr_gid
Message-ID: <20010622180509.D31008@sunbay.com>
Mail-Followup-To: arch@FreeBSD.org, current@FreeBSD.org

Hi!

The attached patch replaces ucred.cr_groups[0] with ucred.cr_gid.  This
is mostly needed for POSIX alignment.  setegid(2) etc. should not change
the supplementary groups set.

Also, the type of group.gr_gid changed to a more natural gid_t
(also as in POSIX).

getgrouplist(3)'s and initgroups(3)'s prototypes fixed.  getgrouplist(3)
has also been fixed to not duplicate the primary group, and to always
return the number of supplementary groups, even if ngroups is zero
(similar to sysctl(3)).

Assorted changes:

cmsgcred.cmcred_egid    New
kproc_info.ki_gid       New
portal_cred.pcr_gid     New
xucred.cr_gid           New

I'm not sure what to do with xucred.  Also, I'm not sure about
KINFO_PROC_SIZE on ia64 and PowerPC.  Please review.

See also ChangeLog.

Cheers,
--
Ruslan Ermilov          Oracle Developer/DBA,
ru@sunbay.com           Sunbay Software AG,
ru@FreeBSD.org          FreeBSD committer,
+380.652.512.251        Simferopol, Ukraine

http://www.FreeBSD.org  The Power To Serve
http://www.oracle.com   Enabling The Information Age

----- End forwarded message -----

From owner-freebsd-arch Tue Jun 26 8:19:19 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from fledge.watson.org (fledge.watson.org [204.156.12.50])
    by hub.freebsd.org (Postfix) with ESMTP
    id ABCF637B406; Tue, 26 Jun 2001 08:19:12 -0700 (PDT)
    (envelope-from robert@fledge.watson.org)
Received: from fledge.watson.org (robert@fledge.pr.watson.org [192.0.2.3])
    by fledge.watson.org (8.11.3/8.11.3) with SMTP id f5QFIuf99867;
    Tue, 26 Jun 2001 11:18:56 -0400 (EDT)
    (envelope-from robert@fledge.watson.org)
Date: Tue, 26 Jun 2001 11:18:56 -0400 (EDT)
From: Robert Watson
X-Sender: robert@fledge.watson.org
To: Ruslan Ermilov
Cc: arch@FreeBSD.org, current@FreeBSD.org
Subject: Re: [CFR] ucred.cr_gid
In-Reply-To: <20010626134456.B86114@sunbay.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Tue, 26 Jun 2001, Ruslan Ermilov wrote:

> Could someone please take a look at it before I commit this?

I won't get a chance to properly review this until I'm at USENIX tomorrow.
If you're willing to hold off for about a week, I'd be happy to give it a
fairly detailed review: I had some thoughts of doing this when I
originally merged ucred and pcred a few weeks ago, but decided to hold
off.

I'm generally fairly positive about this change, but would be interested
in hearing Bruce's thoughts on any compatibility issues, in particular
with respect to the behavior of userland processes with expectations
about the old behavior.  Obviously, this is a change that is very
sensitive to subtle semantic changes on calls.  On the other hand, I think
moving towards making the supplementary groups independent of the
effective gid is a good goal, as it simplifies our credential code and
improves compatibility.

> Date: Fri, 22 Jun 2001 18:05:09 +0300
> From: Ruslan Ermilov
> To: arch@FreeBSD.org, current@FreeBSD.org
> Subject: ucred.cr_gid
> Message-ID: <20010622180509.D31008@sunbay.com>
> Mail-Followup-To: arch@FreeBSD.org, current@FreeBSD.org
>
> Hi!
>
> The attached patch replaces ucred.cr_groups[0] with ucred.cr_gid.  This
> is mostly needed for POSIX alignment.  setegid(2) etc. should not change
> the supplementary groups set.
>
> Also, the type of group.gr_gid changed to a more natural gid_t
> (also as in POSIX).

Sounds good, I think this change was bandied about once before and perhaps
simply didn't get committed.

> getgrouplist(3)'s and initgroups(3)'s prototypes fixed.
> getgrouplist(3)
> has also been fixed to not duplicate the primary group, and to always
> return the number of supplementary groups, even if ngroups is zero
> (similar to sysctl(3)).

Having not looked at the patch yet, I just need to make sure I point out
the following areas that are sensitive to this type of change: linux and
other ABI emulation, where semantic mapping of this sort is already
performed, as well as userland applications managing groups.

> Assorted changes:
>
> cmsgcred.cmcred_egid    New

This is an ABI change that will break applications compiled for older
versions of FreeBSD.  Is this a change that applications can detect via
some sort of sizeof/sanity check on cmsg results?

> kproc_info.ki_gid       New
> portal_cred.pcr_gid     New
> xucred.cr_gid           New
>
> I'm not sure what to do with xucred.

Probably reflect changes made in ucred fairly closely.

I'll try to give you a detailed code review in a couple of days.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Project
robert@fledge.watson.org      NAI Labs, Safeport Network Services

From owner-freebsd-arch Wed Jun 27 0: 6:35 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from bazooka.unixfreak.org (bazooka.unixfreak.org [63.198.170.138])
    by hub.freebsd.org (Postfix) with ESMTP id 5F6DF37B405
    for ; Wed, 27 Jun 2001 00:06:29 -0700 (PDT)
    (envelope-from dima@unixfreak.org)
Received: from hornet.unixfreak.org (hornet [63.198.170.140])
    by bazooka.unixfreak.org (Postfix) with ESMTP id AB5F13E2F
    for ; Wed, 27 Jun 2001 00:06:28 -0700 (PDT)
To: arch@freebsd.org
Subject: Peer credentials on a Unix domain socket
Date: Wed, 27 Jun 2001 00:06:28 -0700
From: Dima Dorfman
Message-Id: <20010627070628.AB5F13E2F@bazooka.unixfreak.org>
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Hi folks,

Currently,
there is no reliable way for a server listening on a Unix
domain socket to find out the credentials of its peer until the peer
sends something over the socket.  Finding its credentials can be
useful if the server only wants to accept connections from certain
users.  We already have SCM_CREDS, which will send the peer's
credentials along with a message, but this is *not* sufficient as it
may be unacceptable for the server to wait until the peer sends
something; think of DoS attacks.  Timeouts don't help, either; think of
SYN flood-like attacks.

I would like to propose implementing such a facility as a socket
option, LOCAL_PEERCRED.  The payload would be an xucred structure with
the effective credentials of the connect(2) caller.  Granted these may
not be the credentials of the process using the socket (think
descriptor passing), but it doesn't matter; if a process hands a
descriptor off to something else, it should be trusting it not to
abuse it (this is a feature: think of opening a privileged port and
dropping privileges).

This has been discussed at least twice before, and nobody has a better
idea.  Again, I would like to stress the two requirements: (1) the
accept(2) caller must be able to reliably obtain the effective
credentials of the connect(2) caller, and (2) the accept(2) caller
must be able to do (1) without relying on the connect(2) caller to
send data (SCM_CREDS doesn't meet (2)).

Patch attached.

Comments?  Suggestions?

Thanks in advance,

					Dima Dorfman
					dima@unixfreak.org


Index: sys/un.h
===================================================================
RCS file: /stl/src/FreeBSD/src/sys/sys/un.h,v
retrieving revision 1.17
diff -u -r1.17 un.h
--- sys/un.h	1999/12/29 04:24:49	1.17
+++ sys/un.h	2001/06/27 06:51:18
@@ -46,12 +46,16 @@
 	char	sun_path[104];		/* path name (gag) */
 };
 
+/* Socket options. */
+#define	LOCAL_PEERCRED	0x001		/* retrieve peer credentials */
+
 #ifdef _KERNEL
 struct mbuf;
 struct socket;
 
 int	uipc_usrreq __P((struct socket *so, int req, struct mbuf *m,
 		struct mbuf *nam, struct mbuf *control));
+int	uipc_ctloutput __P((struct socket *so, struct sockopt *sopt));
 int	unp_connect2 __P((struct socket *so, struct socket *so2));
 void	unp_dispose __P((struct mbuf *m));
 int	unp_externalize __P((struct mbuf *rights));
Index: sys/unpcb.h
===================================================================
RCS file: /stl/src/FreeBSD/src/sys/sys/unpcb.h,v
retrieving revision 1.11
diff -u -r1.11 unpcb.h
--- sys/unpcb.h	2000/05/26 02:06:59	1.11
+++ sys/unpcb.h	2001/06/27 06:51:18
@@ -80,7 +80,14 @@
 	int	unp_cc;			/* copy of rcv.sb_cc */
 	int	unp_mbcnt;		/* copy of rcv.sb_mbcnt */
 	unp_gen_t unp_gencnt;		/* generation count of this instance */
+	int	unp_flags;		/* flags */
+	struct	xucred unp_peercred;	/* peer credentials, if applicable */
 };
+
+/*
+ * Flags in unp_flags.
+ */
+#define	UNP_HAVEPC	0x001		/* unp_peercred filled in? */
 
 #define	sotounpcb(so)	((struct unpcb *)((so)->so_pcb))
 
Index: kern/uipc_proto.c
===================================================================
RCS file: /stl/src/FreeBSD/src/sys/kern/uipc_proto.c,v
retrieving revision 1.21
diff -u -r1.21 uipc_proto.c
--- kern/uipc_proto.c	1999/10/11 15:19:11	1.21
+++ kern/uipc_proto.c	2001/06/27 06:51:18
@@ -51,7 +51,7 @@
 
 static struct protosw localsw[] = {
 { SOCK_STREAM,	&localdomain,	0,	PR_CONNREQUIRED|PR_WANTRCVD|PR_RIGHTS,
-  0,		0,		0,		0,
+  0,		0,		0,		&uipc_ctloutput,
   0,
   0,		0,		0,		0,
   &uipc_usrreqs
Index: kern/uipc_usrreq.c
===================================================================
RCS file: /stl/src/FreeBSD/src/sys/kern/uipc_usrreq.c,v
retrieving revision 1.66
diff -u -r1.66 uipc_usrreq.c
--- kern/uipc_usrreq.c	2001/05/25 16:59:07	1.66
+++ kern/uipc_usrreq.c	2001/06/27 06:51:18
@@ -434,6 +434,23 @@
 	uipc_send, uipc_sense, uipc_shutdown, uipc_sockaddr,
 	sosend, soreceive, sopoll
 };
+
+int
+uipc_ctloutput(so, sopt)
+	struct socket *so;
+	struct sockopt *sopt;
+{
+	struct unpcb *unp = sotounpcb(so);
+	int error;
+
+	if (sopt->sopt_dir == SOPT_GET && sopt->sopt_name == LOCAL_PEERCRED &&
+	    unp->unp_flags & UNP_HAVEPC)
+		error = sooptcopyout(sopt, &unp->unp_peercred,
+		    sizeof(unp->unp_peercred));
+	else
+		error = EOPNOTSUPP;
+	return (error);
+}
 
 /*
  * Both send and receive buffers are allocated PIPSIZ bytes of buffering
@@ -654,6 +671,12 @@
 				unp3->unp_addr = (struct sockaddr_un *)
 					dup_sockaddr((struct sockaddr *)
 						     unp2->unp_addr, 1);
+			bzero(&unp3->unp_peercred, sizeof(unp3->unp_peercred));
+			unp3->unp_peercred.cr_uid = p->p_ucred->cr_uid;
+			unp3->unp_peercred.cr_ngroups = p->p_ucred->cr_ngroups;
+			bcopy(p->p_ucred->cr_groups, unp3->unp_peercred.cr_groups,
+			    sizeof(unp3->unp_peercred.cr_groups));
+			unp3->unp_flags |= UNP_HAVEPC;
 			so2 = so3;
 		}
 		error = unp_connect2(so, so2);

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message

From owner-freebsd-arch Wed Jun 27 0: 9:32 2001
Delivered-To: freebsd-arch@freebsd.org Received: from whale.sunbay.crimea.ua (whale.sunbay.crimea.ua [212.110.138.65]) by hub.freebsd.org (Postfix) with ESMTP id DA7A937B406; Wed, 27 Jun 2001 00:09:15 -0700 (PDT) (envelope-from ru@whale.sunbay.crimea.ua) Received: (from ru@localhost) by whale.sunbay.crimea.ua (8.11.2/8.11.2) id f5R795J04362; Wed, 27 Jun 2001 10:09:05 +0300 (EEST) (envelope-from ru) Date: Wed, 27 Jun 2001 10:09:05 +0300 From: Ruslan Ermilov To: Robert Watson Cc: arch@FreeBSD.org, current@FreeBSD.org Subject: Re: [CFR] ucred.cr_gid Message-ID: <20010627100905.A2097@sunbay.com> Mail-Followup-To: Robert Watson , arch@FreeBSD.org, current@FreeBSD.org References: <20010626134456.B86114@sunbay.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: ; from rwatson@FreeBSD.org on Tue, Jun 26, 2001 at 11:18:56AM -0400 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Tue, Jun 26, 2001 at 11:18:56AM -0400, Robert Watson wrote: > > On Tue, 26 Jun 2001, Ruslan Ermilov wrote: > > > Could someone please take a look at it before I commit this? > > I won't get a chance to properly review this until I'm at USENIX tomorrow. > If you're willing to hold off for about a week, I'd be happy to give it a > fairly detailed review: I had some thoughts of doing this when I > originally merged ucred and pcred a few weeks ago, but decided to hold > off. I'm generally fairly positive about this change, but would be > interested in hearing Bruce's thoughts on any compatibility issues, in > particular, with respects to the behavior of userland processes with > expectations about the old behavior. 
Obviously, this is a change that is > very sensitive to subtle semantic changes on calls--on the other hand, I > think moving towards making the supplementary groups being independent > from the effect gid is a good goal, as it simplifies our credential code, > and improves compatibility. > At least one compatibility issue here is that it's no longer possible to use initgroups(3) to set the effective group ID. > > Date: Fri, 22 Jun 2001 18:05:09 +0300 > > From: Ruslan Ermilov > > To: arch@FreeBSD.org, current@FreeBSD.org > > Subject: ucred.cr_gid > > Message-ID: <20010622180509.D31008@sunbay.com> > > Mail-Followup-To: arch@FreeBSD.org, current@FreeBSD.org > > > > Hi! > > > > The attached patch replaces ucred.cr_groups[0] with ucred.cr_gid. This > > is mostly needed for POSIX alignment. setegid(2) etc. should not change > > supplementary groups set. > > > > Also, type of 's group.gr_gid changed to a more natural gid_t > > (also as in POSIX). > > Sounds good, I think this change was bandied about once before and perhaps > simply didn't get committed. > Some of the assorted changes were committed as part of Hesiod import from NetBSD. > > getgrouplist(3)'s and initgroups(3)'s prototypes fixed. getgrouplist(3) > > has been also fixed to not duplicate the primary group, and always > > return number of suplementary groups, even if ngroups is zero (similar > > to sysctl(3)). > > Having not looked at the patch yet, just need to make sure I point out the > following areas that are sensitive to this type of change: linux and other > ABI emulation, where semantic mapping of this sort is already performed, > as well as userland applications managing groups. > I think my patch handles these. > > Assorted changes: > > > > cmsgcred.cmcred_egid New > > This is an ABI change that will break applications compiled for older > versions of FreeBSD. Is this a change that applications can detect via > some sort of sizeof/sanity check on cmsg results? 
> I can't see how this would break old applications. > > kproc_info.ki_gid New > > portal_cred.pcr_gid New > > xucred.cr_gid New > > > > I'm not sure what to do with xucred. > > Probably reflect changes made in ucred fairly closely. > I mean, I'm not sure if we should preserve the 4.2's size of this structure or no, and if so, how to actually do it. Theoretically, this could be done by placing cr_gid in a union with _cr_unused1 and #define that untangles the fact that cr_gid is in a union, but that define would have to be ``#define cr_gid ...'' which is too bad. > I'll try to give you a detailed code review in a couple of days. > Thanks! Cheers, -- Ruslan Ermilov Oracle Developer/DBA, ru@sunbay.com Sunbay Software AG, ru@FreeBSD.org FreeBSD committer, +380.652.512.251 Simferopol, Ukraine http://www.FreeBSD.org The Power To Serve http://www.oracle.com Enabling The Information Age To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jun 27 1:42:59 2001 Delivered-To: freebsd-arch@freebsd.org Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16]) by hub.freebsd.org (Postfix) with ESMTP id C04C637B401; Wed, 27 Jun 2001 01:42:52 -0700 (PDT) (envelope-from bde@zeta.org.au) Received: from bde.zeta.org.au (bde.zeta.org.au [203.2.228.102]) by mailman.zeta.org.au (8.9.3/8.8.7) with ESMTP id SAA20407; Wed, 27 Jun 2001 18:42:48 +1000 Date: Wed, 27 Jun 2001 18:40:53 +1000 (EST) From: Bruce Evans X-Sender: bde@besplex.bde.org To: Ruslan Ermilov Cc: Robert Watson , arch@FreeBSD.ORG, current@FreeBSD.ORG Subject: Re: [CFR] ucred.cr_gid In-Reply-To: <20010627100905.A2097@sunbay.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Wed, 27 Jun 2001, Ruslan Ermilov wrote: > On Tue, 
Jun 26, 2001 at 11:18:56AM -0400, Robert Watson wrote:
> > ...
> > off.  I'm generally fairly positive about this change, but would be
> > interested in hearing Bruce's thoughts on any compatibility issues, in
> > particular, with respects to the behavior of userland processes with
> > expectations about the old behavior.  Obviously, this is a change that is

Me too :-).  I don't know much about this except that it is related to
longstanding bugs in gid management.

> At least one compatibility issue here is that it's no longer possible
> to use initgroups(3) to set the effective group ID.

I think this shows that keeping the egid in group lists is intentional.

The only bug in the current implementation seems to be that NGROUPS_MAX
is 1 too small.  The first gid in group lists is conventionally always
the egid, but there must be space for NGROUPS_MAX "supplementary" groups,
so statically allocated group lists must have size NGROUPS_MAX+1, but
they currently (all?) have size NGROUPS_MAX.  POSIX.1-200x documents this
for getgroups(2) -- returning the egid is optional, and getgroups() may
return {NGROUPS_MAX}+1 entries.

I think the semantics of getgroups(), setgroups() and initgroups()
shouldn't be changed.

To set a really supplemental gid (one not affected by setegid()),
setgroups() must put the gid in the list after the first entry
even if it is the egid.

In the kernel, the problem is not really changed by keeping the egid
in a separate variable.  I currently slightly prefer keeping it in
group lists.

Binary compatibility could be preserved by hacking NGROUPS_MAX to
NGROUPS_MAX - 1 (ugh).  I don't see how to preserve source level
compatibility.  You have to change either the semantics by not
putting the egid in group lists, or NGROUPS_MAX to NGROUPS_MAX+1
in many places.  Portable applications need the latter change anyway.

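For what "the latter change" looks like in a portable application, here is a
minimal sketch, assuming only the POSIX getgroups(2) and getegid(2)
interfaces.  The buffer has NGROUPS_MAX + 1 slots so it fits whether or not
the system returns the egid in the list, and the membership test checks the
egid separately so it does not depend on that choice.  The helper names
gid_in_list() and process_has_gid() and the fallback value for NGROUPS_MAX
are made up for illustration.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <limits.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef NGROUPS_MAX
#define NGROUPS_MAX 65536	/* assumption: generous fallback if <limits.h> lacks it */
#endif

/* Return 1 if gid occurs in list[0..n), 0 otherwise. */
int
gid_in_list(gid_t gid, const gid_t *list, int n)
{
	int i;

	for (i = 0; i < n; i++)
		if (list[i] == gid)
			return (1);
	return (0);
}

/*
 * Return 1 if the calling process has gid as its effective gid or as a
 * supplementary group, 0 if not, -1 if getgroups() fails.
 */
int
process_has_gid(gid_t gid)
{
	/* One extra slot: getgroups() may also return the egid. */
	gid_t list[NGROUPS_MAX + 1];
	int n;

	if (gid == getegid())
		return (1);
	n = getgroups(NGROUPS_MAX + 1, list);
	if (n < 0)
		return (-1);
	return (gid_in_list(gid, list, n));
}
```

Calling process_has_gid(getegid()) returns 1 on any system, whichever
convention its getgroups() follows.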
Bruce To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jun 27 3:41:30 2001 Delivered-To: freebsd-arch@freebsd.org Received: from whale.sunbay.crimea.ua (whale.sunbay.crimea.ua [212.110.138.65]) by hub.freebsd.org (Postfix) with ESMTP id 5159037B409; Wed, 27 Jun 2001 03:41:17 -0700 (PDT) (envelope-from ru@whale.sunbay.crimea.ua) Received: (from ru@localhost) by whale.sunbay.crimea.ua (8.11.2/8.11.2) id f5RAeid49078; Wed, 27 Jun 2001 13:40:44 +0300 (EEST) (envelope-from ru) Date: Wed, 27 Jun 2001 13:40:44 +0300 From: Ruslan Ermilov To: Bruce Evans Cc: Robert Watson , arch@FreeBSD.ORG, current@FreeBSD.ORG Subject: Re: [CFR] ucred.cr_gid Message-ID: <20010627134044.A23159@sunbay.com> Mail-Followup-To: Bruce Evans , Robert Watson , arch@FreeBSD.ORG, current@FreeBSD.ORG References: <20010627100905.A2097@sunbay.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: ; from bde@zeta.org.au on Wed, Jun 27, 2001 at 06:40:53PM +1000 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Wed, Jun 27, 2001 at 06:40:53PM +1000, Bruce Evans wrote: > On Wed, 27 Jun 2001, Ruslan Ermilov wrote: > > > On Tue, Jun 26, 2001 at 11:18:56AM -0400, Robert Watson wrote: > > > ... > > > off. I'm generally fairly positive about this change, but would be > > > interested in hearing Bruce's thoughts on any compatibility issues, in > > > particular, with respects to the behavior of userland processes with > > > expectations about the old behavior. Obviously, this is a change that is > > Me too :-). I don't know much about this except that it is related to > longstanding bugs in gid management. 
>
> > At least one compatibility issue here is that it's no longer possible
> > to use initgroups(3) to set the effective group ID.
>
> I think this shows that keeping the egid in group lists is intentional.
>
It's hard to say actually.  4.3BSD up to Tahoe and Net/1 had (in user.h)
an explicit holder for egid and NGROUPS supplementary group IDs.

> The only bug in the current implementation seems to be that NGROUPS_MAX
> is 1 too small.  The first gid in group lists is conventionally always
> the egid, but there must be space for NGROUPS_MAX "supplementary" groups,
> so statically allocated group lists must have size NGROUPS_MAX+1, but
> they currently (all?) have size NGROUPS_MAX.  POSIX.1-200x documents this
> for getgroups(2) -- returning the egid is optional, and getgroups() may
> return {NGROUPS_MAX}+1 entries.
>
What's wrong with keeping cr_gid in a separate structure member?
Continuing to keep it inside the cr_groups[] would cause us to
deal with NGROUPS_MAX vs. NGROUPS_MAX + 1 calculations all over
the place inside the kernel.  This IMHO only unnecessarily
complicates things.  We could still preserve the old behavior
of getgroups(2) returning the effective GID, but this only makes
sense if we also preserve the semantics of setgroups(2) setting
the effective GID, which is bogus; setgroups(2) should only be
allowed to set the supplementary group IDs, like most other OSes
do, including NetBSD since 1995.

> I think the semantics of getgroups(), setgroups() and initgroups()
> shouldn't be changed.
>
This isn't possible, as if we continue to return egid with getgroups(),
it will now return a maximum of {NGROUPS_MAX} + 1 gids, as opposed to the
currently documented "no more than NGROUPS_MAX will ever be returned",
thus breaking backwards compatibility anyway.

> To set a really supplemental gid (one not affected by setegid(),
> setgroups() must put the gid in the list after the first entry
> even if it is is the egid).
> > In the kernel, the problem is not really changed by keeping the egid > in a separate variable. I currently slightly prefer keeping it in > group lists. > Again, why? > Binary compatibility could be preserve by hacking NGROUPS_MAX to > NGROUPS_MAX - 1 (ugh). I don't see how to preserve source level > compatibility. You have to change either the semantics by not > putting the egid in group lists, or NGROUPS_MAX to NGROUPS_MAX+1 > in many places. Portable applications need the latter change anyway. > BTW, in a second pass, I've found one place where I missed the obvious change. Index: kern_prot.c =================================================================== RCS file: /home/ncvs/src/sys/kern/kern_prot.c,v retrieving revision 1.93 diff -u -p -r1.93 kern_prot.c --- kern_prot.c 2001/06/06 13:58:03 1.93 +++ kern_prot.c 2001/06/27 10:11:17 @@ -689,7 +687,7 @@ setgroups(p, uap) * have the egid in the groups[0]). We risk security holes * when running non-BSD software if we do not do the same. 
*/ - newcred->cr_ngroups = 1; + newcred->cr_ngroups = 0; } else { if ((error = copyin((caddr_t)uap->gidset, (caddr_t)newcred->cr_groups, ngrp * sizeof(gid_t)))) { -- Ruslan Ermilov Oracle Developer/DBA, ru@sunbay.com Sunbay Software AG, ru@FreeBSD.org FreeBSD committer, +380.652.512.251 Simferopol, Ukraine http://www.FreeBSD.org The Power To Serve http://www.oracle.com Enabling The Information Age To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jun 27 3:58:44 2001 Delivered-To: freebsd-arch@freebsd.org Received: from ringworld.nanolink.com (ringworld.nanolink.com [195.24.48.13]) by hub.freebsd.org (Postfix) with SMTP id 16C6137B405 for ; Wed, 27 Jun 2001 03:58:38 -0700 (PDT) (envelope-from roam@orbitel.bg) Received: (qmail 23242 invoked by uid 1000); 27 Jun 2001 11:03:22 -0000 Date: Wed, 27 Jun 2001 14:03:22 +0300 From: Peter Pentchev To: arch@FreeBSD.org Cc: audit@FreeBSD.org, freebsd-standards@bostonradio.org Subject: Re: patch for '%lld' handling in *scanf(3) Message-ID: <20010627140322.C19162@ringworld.oblivion.bg> Mail-Followup-To: arch@FreeBSD.org, audit@FreeBSD.org, freebsd-standards@bostonradio.org References: <20010623151310.A497@ringworld.oblivion.bg> <20010623160748.C497@ringworld.oblivion.bg> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20010623160748.C497@ringworld.oblivion.bg>; from roam@orbitel.bg on Sat, Jun 23, 2001 at 04:07:48PM +0300 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Sat, Jun 23, 2001 at 04:07:48PM +0300, Peter Pentchev wrote: > On Sat, Jun 23, 2001 at 03:13:10PM +0300, Peter Pentchev wrote: > > Hi, > > > > scanf(3) does not understand %lld for 'long long', it only understands > > %qd, and it treats %lld as plain %ld. 
printf(3) prints out %lld just fine.
> > The fix needed is just three lines of code, which have been in both NetBSD
> > and OpenBSD for some time.
> [snip]
> > The patch is attached.
> >
> > OK, so maybe this patch is not quite semantically correct; it tends
> > to assume that 'long long' is the same as 'quad', or at least, that
> > the programmer asked for 'quad' by using %lld.  A 'real' fix would
> > be defining a LONGLONG flag for scanf().
>
> Well, here's a patch that implements %lld the proper way :)

Somebody told me in private mail that this change should be accompanied
by an update to the scanf(3) manual page.  At a quick look at the manual
page, the 'q' modifier is documented as providing a 'long long int' value.
Thus, it seems that 'q' and 'll' should be equivalent, if 'q' is documented
as doing exactly what 'll' should do.

What to do now?  Use my first patch (with 'll' adding QUAD to the flag),
or define the new LONGLONG type?  If the latter, should 'q' retain the QUAD
type, or use the LONGLONG type?  In any case, how should the manpage
be worded (WRT both 'll' and 'q')?

FWIW, here's a chart of the current situation in the several BSD's:

              LONGLONG flag   QUAD flag   %lld type   %qd type

FreeBSD       no              no          none        quad_t
OpenBSD       no              yes         quad        quad_t
NetBSD        yes             yes         long long   quad_t

All the manpages document 'q' as producing a 'long long', and none
of the manpages mention 'll'.

G'luck,
Peter (in a state of utter confuzzlement)

-- 
No language can express every thought unambiguously, least of all this one.

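For reference, the behaviour being asked for can be exercised with a few
lines of standard C.  This is only a sketch of the expected semantics of
'll' in the scanf family on a C99 library, not code from either patch;
parse_ll() is a made-up helper name.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Parse a decimal 'long long' with the 'll' length modifier.
 * Returns 1 if a value was converted and stored in *out, 0 otherwise.
 */
int
parse_ll(const char *text, long long *out)
{
	return (sscanf(text, "%lld", out) == 1);
}
```

With a correct %lld, parse_ll("9223372036854775807", &v) stores the full
64-bit value in v; an implementation that treats %lld as plain %ld stores
only a long, which is the bug described at the start of this thread.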
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jun 27 5:57:29 2001 Delivered-To: freebsd-arch@freebsd.org Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16]) by hub.freebsd.org (Postfix) with ESMTP id 1148B37B405; Wed, 27 Jun 2001 05:57:23 -0700 (PDT) (envelope-from bde@zeta.org.au) Received: from bde.zeta.org.au (bde.zeta.org.au [203.2.228.102]) by mailman.zeta.org.au (8.9.3/8.8.7) with ESMTP id WAA06846; Wed, 27 Jun 2001 22:57:15 +1000 Date: Wed, 27 Jun 2001 22:55:20 +1000 (EST) From: Bruce Evans X-Sender: bde@besplex.bde.org To: Peter Pentchev Cc: arch@FreeBSD.ORG, audit@FreeBSD.ORG, freebsd-standards@bostonradio.org Subject: Re: patch for '%lld' handling in *scanf(3) In-Reply-To: <20010627140322.C19162@ringworld.oblivion.bg> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Wed, 27 Jun 2001, Peter Pentchev wrote: > Somebody told me in private mail that this change should be accompanied > by an update to the scanf(3) manual page. At a quick look at the manual > page, the 'q' modifier is documented as providing a 'long long int' value. This is a bug IMO. 'q' provides a quad_t value (but see below). > Thus, it seems that 'q' and 'll' should be equivalent, if 'q' is documented > as doing exactly what 'll' should do. 'll' provides a 'long long' value, although it is currently (mis)implemented by type punning long iongs to quad_t's. > What to do now? Use my first patch (with 'll' adding QUAD to the flag), > or define the new LONGLONG type? If the latter, should 'q' retain the QUAD > type, or use the LONGLONG type? In any case, how should the manpage > be worded (WRT both 'll' and 'q')? Use your second patch. 
Also, implement it right by not type punning
long longs as quads or using strtoq() to parse them (use strtoll()).
Maybe fix the longstanding breakage of overflow handling from misusing
strtoq() instead of strtol() to parse long values while you are there.

> FWIW, here's a chart of the current situation in the several BSD's:
>
>               LONGLONG flag   QUAD flag   %lld type   %qd type
>
> FreeBSD       no              no          none        quad_t
> OpenBSD       no              yes         quad        quad_t
> NetBSD        yes             yes         long long   quad_t

This oversimplifies things :-).  %qd is for quad_t's, but gcc's format
checker thinks that it is for long longs (at least for printf, and I
think scanf is no different here).  Since quad_t's are plain longs
on some machines (alphas), %qd is unusable in practice (in code that
must compile with WARNS=2, etc.).

> All the manpages document 'q' as producing a 'long long', and none
> of the manpages mention 'll'.

The FreeBSD printf.3 documents this correctly.

quad_t and %q should go away when C99's intmax_t and %j become Normal.
I think it is time to deprecate them in man pages.  (%q is already
deprecated in the kernel by not permitting it in FreeBSD's version of
gcc's format checker for gcc -fformat-extensions.)  long long and %ll
unfortunately won't go away, but using them will usually be wrong.

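To make the overflow point concrete, here is a small sketch, assuming only
the standard strtol() and strtoll() interfaces: each type is parsed with the
function of matching width, so ERANGE is reported against that type's own
limits rather than being lost in a later narrowing conversion.  The helper
names and the -1/-2 return codes are invented for the example and are not
taken from vfscanf.c.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Returns 0 on success, -1 if no digits were found, -2 on overflow. */
int
parse_long_long(const char *text, long long *out)
{
	char *end;
	long long v;

	errno = 0;
	v = strtoll(text, &end, 10);
	if (end == text)
		return (-1);
	if (errno == ERANGE)
		return (-2);
	*out = v;
	return (0);
}

/* Same contract, but range-checked against 'long' by strtol(). */
int
parse_long(const char *text, long *out)
{
	char *end;
	long v;

	errno = 0;
	v = strtol(text, &end, 10);
	if (end == text)
		return (-1);
	if (errno == ERANGE)
		return (-2);
	*out = v;
	return (0);
}
```

Parsing a long with the wider function and then assigning the result to a
long would accept out-of-range input silently on machines where long is
narrower than long long; parse_long() reports -2 instead.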
Bruce To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jun 27 6:28:47 2001 Delivered-To: freebsd-arch@freebsd.org Received: from ringworld.nanolink.com (ringworld.nanolink.com [195.24.48.13]) by hub.freebsd.org (Postfix) with SMTP id A98BA37B405 for ; Wed, 27 Jun 2001 06:28:34 -0700 (PDT) (envelope-from roam@orbitel.bg) Received: (qmail 61796 invoked by uid 1000); 27 Jun 2001 13:33:10 -0000 Date: Wed, 27 Jun 2001 16:33:10 +0300 From: Peter Pentchev To: Bruce Evans Cc: arch@FreeBSD.ORG, audit@FreeBSD.ORG, freebsd-standards@bostonradio.org Subject: Re: patch for '%lld' handling in *scanf(3) Message-ID: <20010627163310.G19162@ringworld.oblivion.bg> Mail-Followup-To: Bruce Evans , arch@FreeBSD.ORG, audit@FreeBSD.ORG, freebsd-standards@bostonradio.org References: <20010627140322.C19162@ringworld.oblivion.bg> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: ; from bde@zeta.org.au on Wed, Jun 27, 2001 at 10:55:20PM +1000 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Wed, Jun 27, 2001 at 10:55:20PM +1000, Bruce Evans wrote: > On Wed, 27 Jun 2001, Peter Pentchev wrote: > > > Somebody told me in private mail that this change should be accompanied > > by an update to the scanf(3) manual page. At a quick look at the manual > > page, the 'q' modifier is documented as providing a 'long long int' value. > > This is a bug IMO. 'q' provides a quad_t value (but see below). > > > Thus, it seems that 'q' and 'll' should be equivalent, if 'q' is documented > > as doing exactly what 'll' should do. > > 'll' provides a 'long long' value, although it is currently (mis)implemented > by type punning long iongs to quad_t's. > > > What to do now? 
Use my first patch (with 'll' adding QUAD to the flag), > > or define the new LONGLONG type? If the latter, should 'q' retain the QUAD > > type, or use the LONGLONG type? In any case, how should the manpage > > be worded (WRT both 'll' and 'q')? > > Use your second patch. Also, implement it right by not type punning > long longs as quads or using strtoq() to parse them (use strtoll()). > Maybe fix the longstanding breakage of overflow handling from misusing > strtoq() instead of strtol() to parse long values while you are there. Eep.. ok, I completely forgot about strtoll(). > > FWIW, here's a chart of the current situation in the several BSD's: > > > > LONGLONG flag QUAD flag %lld type %qd type > > > > FreeBSD no no none quad_t > > OpenBSD no yes quad quad_t > > NetBSD yes yes long long quad_t > > This oversimplifes things :-). %qd is for quad_t's, but gcc's format > checker thinks that it is for long longs (at least for printf, and I > think scanf is no different here. Since quad_t's are plain longs > on some machines (alphas), %qd is unusable in practice (in code that > must compile with WARNS=2, etc.). Yes, I didn't say it was complete, I didn't say any of it was correct, it was just a look at lib/lib/stdio/{scanf.3,vfscanf.c} in the various BSD's :) > > All the manpages document 'q' as producing a 'long long', and none > > of the manpages mention 'll'. > > The FreeBSD printf.3 documents this correctly. > > quad_t and %q should go away when C99's intmax_t and %j become Normal. > I think it is time to deprecate them in man pages. (%q is already > deprecated in the kernel by not permitting it in FreeBSD's version of > gcc's format checker for gcc -fformat-extensions.) long long and %ll > unfortunately won't go away, but using them will usually be wrong. Hmm maybe I should spend some more time on this, and at least implement %j? NetBSD has implemented some C99 extensions already.. G'luck, Peter -- Thit sentence is not self-referential because "thit" is not a word. 
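For anyone unfamiliar with the C99 'j' modifier mentioned above, a minimal
sketch of how it is meant to be used with intmax_t, assuming a C99-conforming
stdio.  parse_intmax() and format_intmax() are made-up helper names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Read an intmax_t with %jd; returns 1 on success, 0 otherwise. */
int
parse_intmax(const char *text, intmax_t *out)
{
	return (sscanf(text, "%jd", out) == 1);
}

/* Write an intmax_t with %jd; returns what snprintf() returns. */
int
format_intmax(char *buf, size_t len, intmax_t value)
{
	return (snprintf(buf, len, "%jd", value));
}
```

Code that reads and prints through intmax_t this way does not need to know
whether the widest integer type is long, long long or quad_t on a given
machine.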
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jun 27 10:30:32 2001 Delivered-To: freebsd-arch@freebsd.org Received: from snipe.mail.pas.earthlink.net (snipe.mail.pas.earthlink.net [207.217.120.62]) by hub.freebsd.org (Postfix) with ESMTP id 7E9FB37B406 for ; Wed, 27 Jun 2001 10:30:29 -0700 (PDT) (envelope-from tlambert2@mindspring.com) Received: from mindspring.com (dialup-209.245.141.189.Dial1.SanJose1.Level3.net [209.245.141.189]) by snipe.mail.pas.earthlink.net (EL-8_9_3_3/8.9.3) with ESMTP id KAA11943; Wed, 27 Jun 2001 10:30:26 -0700 (PDT) Message-ID: <3B3A1852.3C0027EC@mindspring.com> Date: Wed, 27 Jun 2001 10:30:58 -0700 From: Terry Lambert Reply-To: tlambert2@mindspring.com X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Dima Dorfman Cc: arch@FreeBSD.ORG Subject: Re: Peer credentials on a Unix domain socket References: <20010627070628.AB5F13E2F@bazooka.unixfreak.org> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Dima Dorfman wrote: > Currently, there is no reliable way for a server listening on a Unix > domain socket to find out the credentials of its peer until the peer > sends something over the socket. Finding its credentials can be > useful if the server only wants to accept connections from certain > users. We already have SCM_CREDS, which will send the peer's > credentials along with a message, but this is *not* sufficient as it > may be unacceptable for the server to wait until the peer sends > something; think of DoS attacked. Times don't help, either; think of > SYN flood-like attacks. It would be useful if this were more general than you are making it. 
In particular, it would be useful to provide the ability to have a
daemon that would sit on a FIFO, and then when people make requests
to "connect" (or "bind" or even "socket"), to administratively deny
the request and have their system call return EADMIN.  The request
would be sent up the FIFO only if there were a listener, and would,
of course, be capable of timing out.

This is the same local credentials check you appear to want to do,
but it must be extended, since there would be an in-kernel proxy
acting as a "man in the middle".

Consider a dialup gateway, which wants to permit some traffic to
bring the link up, but wants to stop other traffic before it becomes
"demand".  This can't be done by port, since you may wish to permit
one application or user ID to bring the link up as a result of a DNS
request, but not another (e.g. sendmail vs. IRC vs. HTTP).  This also
means that I would need to be able to set a "demand source" as part
of my credential, not just use the credentials raw.

Other than your uipc_ctloutput() function, which seems the wrong
name, and the lack of generality in the function for future expansion
(e.g. no "switch" statement), this looks like a good start on
something that could be more generally useful.

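A rough illustration of the "switch" layout meant here, written as a
userland sketch rather than kernel code.  Every type and constant below
(opt_dir, opt_req, pcb, peer_cred, OPT_PEERCRED, PCB_HAVEPC,
local_ctloutput) is a hypothetical stand-in for its counterpart in the
patch; the EINVAL check for a short buffer is an added assumption, and
EOPNOTSUPP for everything unhandled simply mirrors what the patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel types used by the patch. */
enum opt_dir { OPT_GET, OPT_SET };	/* plays the role of SOPT_GET/SOPT_SET */
#define OPT_PEERCRED	0x001		/* plays the role of LOCAL_PEERCRED */
#define PCB_HAVEPC	0x001		/* plays the role of UNP_HAVEPC */

struct peer_cred { unsigned int uid; };	/* much-reduced xucred */
struct pcb { int flags; struct peer_cred peercred; };
struct opt_req { enum opt_dir dir; int name; void *buf; size_t len; };

/*
 * Dispatch on direction first, then on option name, so that a new
 * option only needs a new case.  Returns 0 or an errno value.
 */
int
local_ctloutput(const struct pcb *pcb, struct opt_req *req)
{
	switch (req->dir) {
	case OPT_GET:
		switch (req->name) {
		case OPT_PEERCRED:
			if ((pcb->flags & PCB_HAVEPC) == 0)
				return (EOPNOTSUPP);
			if (req->len < sizeof(pcb->peercred))
				return (EINVAL);
			memcpy(req->buf, &pcb->peercred,
			    sizeof(pcb->peercred));
			req->len = sizeof(pcb->peercred);
			return (0);
		default:
			return (EOPNOTSUPP);
		}
	case OPT_SET:
	default:
		/* No settable options yet. */
		return (EOPNOTSUPP);
	}
}
```

A later option such as the "demand source" idea would then be one more case
under OPT_SET instead of a change to the if/else condition.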
-- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jun 27 12:20:39 2001 Delivered-To: freebsd-arch@freebsd.org Received: from fledge.watson.org (fledge.watson.org [204.156.12.50]) by hub.freebsd.org (Postfix) with ESMTP id 9801337B405 for ; Wed, 27 Jun 2001 12:20:32 -0700 (PDT) (envelope-from robert@fledge.watson.org) Received: from fledge.watson.org (robert@fledge.pr.watson.org [192.0.2.3]) by fledge.watson.org (8.11.3/8.11.3) with SMTP id f5RJJqf17174; Wed, 27 Jun 2001 15:19:56 -0400 (EDT) (envelope-from robert@fledge.watson.org) Date: Wed, 27 Jun 2001 15:19:52 -0400 (EDT) From: Robert Watson X-Sender: robert@fledge.watson.org To: Dima Dorfman Cc: arch@freebsd.org Subject: Re: Peer credentials on a Unix domain socket In-Reply-To: <20010627070628.AB5F13E2F@bazooka.unixfreak.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG How does this solution compare with similar solutions on other platforms? Robert N M Watson FreeBSD Core Team, TrustedBSD Project robert@fledge.watson.org NAI Labs, Safeport Network Services On Wed, 27 Jun 2001, Dima Dorfman wrote: > Hi folks, > > Currently, there is no reliable way for a server listening on a Unix > domain socket to find out the credentials of its peer until the peer > sends something over the socket. Finding its credentials can be > useful if the server only wants to accept connections from certain > users. We already have SCM_CREDS, which will send the peer's > credentials along with a message, but this is *not* sufficient as it > may be unacceptable for the server to wait until the peer sends > something; think of DoS attacked. Times don't help, either; think of > SYN flood-like attacks. 
> > I would like to propose implementing such a facility as a socket > option, LOCAL_PEERCRED. The payload would be am xucred structure with > the effective credentials of the connect(2) caller. Granted these may > not be the credentials of the process using the socket (think > descriptor passing), but it doesn't matter; if a process hands a > descriptor off to something else, it should be trusting it not to > abuse it (this is a feature: think of opening a privileged port and > dropping privileges). > > This has been discussed at least twice before, and nobody has a better > idea. Again, I would like to stress the two requirements: (1) the > accept(2) caller must be able to reliably obtain the effective > credentials of the connect(2) caller, and (2) the accept(2) caller > must be able to do (1) without relying on the connect(2) caller to > send data (SCM_CREDS doesn't meet (2)). > > Patch attached. > > Comments? Suggestions? > > Thanks in advance, > > Dima Dorfman > dima@unixfreak.org > > > Index: sys/un.h > =================================================================== > RCS file: /stl/src/FreeBSD/src/sys/sys/un.h,v > retrieving revision 1.17 > diff -u -r1.17 un.h > --- sys/un.h 1999/12/29 04:24:49 1.17 > +++ sys/un.h 2001/06/27 06:51:18 > @@ -46,12 +46,16 @@ > char sun_path[104]; /* path name (gag) */ > }; > > +/* Socket options. 
*/ > +#define LOCAL_PEERCRED 0x001 /* retrieve peer credentials */ > + > #ifdef _KERNEL > struct mbuf; > struct socket; > > int uipc_usrreq __P((struct socket *so, int req, struct mbuf *m, > struct mbuf *nam, struct mbuf *control)); > +int uipc_ctloutput __P((struct socket *so, struct sockopt *sopt)); > int unp_connect2 __P((struct socket *so, struct socket *so2)); > void unp_dispose __P((struct mbuf *m)); > int unp_externalize __P((struct mbuf *rights)); > Index: sys/unpcb.h > =================================================================== > RCS file: /stl/src/FreeBSD/src/sys/sys/unpcb.h,v > retrieving revision 1.11 > diff -u -r1.11 unpcb.h > --- sys/unpcb.h 2000/05/26 02:06:59 1.11 > +++ sys/unpcb.h 2001/06/27 06:51:18 > @@ -80,7 +80,14 @@ > int unp_cc; /* copy of rcv.sb_cc */ > int unp_mbcnt; /* copy of rcv.sb_mbcnt */ > unp_gen_t unp_gencnt; /* generation count of this instance */ > + int unp_flags; /* flags */ > + struct xucred unp_peercred; /* peer credentials, if applicable */ > }; > + > +/* > + * Flags in unp_flags. > + */ > +#define UNP_HAVEPC 0x001 /* unp_peercred filled in? 
*/ > > #define sotounpcb(so) ((struct unpcb *)((so)->so_pcb)) > > Index: kern/uipc_proto.c > =================================================================== > RCS file: /stl/src/FreeBSD/src/sys/kern/uipc_proto.c,v > retrieving revision 1.21 > diff -u -r1.21 uipc_proto.c > --- kern/uipc_proto.c 1999/10/11 15:19:11 1.21 > +++ kern/uipc_proto.c 2001/06/27 06:51:18 > @@ -51,7 +51,7 @@ > > static struct protosw localsw[] = { > { SOCK_STREAM, &localdomain, 0, PR_CONNREQUIRED|PR_WANTRCVD|PR_RIGHTS, > - 0, 0, 0, 0, > + 0, 0, 0, &uipc_ctloutput, > 0, > 0, 0, 0, 0, > &uipc_usrreqs > Index: kern/uipc_usrreq.c > =================================================================== > RCS file: /stl/src/FreeBSD/src/sys/kern/uipc_usrreq.c,v > retrieving revision 1.66 > diff -u -r1.66 uipc_usrreq.c > --- kern/uipc_usrreq.c 2001/05/25 16:59:07 1.66 > +++ kern/uipc_usrreq.c 2001/06/27 06:51:18 > @@ -434,6 +434,23 @@ > uipc_send, uipc_sense, uipc_shutdown, uipc_sockaddr, > sosend, soreceive, sopoll > }; > + > +int > +uipc_ctloutput(so, sopt) > + struct socket *so; > + struct sockopt *sopt; > +{ > + struct unpcb *unp = sotounpcb(so); > + int error; > + > + if (sopt->sopt_dir == SOPT_GET && sopt->sopt_name == LOCAL_PEERCRED && > + unp->unp_flags & UNP_HAVEPC) > + error = sooptcopyout(sopt, &unp->unp_peercred, > + sizeof(unp->unp_peercred)); > + else > + error = EOPNOTSUPP; > + return (error); > +} > > /* > * Both send and receive buffers are allocated PIPSIZ bytes of buffering > @@ -654,6 +671,12 @@ > unp3->unp_addr = (struct sockaddr_un *) > dup_sockaddr((struct sockaddr *) > unp2->unp_addr, 1); > + bzero(&unp3->unp_peercred, sizeof(unp3->unp_peercred)); > + unp3->unp_peercred.cr_uid = p->p_ucred->cr_uid; > + unp3->unp_peercred.cr_ngroups = p->p_ucred->cr_ngroups; > + bcopy(p->p_ucred->cr_groups, unp3->unp_peercred.cr_groups, > + sizeof(unp3->unp_peercred.cr_groups)); > + unp3->unp_flags |= UNP_HAVEPC; > so2 = so3; > } > error = unp_connect2(so, so2); > > To Unsubscribe: send mail 
to majordomo@FreeBSD.org > with "unsubscribe freebsd-arch" in the body of the message > To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jun 27 12:51:58 2001 Delivered-To: freebsd-arch@freebsd.org Received: from bazooka.unixfreak.org (bazooka.unixfreak.org [63.198.170.138]) by hub.freebsd.org (Postfix) with ESMTP id 77E6737B401 for ; Wed, 27 Jun 2001 12:51:53 -0700 (PDT) (envelope-from dima@unixfreak.org) Received: from hornet.unixfreak.org (hornet [63.198.170.140]) by bazooka.unixfreak.org (Postfix) with ESMTP id 3A00F3E32; Wed, 27 Jun 2001 12:51:37 -0700 (PDT) To: tlambert2@mindspring.com Cc: arch@FreeBSD.ORG Subject: Re: Peer credentials on a Unix domain socket In-Reply-To: <3B3A1852.3C0027EC@mindspring.com>; from tlambert2@mindspring.com on "Wed, 27 Jun 2001 10:30:58 -0700" Date: Wed, 27 Jun 2001 12:51:37 -0700 From: Dima Dorfman Message-Id: <20010627195137.3A00F3E32@bazooka.unixfreak.org> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Terry Lambert writes: > Dima Dorfman wrote: > > Currently, there is no reliable way for a server listening on a Unix > > domain socket to find out the credentials of its peer until the peer > > sends something over the socket. Finding its credentials can be > > useful if the server only wants to accept connections from certain > > users. We already have SCM_CREDS, which will send the peer's > > credentials along with a message, but this is *not* sufficient as it > > may be unacceptable for the server to wait until the peer sends > > something; think of DoS attacks. Timeouts don't help, either; think of > > SYN flood-like attacks. > > It would be useful if this were more general than you are > making it. 
> > In particular, it would be useful to provide the ability > to have a daemon that would sit on a FIFO, and then when > people make requests to "connect" (or "bind" or even > "socket"), to administratively deny the request and have > their system call return EADMIN. > > The request would be sent up the FIFO only if there were > a listener, and would, of course, be capable of timing > out. > > This is the same local credentials check you appear to want > to do, but it must be extended, since there would be an > in-kernel proxy acting as a "man in the middle". > > > Consider a dialup gateway, which wants to permit some > traffic to bring the link up, but wants to stop other > traffic before it becomes "demand". This can't be done > by port, since you may wish to permit one application > or user ID to bring the link up as a result of a DNS > request, but not another (e.g. sendmail vs. IRC vs. > HTTP). > > This also means that I would need to be able to set a > "demand source" as part of my credential, not just use > the credentials raw. It sounds like what you're describing is a 'fifofw' (FIFO firewall; as compared to 'ipfw'). This may be something worth investigating, but it doesn't replace the need for a simple, reliable way to find out who the connect(2) caller is. > Other than your uipc_ctloutput() function, which seems > the wrong name, What's wrong with the name? It fills in the pr_ctloutput field in a struct protosw; I think uipc_ctloutput is quite appropriate. > and the lack of generality in the function > for future expansion (e.g. no "switch" statement), Fair enough. I'll fix that. Thanks, Dima Dorfman dima@unixfreak.org > this > looks like a good start on something that could be more > generally useful. 
> > -- Terry > To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jun 27 12:54: 6 2001 Delivered-To: freebsd-arch@freebsd.org Received: from bazooka.unixfreak.org (bazooka.unixfreak.org [63.198.170.138]) by hub.freebsd.org (Postfix) with ESMTP id 3FDEC37B401; Wed, 27 Jun 2001 12:54:02 -0700 (PDT) (envelope-from dima@unixfreak.org) Received: from hornet.unixfreak.org (hornet [63.198.170.140]) by bazooka.unixfreak.org (Postfix) with ESMTP id D392C3E31; Wed, 27 Jun 2001 12:54:01 -0700 (PDT) To: Robert Watson Cc: arch@freebsd.org Subject: Re: Peer credentials on a Unix domain socket In-Reply-To: ; from rwatson@freebsd.org on "Wed, 27 Jun 2001 15:19:52 -0400 (EDT)" Date: Wed, 27 Jun 2001 12:54:01 -0700 From: Dima Dorfman Message-Id: <20010627195401.D392C3E31@bazooka.unixfreak.org> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Robert Watson writes: > > How does this solution compare with similar solutions on other platforms? NetBSD has an equivalent of our SCM_CREDS (they call it UNP_WANTCRED); I'm not aware of any similar functionality in OpenBSD; and I've been told, but haven't confirmed myself, that Linux has an SO_PEERCRED socket option which does essentially what I'm proposing (obviously it doesn't use a `struct xucred`). Dima Dorfman dima@unixfreak.org > > > Robert N M Watson FreeBSD Core Team, TrustedBSD Project > robert@fledge.watson.org NAI Labs, Safeport Network Services > > On Wed, 27 Jun 2001, Dima Dorfman wrote: > > > Hi folks, > > > > Currently, there is no reliable way for a server listening on a Unix > > domain socket to find out the credentials of its peer until the peer > > sends something over the socket. Finding its credentials can be > > useful if the server only wants to accept connections from certain > > users. 
We already have SCM_CREDS, which will send the peer's > > credentials along with a message, but this is *not* sufficient as it > > may be unacceptable for the server to wait until the peer sends > > something; think of DoS attacks. Timeouts don't help, either; think of > > SYN flood-like attacks. > > > > I would like to propose implementing such a facility as a socket > > option, LOCAL_PEERCRED. The payload would be an xucred structure with > > the effective credentials of the connect(2) caller. Granted these may > > not be the credentials of the process using the socket (think > > descriptor passing), but it doesn't matter; if a process hands a > > descriptor off to something else, it should be trusting it not to > > abuse it (this is a feature: think of opening a privileged port and > > dropping privileges). > > > > This has been discussed at least twice before, and nobody has a better > > idea. Again, I would like to stress the two requirements: (1) the > > accept(2) caller must be able to reliably obtain the effective > > credentials of the connect(2) caller, and (2) the accept(2) caller > > must be able to do (1) without relying on the connect(2) caller to > > send data (SCM_CREDS doesn't meet (2)). > > > > Patch attached. > > > > Comments? Suggestions? 
> > > > Thanks in advance, > > > > Dima Dorfman > > dima@unixfreak.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jun 27 14:47:17 2001 Delivered-To: freebsd-arch@freebsd.org Received: from magnesium.net (toxic.magnesium.net [207.154.84.15]) by hub.freebsd.org (Postfix) with SMTP id C02FF37B403 for ; Wed, 27 Jun 2001 14:47:07 -0700 (PDT) (envelope-from jasone@magnesium.net) Received: (qmail 25713 invoked by uid 1142); 27 Jun 2001 21:47:20 -0000 Date: 27 Jun 2001 14:47:20 -0700 Date: Wed, 27 Jun 2001 14:46:41 -0700 From: Jason Evans To: Julian Elischer Cc: arch@freebsd.org Subject: Re: Updated KSEs paper Message-ID: <20010627144641.K47186@canonware.com> References: <20010622184626.B47186@canonware.com> <3B36FDB4.74C96ACB@elischer.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable User-Agent: Mutt/1.2.5i In-Reply-To: <3B36FDB4.74C96ACB@elischer.org>; from julian@elischer.org on Mon, Jun 25, 2001 at 02:00:36AM -0700 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Mon, Jun 25, 2001 at 02:00:36AM -0700, Julian Elischer wrote: > > ksec_preempt(ksec_id, kse_state): > The KSEC with ID ksec_id was preempted, with userland execution > state ksec_state. > > [..] > The action of pre-empting this might write the state into the > userland context storage (with a copyout()). Then the UTS wouldn't > need notification, right now, just at the next time it might > make a difference, i.e. when it next goes to schedule something. > If we do it right, it will find this thread on the runnable queue > at that time without us needing to do an explicit notification. The UTS needs notification in order to be able to make fully informed scheduling decisions. 
However, your method for storing the state using copyout() is important, and something I'd like to hear a more detailed description of. We might be talking past each other on this... What I'm trying to say is that the UTS needs to be explicitly informed of certain events, so even though we may have cool methods of getting the KSEC state to userland, the UTS still needs to know about it. > ksec_block(ksec_id): > The KSEC with ID ksec_id has blocked in the kernel. > > [..] > this is treated exactly like the first case. I don't think it needs a > separate upcall. Again, the UTS needs notification in order to be able to make fully informed scheduling decisions. > ksec_unblock(ksec_id, kse_state): > The KSEC with ID ksec_id has completed in the kernel, with > userland execution state ksec_state. > > [..] > I don't think a separate upcall is always needed. On completion, the state > of the returning syscall is written into the userland context, and the > context is written to the "completed and runnable queue". The next time > the UTS runs it adds the contents of this queue to its runnable queue, and > schedules as per normal. Again, the UTS needs notification in order to be able to make fully informed scheduling decisions. > The following system calls are necessary: > > void kse_init(struct kseu *context): > Start using KSEs. context contains the necessary data for the > kernel to make upcalls. This function appears to > return every time an upcall is made. Initially, there is only > one KSEG (ID 0), which has a concurrency level of 1. > > [..] > whenever a concurrency is added the caller must supply a different > stack (or dummy stack) for the system call to return on. > an added concurrency is in effect an added KSE. Each KSE needs > a different stack to upcall on (though the stack may be small > as it will have bounded use.) "context" includes pointers to the mailbox > that will be used by that KSE. 
The multiple returns of this call > will all magically have that mailbox in their hand so you > can preload it with anything the UTS will need on an upcall. Sounds good. > int kseg_create(void): > Create a KSEG and return its KSEG ID (unique within this process), > or -1 if there is an error (resource limit exceeded). > [..] > I see this as basically an extension of the next call. > You automatically get a KSE with that KSEG so it does everything > that creating a new KSE does, and needs the 'context' variable > that a KSE would need. Hmm, okay. > int kseg_concurrency(int kseg_id, int adjust): > Adjust the concurrency of the KSEG with ID kseg_id. Decrementing > the concurrency to 0 destroys the KSEG, as > soon as there are no more active KSECs in the KSEG. If adjust is 0, > the KSEG is not modified, but the > concurrency is still returned. This system call returns the > KSEG's instantaneous concurrency level after adjusting it. > > [..] > If you increase the concurrency, you have created new KSEs. They need > their own separate upcall stacks (maybe only dummy stacks but.... > In any case you need to allocate them one by one. Just setting a > concurrency to "what it is now + 2" is not going to work > because the new KSEs don't know where to return to. Okay, then we need to split this system call into two separate ones: int kseg_concurrency_inc(int kseg_id, [mailbox for completion state]); int kseg_concurrency_dec(int kseg_id); > int kseg_bind(int kseg_id, int cpu_id): > Bind the KSEG with ID kseg_id to the CPU with ID cpu_id. This > system call returns the CPU ID that the KSEG is > bound to, or -1 if there is an error (invalid CPU ID, or the > KSEG's concurrency is greater than 1). > > [..] > I think the KSEG can bind itself. Same for priority.. > no need to specify KSEG.. It's implicit. I'm dubious about manual management (in the kernel) of processor binding of KSEs. 
Doing this puts a number of constraints on how we reimplement the kernel scheduler that I think could make things more complex and potentially cause correctness issues. I don't think we need to argue this particular point much at the moment though; it's a decision that can be put off until later, and we can change our minds even then. =) We could conceivably leave CPU binding out entirely during the first pass. > [..] > We also need a 'yield' version of the usleep call. > Note that a completing syscall that is already sleeping > may reawaken the yielded KSE in order to complete > after which it will upcall again in order to let the UTS > schedule the satisfied thread. > > We also need a KSE_EXIT() for when we know we don't need it any more. That means that the kseg_concurrency_dec() syscall above really needs to be: kse_destroy(kse_id kse) That in turn means that arguments in various places need to explicitly include kse IDs, whereas I had completely avoided that in the paper. I think your suggestions as to how KSEC state should be handled are pretty good, so I am in general agreement with these suggestions. > I also argue with the following assertion: > > "Additionally, soft processor affinity for KSEs is important > to performance. KSEs are not generally bound to CPUs, so > KSEs that belong to the same KSEG can potentially compete > with each other for the same processor; soft processor > affinity tends to reduce such competition, in addition to > well-known benefits of processor affinity. " > > I would argue that limiting (HARD LIMIT) one KSE per KSEG per processor > has no ill effects and simplifies some housekeeping. 
KSECs can move between > KSEs in the same KSEG in a soft-affinity manner to achieve the same thing > and being able to guarantee that the KSEs of a KSEG are never competing > for the same processor ensures that they will never pre-empt each other > which in turn simplifies some other locking assumptions that must be made > both in the kernel and in the UTS. (Not proven but my gut feeling). > Thus on a uniprocessor, there will only ever be as many KSEs as there are KSEGs. > Since blocking syscalls return, this has no effect on the threading picture. > There are still Multiple KSECs available. Like I said above, I don't think that this sort of manual management of KSEs in the scheduler is a good idea, for complexity and correctness issues. > In 3.6.1 You prove that we can have enough storage to store thread state of > KSECs. > > I would like to suggest that it can be proven as follows: > Every user thread includes a thread control block that includes enough > storage for thread context. Since every system call is made by a thread, and > the 'context' information for the KSE on which the syscall is being made > includes a pointer to that storage, the blocked and resuming syscalls > have that storage available to store their state. The context structures > can be of a fixed known format and include a pointer to be used in linking them > together in the 'completed and runnable' queue pointed to by the KSEU structure > that is handed to the UTS by the upcall. Therefore, there is guaranteed > to be enough storage. I like this better. It's much easier to feel confident that it's correct. =) > in the section: > "3.7 Upcall parallelism > > This section is not yet adequately fleshed out. Issues to consider: " > [various issues shown] > > Using my scheme this is not an issue. Indeed. Very good. =) > "What is your scheme?" I hear you ask. > > Basically an implementation of the above scheme with a few twists. 
> > 1/ Starting a KSE (as above) gives it its mailbox. > 2/ The KSE is only runnable on a processor on which there is no KSE from that > KSEG > already running. It tries really hard not to shift CPUs. No other KSE > will be using that mailbox, thus no other processor in that KSEG. As mentioned (twice) above, I don't necessarily think this is a good idea. > 3/ The mailbox includes a location that the kernel will look at to find a > pointer > to the (in userspace) thread context block (KSEU?). When the UTS schedules a > thread, it fills in this location. Until then it is NULL, meaning that the UTS > itself is running. All the time the thread is running this pointer is valid > so even if the thread is pre-empted, without warning by the kernel, the > pointer can be used to store its state. > 4/ When a process is blocked and an upcall happens, the kernel zeroes out > that location, and takes a copy of it in the KSEC that stores the syscall state. > 5/ When a syscall is continued, and completes, the location given above > (which was stored along with the sleeping syscall state) is used > to store the state of the returning syscall, just as if it had returned and then > done > a yield(). It is then linked onto a list of 'completed syscalls' held by the > kernel. > 6/ When the next upcall into that KSEG is performed, it first > reaps all the completed syscall blocks, and hangs them > off the mailbox for the upcalling KSE in a known location. > The UTS when it runs from the upcall > discovers all the completed syscalls, which, to it > look like a whole list of yield()'d threads, and puts them onto its > run-queue according to the priority of each, then schedules the next > highest priority thread. Cool, I think your ideas fix almost all of the outstanding problems with the paper (and it's simpler too!). 
We still have a few differences of opinion, such as KSE-CPU binding, but this design is getting quite close to what I would consider ready for implementation. I'll try to update the KSE paper in a week or so to fold in your ideas and any others that come up at USENIX. Thanks, Jason > enough for now.. more on the whiteboard at USENIX.. (what you're not going? > We'll take notes, ok?) Believe me, I really wish I were there. =( To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jun 28 8:56:15 2001 Delivered-To: freebsd-arch@freebsd.org Received: from gw.nectar.com (gw.nectar.com [208.42.49.153]) by hub.freebsd.org (Postfix) with ESMTP id B061837B403 for ; Thu, 28 Jun 2001 08:56:12 -0700 (PDT) (envelope-from nectar@nectar.com) Received: from madman.nectar.com (madman.nectar.com [10.0.1.111]) by gw.nectar.com (Postfix) with ESMTP id 101FCAF0AB; Thu, 28 Jun 2001 10:56:12 -0500 (CDT) Received: (from nectar@localhost) by madman.nectar.com (8.11.3/8.11.3) id f5SFu9094862; Thu, 28 Jun 2001 10:56:09 -0500 (CDT) (envelope-from nectar) Date: Thu, 28 Jun 2001 10:56:09 -0500 From: "Jacques A. 
Vidrine" To: Dima Dorfman Cc: arch@freebsd.org Subject: Re: Peer credentials on a Unix domain socket Message-ID: <20010628105609.K30889@madman.nectar.com> References: <20010627070628.AB5F13E2F@bazooka.unixfreak.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20010627070628.AB5F13E2F@bazooka.unixfreak.org>; from dima@unixfreak.org on Wed, Jun 27, 2001 at 12:06:28AM -0700 X-Url: http://www.nectar.com/ Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Wed, Jun 27, 2001 at 12:06:28AM -0700, Dima Dorfman wrote: > Currently, there is no reliable way for a server listening on a Unix > domain socket to find out the credentials of its peer until the peer > sends something over the socket. > > This has been discussed at least twice before, and nobody has a better > idea. Again, I would like to stress the two requirements: (1) the > accept(2) caller must be able to reliably obtain the effective > credentials of the connect(2) caller, and (2) the accept(2) caller > must be able to do (1) without relying on the connect(2) caller to > send data (SCM_CREDS doesn't meet (2)). > > Patch attached. > > Comments? Suggestions? What possible actions could the server take upon determining the credentials of the client? Either drop the connection or go forward. Why not just create the domain socket with permissions such that only authorized clients can connect to them in the first place? I suspect you'd answer that it isn't fine-grained enough, to which I would probably suggest that maybe the application can stand a wee bit more design work :-) or that ACLs could make it fine-grained enough. Or maybe I've missed something entirely. 
Cheers, -- Jacques Vidrine / n@nectar.com / jvidrine@verio.net / nectar@FreeBSD.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Jun 29 0:48:50 2001 Delivered-To: freebsd-arch@freebsd.org Received: from bazooka.unixfreak.org (bazooka.unixfreak.org [63.198.170.138]) by hub.freebsd.org (Postfix) with ESMTP id F21D037B40B for ; Fri, 29 Jun 2001 00:48:46 -0700 (PDT) (envelope-from dima@unixfreak.org) Received: from hornet.unixfreak.org (hornet [63.198.170.140]) by bazooka.unixfreak.org (Postfix) with ESMTP id 884873E2F; Fri, 29 Jun 2001 00:48:46 -0700 (PDT) To: "Jacques A. Vidrine" Cc: arch@freebsd.org Subject: Re: Peer credentials on a Unix domain socket In-Reply-To: <20010628105609.K30889@madman.nectar.com>; from n@nectar.com on "Thu, 28 Jun 2001 10:56:09 -0500" Date: Fri, 29 Jun 2001 00:48:46 -0700 From: Dima Dorfman Message-Id: <20010629074846.884873E2F@bazooka.unixfreak.org> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG "Jacques A. Vidrine" writes: > On Wed, Jun 27, 2001 at 12:06:28AM -0700, Dima Dorfman wrote: > > Currently, there is no reliable way for a server listening on a Unix > > domain socket to find out the credentials of its peer until the peer > > sends something over the socket. > > > > This has been discussed at least twice before, and nobody has a better > > idea. Again, I would like to stress the two requirements: (1) the > > accept(2) caller must be able to reliably obtain the effective > > credentials of the connect(2) caller, and (2) the accept(2) caller > > must be able to do (1) without relying on the connect(2) caller to > > send data (SCM_CREDS doesn't meet (2)). > > > > Patch attached. > > > > Comments? Suggestions? > > What possible actions could the server take upon determining the > credentials of the client? 
Either drop the connection or go forward. > > Why not just create the domain socket with permissions such that only > authorized clients can connect to them in the first place? I suspect > you'd answer that it isn't fine-grained enough, to which I would > probably suggest that maybe the application can stand a wee bit more > design work :-) or that ACLs could make it fine-grained enough. > > Or maybe I've missed something entirely. Suppose I want to rewrite sendmail(8) so that it doesn't have to be setuid root to put outgoing mail on the queue (right now, /usr/sbin/sendmail [1] needs to be setuid to root to write to the queue; allowing anybody to write to the queue opens up other problems [2]). I intend to do this by having a privileged daemon listen on a Unix domain socket and receive and queue mail for local users. In order to do this, I need to reliably figure out who the user submitting the message is. This can be done right now with SCM_CREDS, but since I wouldn't be able to figure out who the user is *until* they send something over the socket, I open myself up to various attacks as I described in my original e-mail. Dima Dorfman dima@unixfreak.org [1] I know about mailwrapper; I chose to ignore it in this example for simplicity. [2] It may be possible to work around some of these, but it's imperfect. P.S. I don't actually plan to do what I described above. I just want to make it possible for somebody else to do it :-). 
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Jun 29 10:21:16 2001 Delivered-To: freebsd-arch@freebsd.org Received: from tasogare.imasy.or.jp (tasogare.imasy.or.jp [202.227.24.5]) by hub.freebsd.org (Postfix) with ESMTP id 7505937B401; Fri, 29 Jun 2001 10:21:11 -0700 (PDT) (envelope-from iwasaki@jp.FreeBSD.org) Received: from localhost (iwasaki.imasy.or.jp [202.227.24.92]) by tasogare.imasy.or.jp (8.11.3+3.4W/8.11.3/tasogare) with ESMTP/inet id f5THL5I65206; Sat, 30 Jun 2001 02:21:05 +0900 (JST) (envelope-from iwasaki@jp.FreeBSD.org) To: arch@freebsd.org Cc: audit@freebsd.org, athlete@kta.att.ne.jp, iwasaki@jp.freebsd.org Subject: Re: CFR: Crusoe LongRun Support In-Reply-To: <20010626114448O.iwasaki@jp.FreeBSD.org> References: <20010626114448O.iwasaki@jp.FreeBSD.org> X-Mailer: Mew version 1.94.1 on Emacs 19.34 / Mule 2.3 (SUETSUMUHANA) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-Id: <20010630022056B.iwasaki@jp.FreeBSD.org> Date: Sat, 30 Jun 2001 02:20:56 +0900 From: Mitsuru IWASAKI X-Dispatcher: imput version 20000228(IM140) Lines: 25 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Hi, > I'm going to commit the patches for Transmeta Crusoe LongRun Support. > This was originally created by HATTORI-san > http://home.att.ne.jp/delta/athlete/longrun/longrun.html > as a device driver, then I made cleanups and adding sysctl interface > support, like this. > > hw.crusoe.longrun: 2 > hw.crusoe.frequency: 600 > hw.crusoe.voltage: 1600 > hw.crusoe.percentage: 100 > > Only hw.crusoe.longrun is changeable, valid values are 0, 1, 2 and 3. 
> > The latest patches against sys/sys/i386/i386/identcpu.c at > http://people.freebsd.org/~iwasaki/apm/sys-longrun-20010626.diff > > I'd like to have the patches reviewed in terms of sysctl namespace, > security issues and other problems. Any suggestions or objections about this? I'll commit this 3 days later. Thanks To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Jun 29 11:34:36 2001 Delivered-To: freebsd-arch@freebsd.org Received: from finch-post-11.mail.demon.net (finch-post-11.mail.demon.net [194.217.242.39]) by hub.freebsd.org (Postfix) with ESMTP id 6F33637B403; Fri, 29 Jun 2001 11:34:31 -0700 (PDT) (envelope-from dfr@nlsystems.com) Received: from [62.49.251.130] (helo=herring.nlsystems.com) by finch-post-11.mail.demon.net with esmtp (Exim 2.12 #1) id 15G366-000ERv-0B; Fri, 29 Jun 2001 18:34:30 +0000 Received: from herring (herring [10.0.0.2]) by herring.nlsystems.com (8.11.2/8.11.2) with ESMTP id f5TIXE704473; Fri, 29 Jun 2001 19:33:14 +0100 (BST) (envelope-from dfr@nlsystems.com) Date: Fri, 29 Jun 2001 19:33:14 +0100 (BST) From: Doug Rabson To: Mitsuru IWASAKI Cc: , , Subject: Re: CFR: Crusoe LongRun Support In-Reply-To: <20010630022056B.iwasaki@jp.FreeBSD.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Sat, 30 Jun 2001, Mitsuru IWASAKI wrote: > > > > I'd like to have the patches reviewed in terms of sysctl namespace, > > security issues and other problems. > > Any suggestions or objections about this? > I'll commit this 3 days later. I think you should probably commit it. I would really like to see a manpage committed at the same time which describes the implications of the various longrun values. 
-- Doug Rabson Mail: dfr@nlsystems.com Phone: +44 20 8348 6160 To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Jun 29 12:20: 0 2001 Delivered-To: freebsd-arch@freebsd.org Received: from tasogare.imasy.or.jp (tasogare.imasy.or.jp [202.227.24.5]) by hub.freebsd.org (Postfix) with ESMTP id 98DF737B403; Fri, 29 Jun 2001 12:19:54 -0700 (PDT) (envelope-from iwasaki@jp.FreeBSD.org) Received: from localhost (iwasaki.imasy.or.jp [202.227.24.92]) by tasogare.imasy.or.jp (8.11.3+3.4W/8.11.3/tasogare/smtpfeed 1.12) with ESMTP/inet id f5TJJqI86261; Sat, 30 Jun 2001 04:19:53 +0900 (JST) (envelope-from iwasaki@jp.FreeBSD.org) To: dfr@nlsystems.com Cc: iwasaki@jp.FreeBSD.org, arch@freebsd.org, audit@freebsd.org, athlete@kta.att.ne.jp Subject: Re: CFR: Crusoe LongRun Support In-Reply-To: References: <20010630022056B.iwasaki@jp.FreeBSD.org> X-Mailer: Mew version 1.94.1 on Emacs 19.34 / Mule 2.3 (SUETSUMUHANA) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-Id: <20010630041951I.iwasaki@jp.FreeBSD.org> Date: Sat, 30 Jun 2001 04:19:51 +0900 From: Mitsuru IWASAKI X-Dispatcher: imput version 20000228(IM140) Lines: 22 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Hi, > > > I'd like to have the patches reviewed in terms of sysctl namespace, > > > security issues and other problems. > > > > Any suggestions or objections about this? > > I'll commit this 3 days later. > > I think you should probably commit it. I would really like to see a > manpage committed at the same time which describes the implications of the > various longrun values. OK, I'll do that based on Hattori-san's page on LongRun http://home.att.ne.jp/delta/athlete/longrun/longrun_e.html and add short descriptions at the last argument of sysctl_add_oid(). 
BTW, is sysctl(8) good place to be documented? I think we'd better to have another documentation system for the hypertrophied MIBs (like /boot/default/loader.conf for tunables, sys/i386/conf/NOTES for kernel config options). Thanks To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Jun 29 12:29:30 2001 Delivered-To: freebsd-arch@freebsd.org Received: from finch-post-11.mail.demon.net (finch-post-11.mail.demon.net [194.217.242.39]) by hub.freebsd.org (Postfix) with ESMTP id 1ECC837B401; Fri, 29 Jun 2001 12:29:24 -0700 (PDT) (envelope-from dfr@nlsystems.com) Received: from [62.49.251.130] (helo=herring.nlsystems.com) by finch-post-11.mail.demon.net with esmtp (Exim 2.12 #1) id 15G3xC-000MNI-0B; Fri, 29 Jun 2001 19:29:23 +0000 Received: from herring (herring [10.0.0.2]) by herring.nlsystems.com (8.11.2/8.11.2) with ESMTP id f5TJS7704682; Fri, 29 Jun 2001 20:28:07 +0100 (BST) (envelope-from dfr@nlsystems.com) Date: Fri, 29 Jun 2001 20:28:07 +0100 (BST) From: Doug Rabson To: Mitsuru IWASAKI Cc: , , Subject: Re: CFR: Crusoe LongRun Support In-Reply-To: <20010630041951I.iwasaki@jp.FreeBSD.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Sat, 30 Jun 2001, Mitsuru IWASAKI wrote: > Hi, > > > > > I'd like to have the patches reviewed in terms of sysctl namespace, > > > > security issues and other problems. > > > > > > Any suggestions or objections about this? > > > I'll commit this 3 days later. > > > > I think you should probably commit it. I would really like to see a > > manpage committed at the same time which describes the implications of the > > various longrun values. 
> > OK, I'll do that based on Hattori-san's page on LongRun > http://home.att.ne.jp/delta/athlete/longrun/longrun_e.html > and add short descriptions at the last argument of sysctl_add_oid(). > > BTW, is sysctl(8) good place to be documented? I think we'd better to > have another documentation system for the hypertrophied MIBs (like > /boot/default/loader.conf for tunables, sys/i386/conf/NOTES for kernel > config options). I don't think that sysctl(8) is really the right place since there really isn't enough space for nontrivial documentation and documenting many things in a single place doesn't really scale that well. I would like to see something like longrun(4) which described the sysctls and indicated what each longrun level actually means. -- Doug Rabson Mail: dfr@nlsystems.com Phone: +44 20 8348 6160 To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Jun 29 20:51:30 2001 Delivered-To: freebsd-arch@freebsd.org Received: from bazooka.unixfreak.org (bazooka.unixfreak.org [63.198.170.138]) by hub.freebsd.org (Postfix) with ESMTP id A9CD737B401 for ; Fri, 29 Jun 2001 20:51:27 -0700 (PDT) (envelope-from dima@unixfreak.org) Received: from hornet.unixfreak.org (hornet [63.198.170.140]) by bazooka.unixfreak.org (Postfix) with ESMTP id 1EAAD3E31; Fri, 29 Jun 2001 20:51:27 -0700 (PDT) To: "Jacques A. Vidrine" Cc: arch@freebsd.org Subject: Re: Peer credentials on a Unix domain socket In-Reply-To: <20010629064823.A61206@madman.nectar.com>; from n@nectar.com on "Fri, 29 Jun 2001 06:48:23 -0500" Date: Fri, 29 Jun 2001 20:51:27 -0700 From: Dima Dorfman Message-Id: <20010630035127.1EAAD3E31@bazooka.unixfreak.org> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG "Jacques A. 
Vidrine" writes:
> On Fri, Jun 29, 2001 at 12:48:46AM -0700, Dima Dorfman wrote:
> > Suppose I want to rewrite sendmail(8) so that it doesn't have to be
> > setuid root to put outgoing mail on the queue (right now,
> > /usr/sbin/sendmail [1] needs to be setuid to root to write to the
> > queue; allowing anybody to write to the queue opens up other problems
> > [2]). I intend to do this by having a privileged daemon listen on a
> > Unix domain socket and receive and queue mail for local users. In
> > order to do this, I need to reliably figure out who the user
> > submitting the message is. This can be done right now with SCM_CREDS,
> > but since I wouldn't be able to figure out who the user is *until*
> > they send something over the socket I open myself up to various
> > attacks as I described in my original e-mail.
>
> There are two cases: either (1) anyone on the system can send mail, or
> (2) only a subset of the users can send mail.
>
> In case (1), the Unix domain socket would be accessible by anyone.
>
> In case (2), add the subset of users to a common group, and make the
> socket accessible only by that group.
>
> In either case, SCM_CREDS can be used to determine the client's
> credentials.
>
> So I still don't quite get it. What attacks do you eliminate with the
> scheme you suggest? Whether or not the credentials are available on
> socket accept, or with the first message, the daemon still must wait
> for a useful message from the client.

Suppose Oscar wants to attack the mail system. If I use SCM_CREDS,
Oscar can create umpteen connections and not send anything on them,
and not only will I not know it's Oscar, but if I start refusing more
connections he gets what he wants--he just successfully denied service
to other users.
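
The difference between the two schemes can be modelled in a few lines
of C. This is only a sketch of the daemon's bookkeeping, not real
socket code: struct conn, UID_UNKNOWN, visible_uid() and
idle_conns_for() are hypothetical names invented for illustration, and
uids are plain longs to keep the example portable.

```c
#include <assert.h>
#include <stddef.h>

#define UID_UNKNOWN	(-1L)	/* hypothetical marker: peer not identified yet */

/* Hypothetical per-connection record kept by the mail daemon. */
struct conn {
	long	uid;		/* real owner of the connection */
	int	has_sent;	/* non-zero once the peer has sent a message */
};

/*
 * The uid the daemon can actually see.  With SCM_CREDS-style credentials
 * it stays unknown until the first message arrives; with credentials
 * available at accept() time it is known immediately.
 */
static long
visible_uid(const struct conn *c, int creds_at_accept)
{
	if (creds_at_accept || c->has_sent)
		return (c->uid);
	return (UID_UNKNOWN);
}

/* Count idle (nothing sent yet) connections that can be attributed to uid. */
static size_t
idle_conns_for(const struct conn *conns, size_t n, long uid,
    int creds_at_accept)
{
	size_t i, count = 0;

	assert(conns != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (!conns[i].has_sent &&
		    visible_uid(&conns[i], creds_at_accept) == uid)
			count++;
	return (count);
}

/* Sample state: uid 1001 holds three idle connections, uid 1002 sent mail. */
static const struct conn sample_conns[] = {
	{ 1001, 0 }, { 1001, 0 }, { 1002, 1 }, { 1001, 0 },
};
```

With creds_at_accept set to 0, all three idle connections in
sample_conns are attributed to UID_UNKNOWN, so any limit the daemon
imposes lands on every user alike. With creds_at_accept set to 1 the
same three connections are attributed to uid 1001, and a per-uid limit
becomes possible.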
If I can get the credentials at accept() time, I can set a flag the first time he connects that "Oscar connected"; then I can drop any following connections from Oscar knowing full well that Oscar is the only one being denied service. Everyone else is unaffected; thus, Oscar's attack is unsuccessful. Dima Dorfman dima@unixfreak.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Sat Jun 30 1: 8: 1 2001 Delivered-To: freebsd-arch@freebsd.org Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by hub.freebsd.org (Postfix) with ESMTP id 3524937B405 for ; Sat, 30 Jun 2001 01:07:56 -0700 (PDT) (envelope-from phk@critter.freebsd.dk) Received: from critter (localhost [127.0.0.1]) by critter.freebsd.dk (8.11.3/8.11.3) with ESMTP id f5U87in33347 for ; Sat, 30 Jun 2001 10:07:46 +0200 (CEST) (envelope-from phk@critter.freebsd.dk) To: arch@freebsd.org Subject: DEVFS, devd and all that... From: Poul-Henning Kamp Date: Sat, 30 Jun 2001 10:07:44 +0200 Message-ID: <33345.993888464@critter> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Unfortunately I will not be able to attend online today, so instead you'll have to take this email as my input in the DEVFS/devd discussion. Considering the infrequency of device create events and the fact that nobody so far has come up with a good example which cannot be handled otherwise I prefer that devd does not need to be involved in all device creates. I'm also in favour of being able to run without devd on lightweight systems (picobsd, jails...) 
My idea, therefore, is this:

The device driver, in make_dev(), specifies:

	a device type ("disk", "tape", "tty", "special")
	a device name (as now: fd0a, ttyq3 etc)

We implement a list of rules which is used to define what should
happen at device create:

	name="ad0s1a" : mode=0600 owner=root group=wheel
	name="fd0*" : mode=0666 owner=root group=wheel chown_on_open=yes
	type="disk" : mode=0640 owner=root group=operator
	type="pty" : mode=0666 owner=root group=operator chown_on_open=yes
	type="tty" : mode=0600 owner=root group=tty

That way we have a flexible and configurable policy even running
without devd and we have a means for devd to express special wishes.

One particular action could be:

	type="ttyd*" : ask_devd=yes

which would not enter the device name into the directory until devd
has told us what to do.

These rules can also be used to provide the filtering of devices in
a jail:

	type="*" : ignore=yes

We may need more than name and class though, maybe an attachment:

	attachment="usb" : ask_devd=yes

As much as it looks like BASIC, I propose to use line numbers to order
and specify the sequence of the rules.

Sets of rules have a name, and a given mountpoint can be associated
with a given rule-set:

	mount -t devfs -o rules=jailrules devfs /home/jail01/dev

Regarding "arrive" and "depart" events and devd.

It has correctly been observed that we need to enforce a strict
ordering to make sure we don't have a device re-arriving while devd
is still detaching the previous instance.

There is no correct way to solve this problem, except not reusing
unit numbers until devd releases them.

The problem with that is that the driver is stuck in destroy_dev()
until devd (and any subprocesses) return. This obviously doesn't work
in practice. In particular not if it's the disk driver getting stuck
and devd tries to access the disk...
One solution would be to add a new driver entrypoint so that
destroy_dev() only queues the request, and all further cleanup
happens when driver->cleanup() is called by devfs after devd
returns ... messy.

Another solution would be to simply not recycle unit numbers. This
has some serious disadvantages too.

The best I have come up with so far is to hide the magic in devfs:
we add a state called "hidden" which means that the dev_t is not yet
discovered by the filesystem side of devfs.

make_dev(), on parsing the rules described above, finds an
"ask_devd=yes" entry and sets the hidden bit on the dev_t.

devd is notified and it uses some magic to "unhide" the device, which
makes it appear in the directories. Devd subsequently does what needs
to be done for that device.

Life goes on.

The driver calls destroy_dev(). The dev_t is turned into a
"dead_dev_t" which has a cdevsw[] which returns ENXIO or whatever is
correct, the dev_t is marked "recycling" and devd is notified.

If the driver calls make_dev() again at this point, a new dev_t is
created as per above, and chained off the dev_t in "recycling" state
but otherwise not acted on and not visible unless you have the
pointer to it.

When devd says the "recycling" dev_t is "done", that dev_t is
destroyed and the "new" dev_t hanging off it is found and we start
over from above.

For this to work, drivers need to be very careful to choose one
"master" dev_t for each conceptual "foo" they have, and even more
careful to chain other dev_t's off that dev_t. Otherwise there will
be no way to remain consistent and sane during the above gyrations.

The disadvantage, which could require some kind of "DWIM" superuser
tool, is that a device could be stuck in limbo if devd gets confused
or stuck. This may be a devd implementation issue.

Input most welcome...
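
The rule list proposed earlier in this message could be evaluated
roughly as follows. This is a userland sketch under stated
assumptions, not devfs code: struct dev_rule, rule_lookup() and
sample_rules are hypothetical names, only the name/type/mode/ask_devd
fields are modelled, and "the lowest-numbered matching rule wins" is
one possible reading of the line-number ordering. The glob matching
uses the standard fnmatch(3).

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical in-memory form of one rule line. */
struct dev_rule {
	int		 line;		/* BASIC-style line number, lower runs first */
	const char	*name_pat;	/* glob on the device name, NULL = any */
	const char	*type_pat;	/* glob on the device type, NULL = any */
	unsigned	 mode;		/* permissions to apply */
	int		 ask_devd;	/* non-zero: keep hidden until devd answers */
};

/*
 * Return the matching rule with the lowest line number, or NULL if no
 * rule matches.  The table does not have to be sorted.
 */
static const struct dev_rule *
rule_lookup(const struct dev_rule *rules, size_t nrules,
    const char *name, const char *type)
{
	const struct dev_rule *best = NULL;
	size_t i;

	assert(name != NULL && type != NULL);
	for (i = 0; i < nrules; i++) {
		const struct dev_rule *r = &rules[i];

		if (r->name_pat != NULL && fnmatch(r->name_pat, name, 0) != 0)
			continue;
		if (r->type_pat != NULL && fnmatch(r->type_pat, type, 0) != 0)
			continue;
		if (best == NULL || r->line < best->line)
			best = r;
	}
	return (best);
}

/* Four of the example rules, deliberately stored out of order. */
static const struct dev_rule sample_rules[] = {
	{ 30, NULL,     "disk", 0640, 0 },
	{ 10, "ad0s1a", NULL,   0600, 0 },
	{ 20, "fd0*",   NULL,   0666, 0 },
	{ 40, NULL,     "tty",  0600, 0 },
};
```

A lookup for name "fd0a" of type "disk" matches both rule 20 and rule
30 and returns rule 20; a device that matches nothing yields NULL, at
which point a real implementation would need some default policy.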
-- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Sat Jun 30 2:40:11 2001 Delivered-To: freebsd-arch@freebsd.org Received: from flood.ping.uio.no (flood.ping.uio.no [129.240.78.31]) by hub.freebsd.org (Postfix) with ESMTP id AE35C37B403 for ; Sat, 30 Jun 2001 02:39:52 -0700 (PDT) (envelope-from des@ofug.org) Received: (from des@localhost) by flood.ping.uio.no (8.9.3/8.9.3) id LAA11788; Sat, 30 Jun 2001 11:39:50 +0200 (CEST) (envelope-from des@ofug.org) X-URL: http://www.ofug.org/~des/ X-Disclaimer: The views expressed in this message do not necessarily coincide with those of any organisation or company with which I am or have been affiliated. To: arch@freebsd.org Subject: New kqueue filter From: Dag-Erling Smorgrav Date: 30 Jun 2001 11:39:49 +0200 Message-ID: Lines: 26 User-Agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/20.7 MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG --=-=-= Attached is a patch that partially implements a kqueue filter that waits for a particular character to show up in the input, as well as a userland test program. This could be extremely useful for writing an fgetln() equivalent with a timeout (like libfetch's _fetch_getln(), but with only a handful of syscalls instead of two or three per character read). I've written filter routines for ttys and sockets. The tty code has been tested; the socket code builds, but hasn't been tested. 
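
As a rough userland model of what such a filter has to compute (this
is not the kernel code from the attached patch), the scan over a chain
of input buffers can be sketched like this. struct chunk,
scan_for_char() and the sample data are hypothetical, and the return
convention (position of the character, -1 when the limit is exceeded,
0 when the character has not arrived yet) is an assumption chosen for
the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a chain of kernel buffers (cblocks, mbufs). */
struct chunk {
	const unsigned char	*data;
	size_t			 len;
	const struct chunk	*next;
};

/*
 * Scan the chain for ch.
 *   > 0  number of bytes up to and including the first ch
 *   -1   more than `limit' bytes were scanned without finding ch
 *        (only when limit > 0; limit == 0 means "no limit")
 *    0   ch is not in the buffered input yet
 */
static long
scan_for_char(const struct chunk *c, unsigned char ch, long limit)
{
	long offset = 0;
	size_t i;

	assert(limit >= 0);
	for (; c != NULL; c = c->next) {
		for (i = 0; i < c->len; i++) {
			offset++;
			if (limit > 0 && offset > limit)
				return (-1);
			if (c->data[i] == ch)
				return (offset);
		}
	}
	return (0);
}

/* Sample input "hello\nwor" split across two buffers. */
static const struct chunk sample_tail = {
	(const unsigned char *)"lo\nwor", 6, NULL
};
static const struct chunk sample_head = {
	(const unsigned char *)"hel", 3, &sample_tail
};
```

A caller that gets a positive result n can then fetch the whole line
with a single read of n bytes, which is where the saving over a
byte-at-a-time loop comes from.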
Similar code needs to be written for vnodes and pipes (I've started on pipes, but haven't gotten very far); this could be a good project for a junior kernel hacker. The only problem I've experienced with this patch is that it seems the filter always runs twice (even when it succeeds the first time, though it only returns to userland the second time). This is possibly a bug in the kqueue framework code. DES -- Dag-Erling Smorgrav - des@ofug.org --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=kqueue.diff Index: sys/sys/event.h =================================================================== RCS file: /home/ncvs/src/sys/sys/event.h,v retrieving revision 1.12 diff -u -r1.12 event.h --- sys/sys/event.h 2001/02/24 01:44:03 1.12 +++ sys/sys/event.h 2001/06/18 20:24:42 @@ -35,8 +35,9 @@ #define EVFILT_VNODE (-4) /* attached to vnodes */ #define EVFILT_PROC (-5) /* attached to struct proc */ #define EVFILT_SIGNAL (-6) /* attached to struct proc */ +#define EVFILT_CHAR (-7) -#define EVFILT_SYSCOUNT 6 +#define EVFILT_SYSCOUNT 7 #define EV_SET(kevp, a, b, c, d, e, f) do { \ (kevp)->ident = (a); \ Index: sys/kern/kern_event.c =================================================================== RCS file: /home/ncvs/src/sys/kern/kern_event.c,v retrieving revision 1.26 diff -u -r1.26 kern_event.c --- sys/kern/kern_event.c 2001/05/01 08:12:56 1.26 +++ sys/kern/kern_event.c 2001/06/18 20:24:42 @@ -122,6 +122,7 @@ &file_filtops, /* EVFILT_VNODE */ &proc_filtops, /* EVFILT_PROC */ &sig_filtops, /* EVFILT_SIGNAL */ + &file_filtops, /* EVFILT_CHAR */ }; static int Index: sys/kern/tty.c =================================================================== RCS file: /home/ncvs/src/sys/kern/tty.c,v retrieving revision 1.153 diff -u -r1.153 tty.c --- sys/kern/tty.c 2001/05/22 22:16:18 1.153 +++ sys/kern/tty.c 2001/06/20 21:03:18 @@ -79,6 +79,7 @@ #include #endif #include +#include #define TTYDEFCHARS #include #undef TTYDEFCHARS @@ -114,6 +115,7 @@ static void 
filt_ttyrdetach __P((struct knote *kn)); static int filt_ttywrite __P((struct knote *kn, long hint)); static void filt_ttywdetach __P((struct knote *kn)); +static int filt_ttychar __P((struct knote *kn, long hint)); /* * Table with character classes and parity. The 8th bit indicates parity, @@ -1102,6 +1104,8 @@ { 1, NULL, filt_ttyrdetach, filt_ttyread }; static struct filterops ttywrite_filtops = { 1, NULL, filt_ttywdetach, filt_ttywrite }; +static struct filterops ttychar_filtops = + { 1, NULL, filt_ttyrdetach, filt_ttychar }; int ttykqfilter(dev, kn) @@ -1121,6 +1125,10 @@ klist = &tp->t_wsel.si_note; kn->kn_fop = &ttywrite_filtops; break; + case EVFILT_CHAR: + klist = &tp->t_rsel.si_note; + kn->kn_fop = &ttychar_filtops; + break; default: return (1); } @@ -1179,6 +1187,46 @@ return (1); return (kn->kn_data <= tp->t_olowat && ISSET(tp->t_state, TS_CONNECTED)); +} + +static int +filt_ttychar(struct knote *kn, long hint) +{ + struct tty *tp = ((dev_t)kn->kn_hook)->si_tty; + struct cblock *cbp; + intptr_t offset, limit; + u_char ch, *end, *p; + + if (tp->t_canq.c_cc == 0) + return (0); + + ch = kn->kn_sfflags; + limit = kn->kn_sdata; + + cbp = (struct cblock *)((intptr_t)tp->t_canq.c_cf & ~CROUND); + p = p = tp->t_canq.c_cf; + offset = 0; + while (cbp != NULL) { + for (end = (u_char *)(cbp + 1); p < end; ++p) { + if (limit > 0 && ++offset > limit) { + kn->kn_data = -1; + return (1); + } + if (*p == ch) { + kn->kn_data = offset; + return (1); + } + } + p = tp->t_canq.c_cf; + cbp = cbp->c_next; + } + + if (ISSET(tp->t_state, TS_ZOMBIE)) { + kn->kn_flags |= EV_EOF; + return (1); + } + + return (0); } /* Index: sys/kern/uipc_socket.c =================================================================== RCS file: /home/ncvs/src/sys/kern/uipc_socket.c,v retrieving revision 1.97 diff -u -r1.97 uipc_socket.c --- sys/kern/uipc_socket.c 2001/05/01 08:12:58 1.97 +++ sys/kern/uipc_socket.c 2001/06/20 21:04:28 @@ -72,6 +72,7 @@ static void filt_sowdetach(struct knote *kn); static 
int filt_sowrite(struct knote *kn, long hint); static int filt_solisten(struct knote *kn, long hint); +static int filt_sochar(struct knote *kn, long hint); static struct filterops solisten_filtops = { 1, NULL, filt_sordetach, filt_solisten }; @@ -79,6 +80,8 @@ { 1, NULL, filt_sordetach, filt_soread }; static struct filterops sowrite_filtops = { 1, NULL, filt_sowdetach, filt_sowrite }; +static struct filterops sochar_filtops = + { 1, NULL, filt_sordetach, filt_sochar }; struct vm_zone *socket_zone; so_gen_t so_gencnt; /* generation count for sockets */ @@ -1560,6 +1563,10 @@ kn->kn_fop = &sowrite_filtops; sb = &so->so_snd; break; + case EVFILT_CHAR: + kn->kn_fop = &sochar_filtops; + sb = &so->so_rcv; + break; default: return (1); } @@ -1644,4 +1651,51 @@ kn->kn_data = so->so_qlen - so->so_incqlen; return (! TAILQ_EMPTY(&so->so_comp)); +} + +/*ARGSUSED*/ +static int +filt_sochar(struct knote *kn, long hint) +{ + struct socket *so = (struct socket *)kn->kn_fp->f_data; + struct mbuf *mb, *nmb; + intptr_t limit, offset; + u_char ch, *buf; + int i; + + if (so->so_rcv.sb_mb == NULL || so->so_rcv.sb_cc == 0) + return (0); + + ch = kn->kn_sfflags; + limit = kn->kn_sdata; + + mb = so->so_rcv.sb_mb; + nmb = mb->m_nextpkt; + offset = 0; + while (mb != NULL) { + buf = mtod(mb, u_char *); + for (i = 0; i < mb->m_len; ++i) { + if (limit > 0 && ++offset > limit) { + kn->kn_data = -1; + return (1); + } + if (buf[i] == ch) { + kn->kn_data = offset; + return (1); + } + } + + if ((mb = mb->m_next) == NULL) { + if ((mb = nmb) != NULL) + nmb = mb->m_nextpkt; + } + } + + if (so->so_state & SS_CANTRCVMORE) { + kn->kn_flags |= EV_EOF; + kn->kn_fflags = so->so_error; + return (1); + } + + return (0); } --=-=-= Content-Type: text/plain; charset=iso-8859-1 Content-Disposition: attachment; filename=kchar.c Content-Transfer-Encoding: quoted-printable /*- * Copyright (c) 2001 Dag-Erling Co=EFdan Sm=F8rgrav * All rights reserved. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer * in this position and unchanged. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. The name of the author may not be used to endorse or promote products * derived from this software without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT, * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
 *
 * $FreeBSD$
 */

#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>

#include <err.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define MAXLINE 32

static int timeout = 60;

int
main(int argc, char *argv[])
{
	unsigned char line[MAXLINE];
	struct timeval now, deadline;
	struct timespec wait;
	struct kevent kev;
	int ifd, kq, len, n, o;

	ifd = STDIN_FILENO;
	while ((o = getopt(argc, argv, "t:")) != -1)
		switch (o) {
		case 't':
			timeout = atoi(optarg);
			break;
		default:
			errx(1, "Usage: kchar [-t timeout]");
		}

	if ((kq = kqueue()) == -1)
		err(1, "kqueue()");

	/* Wake up when '\n' arrives or MAXLINE characters are buffered. */
	EV_SET(&kev, ifd, EVFILT_CHAR, EV_ADD|EV_CLEAR, '\n', MAXLINE, 0);

	/* Register the event without waiting. */
	wait.tv_sec = 0;
	wait.tv_nsec = 0;
	if (kevent(kq, &kev, 1, NULL, 0, &wait) == -1)
		err(1, "kevent()");

	printf("Type some stuff:\n");
	fflush(stdout);

	gettimeofday(&now, NULL);
	deadline = now;
	deadline.tv_sec += timeout;

	for (;;) {
		/* Recompute the time left until the deadline. */
		gettimeofday(&now, NULL);
		wait.tv_sec = deadline.tv_sec - now.tv_sec;
		wait.tv_nsec = 1000 * (deadline.tv_usec - now.tv_usec);
		if (wait.tv_nsec < 0) {
			wait.tv_nsec += 1000000000;
			wait.tv_sec -= 1;
		}
		n = kevent(kq, NULL, 0, &kev, 1, &wait);
		printf("kevent() returns %d\n", n);
		if (n == 0) {
			warnx("timeout!");
			goto error;
		} else if (n < 0) {
			if (errno == EINTR)
				continue;
			warn("kevent()");
			goto error;
		}
		break;
	}
	close(kq);

	len = kev.data;
	if (len < 1) {
		printf("too many characters\n");
		goto error;
	} else if (len > 0) {
		printf("found at %d!\n", len);
		if (read(ifd, line, len) < 0)
			err(1, "read()");
		/* Replace the terminating character with a NUL. */
		line[--len] = '\0';
	}
	printf("[%s] %d 0x%04x\n", line, len, kev.flags);
 error:
	while (getchar() != EOF)
		/* nothing */ ;
	exit(0);
}

--=-=-=--

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message

From owner-freebsd-arch Sat Jun 30 6:15:55 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from mppsystems.com (mppsystems.com [208.210.148.205]) by hub.freebsd.org (Postfix) with ESMTP id 3849237B403; Sat, 30 Jun 2001 06:15:51 -0700 (PDT) (envelope-from mpp@mppsystems.com)
Received: (from
mpp@localhost) by mppsystems.com (8.11.3/8.11.3) id f5UDFo405810; Sat, 30 Jun 2001 08:15:50 -0500 (CDT) (envelope-from mpp)
Date: Sat, 30 Jun 2001 08:15:50 -0500
From: Mike Pritchard
To: Nik Clayton
Cc: arch@FreeBSD.ORG
Subject: Re: [PATCH] Show login(1) how to execute programs at start up
Message-ID: <20010630081550.B4689@mppsystems.com>
References: <20010619195223.E68877@clan.nothing-going-on.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <20010619195223.E68877@clan.nothing-going-on.org>; from nik@FreeBSD.ORG on Tue, Jun 19, 2001 at 07:52:23PM +0100
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

On Tue, Jun 19, 2001 at 07:52:23PM +0100, Nik Clayton wrote:
> I want to make the new 'tips' fortunes the default for new users at
> system startup.

Where is the little "click this box to not see this again" box?
None of the fortunes mention how to opt out of seeing these messages.

Many of the tips are very shell specific. What if the user isn't
running tcsh, but running sh instead? I think the tips need to be
more generic, or for shell specific tips, tell the user how to
determine what shell they are currently running.

How non-UNIX are we going on these tips? For example, there is one
tip about editors:

"The default editor in FreeBSD is vi, which is efficient to use when
you have learned it, but somewhat user-unfriendly. To use ee (an
easier but less powerful editor) instead, set the environment
variable EDITOR to /usr/bin/ee"

Under FreeBSD, typing "edit" runs "ee", no matter what my EDITOR
environment variable is set to. Under DOS, typing "edit" runs the
DOS version of "edit", which is about the same level of editing as
"ee" provides. Figure that a lot of people who used to run DOS, or
still do, will try just typing "edit [filename]".
My boss is one of them.

> During IRC discussion, a number of ways of doing this were knocked back
> and forth. Several people voiced the opinion that this should be
> selectable in some way when adding users, particularly when using either
> sysinstall or adduser.
>
[...]

> * Putting the functionality in to login.conf.
>
>   Now you can have classes of users who will or will not receive
>   tips at startup. Sysinstall and adduser already know enough to
>   ask you about classes when adding users. Provides additional
>   functionality over and above what we already have.

With this method, I now have to contact the system admin so I don't
see these tips on startup. As someone who has done sysadmin work,
I don't like this.

While I've been typing this message, I've been reading through the
freebsd-tips file. Some are more "user" oriented, and others are
more "sysadmin" oriented. For a sysadmin example:

"FreeBSD is started up by the program 'init'. The first thing init
does when starting multiuser mode (ie, starting the computer up for
normal use) is to run the shell script /etc/rc. By reading /etc/rc,
you can learn a lot about how the system is put together, which again
will make you more confident about what happens when you do something
with it."

I think the tips file needs to be split up somehow. Joe user doesn't
really need to see the above tip. It will probably confuse him:
"multiuser mode? init? /etc/rc? I just wanted to check my e-mail!"

A freebsd-tips-user and freebsd-tips-admin file would probably work.
An option in sysinstall to display tips every time I become "root"
(either via "su" or by just plain logging in as "root") would
probably be a good way to show the sysadmin-type tips.

As I mentioned above, I think some of these tips need to be a bit
more detailed. Example:

"To read a compressed file without having to first uncompress it,
use "zcat" or "zmore" to view it."

What is a compressed file? How uneducated are we assuming our
audience is?
I would probably add something like the following to the above tip:

"Compressed files usually have a file name that ends with ".gz" or
".z". Files that end with ".tgz" are usually compressed tar archives
(see "man tar" for information about the tar command, and "man gunzip"
for information on how to uncompress them)."

Another misleading tip is:

"Need to leave your terminal for a few minutes and don't want to
logout? Use "lock -p". When you return, use your password as the
key to unlock the terminal."

That only keeps people like my boss out.

There are many other examples of poorly stated tips I could list, but
I hope I got my point across -- before we start displaying these
messages (by whatever means) to users/sysadmins/aliens we need to do
a bit of editing.

Now if you read this far, you might think I'm against this. Not at
all. I just want to make sure that we are giving the right tips to
the correct audience, that we are giving them enough information
that, if they follow our tips, it will lead them to the correct
solution/information, and that we tell them how to opt out of our
advice.
-Mike -- Mike Pritchard mpp@FreeBSD.org or mpp@mppsystems.com To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Sat Jun 30 6:37:12 2001 Delivered-To: freebsd-arch@freebsd.org Received: from ringworld.nanolink.com (diskworld.nanolink.com [195.24.48.189]) by hub.freebsd.org (Postfix) with SMTP id 538F237B419 for ; Sat, 30 Jun 2001 06:36:51 -0700 (PDT) (envelope-from roam@ringworld.nanolink.com) Received: (qmail 879 invoked by uid 1000); 30 Jun 2001 13:41:19 -0000 Date: Sat, 30 Jun 2001 16:41:19 +0300 From: Peter Pentchev To: Mike Pritchard Cc: Nik Clayton , arch@FreeBSD.ORG Subject: Re: [PATCH] Show login(1) how to execute programs at start up Message-ID: <20010630164118.A507@ringworld.oblivion.bg> Mail-Followup-To: Mike Pritchard , Nik Clayton , arch@FreeBSD.ORG References: <20010619195223.E68877@clan.nothing-going-on.org> <20010630081550.B4689@mppsystems.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20010630081550.B4689@mppsystems.com>; from mpp@mppsystems.com on Sat, Jun 30, 2001 at 08:15:50AM -0500 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Sat, Jun 30, 2001 at 08:15:50AM -0500, Mike Pritchard wrote: > > During IRC discussion, a number of ways of doing this were knocked back > > and forth. Several people voiced the opinion that this should be > > selectable in some way when adding users, particularly when using either > > sysinstall or adduser. > > > [...] > > > * Putting the functionality in to login.conf. > > > > Now you can have classes of users who will or will not receive > > tips at startup. Sysinstall and adduser already know enough to > > ask you about classes when adding users. Provides additional > > functionality over and above what we already have. 
> > With this method, I now have to contact the system admin so I don't > see these tips on startup. As someone who has done sysadmin work, > I don't like this. Actually, no, you don't have to contact the sysadmin. All you need to do is put an appropriate override in your ~/.login.conf file. Of course, if this path were chosen, this override setting would have to be documented in the motd and/or the handbook. G'luck, Peter -- This sentence no verb. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Sat Jun 30 7:18:38 2001 Delivered-To: freebsd-arch@freebsd.org Received: from mppsystems.com (mppsystems.com [208.210.148.205]) by hub.freebsd.org (Postfix) with ESMTP id 161DB37B6E9; Sat, 30 Jun 2001 07:18:34 -0700 (PDT) (envelope-from mpp@mppsystems.com) Received: (from mpp@localhost) by mppsystems.com (8.11.3/8.11.3) id f5UEIX206214; Sat, 30 Jun 2001 09:18:33 -0500 (CDT) (envelope-from mpp) Date: Sat, 30 Jun 2001 09:18:33 -0500 From: Mike Pritchard To: Nik Clayton , arch@FreeBSD.ORG Subject: Re: [PATCH] Show login(1) how to execute programs at start up Message-ID: <20010630091833.A6069@mppsystems.com> References: <20010619195223.E68877@clan.nothing-going-on.org> <20010630081550.B4689@mppsystems.com> <20010630164118.A507@ringworld.oblivion.bg> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20010630164118.A507@ringworld.oblivion.bg>; from roam@orbitel.bg on Sat, Jun 30, 2001 at 04:41:19PM +0300 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Sat, Jun 30, 2001 at 04:41:19PM +0300, Peter Pentchev wrote: > On Sat, Jun 30, 2001 at 08:15:50AM -0500, Mike Pritchard wrote: > > > * Putting the functionality in to login.conf. 
> > >
> > >   Now you can have classes of users who will or will not receive
> > >   tips at startup. Sysinstall and adduser already know enough to
> > >   ask you about classes when adding users. Provides additional
> > >   functionality over and above what we already have.
> >
> > With this method, I now have to contact the system admin so I don't
> > see these tips on startup. As someone who has done sysadmin work,
> > I don't like this.
>
> Actually, no, you don't have to contact the sysadmin. All you need
> to do is put an appropriate override in your ~/.login.conf file.
> Of course, if this path were chosen, this override setting would
> have to be documented in the motd and/or the handbook.

Ah, I didn't know about that, but we still need to tell them how to
do this. Either each tip ends by naming a command they can run to
disable further tips, or each tip ends with a pointer to a tips(7)
man page that has all of the details.

I know I hate WinDoze programs that make me search through all of the
menus to find the option that turns off all those pop-up tip boxes,
so we should make it as easy as possible to turn these off.
-Mike -- Mike Pritchard mpp@FreeBSD.org or mpp@mppsystems.com To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Sat Jun 30 11:38:45 2001 Delivered-To: freebsd-arch@freebsd.org Received: from tasogare.imasy.or.jp (tasogare.imasy.or.jp [202.227.24.5]) by hub.freebsd.org (Postfix) with ESMTP id E091237B8C2; Sat, 30 Jun 2001 11:14:55 -0700 (PDT) (envelope-from iwasaki@jp.FreeBSD.org) Received: from localhost (iwasaki.imasy.or.jp [202.227.24.92]) by tasogare.imasy.or.jp (8.11.3+3.4W/8.11.3/tasogare/smtpfeed 1.12) with ESMTP/inet id f5UIElI42332; Sun, 1 Jul 2001 03:14:47 +0900 (JST) (envelope-from iwasaki@jp.FreeBSD.org) To: dfr@nlsystems.com Cc: iwasaki@jp.FreeBSD.org, arch@freebsd.org, audit@freebsd.org, athlete@kta.att.ne.jp Subject: Re: CFR: Crusoe LongRun Support In-Reply-To: References: <20010630041951I.iwasaki@jp.FreeBSD.org> X-Mailer: Mew version 1.94.1 on Emacs 19.34 / Mule 2.3 (SUETSUMUHANA) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-Id: <20010701031447S.iwasaki@jp.FreeBSD.org> Date: Sun, 01 Jul 2001 03:14:47 +0900 From: Mitsuru IWASAKI X-Dispatcher: imput version 20000228(IM140) Lines: 80 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Hi, > I would like to see something like longrun(4) which described the sysctls > and indicated what each longrun level actually means. OK, understood. I've written longrun(4) manpage. As always my English is poor, any feedback is welcome :-) Thanks .\" Copyright (c) 2001 Tamotsu HATTORI .\" Copyright (c) 2001 Mitsuru IWASAKI .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. 
Redistributions of source code must retain the above copyright
.\"    notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\"    notice, this list of conditions and the following disclaimer in the
.\"    documentation and/or other materials provided with the distribution.
.\"
.\" THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $FreeBSD$
.Dd Jun 30, 2001
.Dt LONGRUN 4 i386
.Os
.Sh NAME
.Nm longrun
.Nd Transmeta(TM) Crusoe(TM) LongRun(TM) support
.Sh SYNOPSIS
The following
.Xr sysctl 8
MIBs are available:
.Bl -tag -width "hw.crusoe.percentage integer no " -compact
.It Sy "Name Type Changeable Description
.It "hw.crusoe.longrun integer yes LongRun mode.
.Bl -tag -width "0: minimum frequency mode" -compact
.It "0: minimum frequency mode
.It "1: power-saving mode
.It "2: performance mode
.It "3: maximum frequency mode
.El
.It "hw.crusoe.frequency integer no Current frequency (MHz).
.It "hw.crusoe.voltage integer no Current voltage (mV).
.It "hw.crusoe.percentage integer no Processing performance (%).
.El
.Pp
.Sh EXAMPLES
To get current status:
.Bd -literal -offset indent
% sysctl hw.crusoe
.Ed
.Pp
To set LongRun mode to performance oriented variable frequency mode:
.Bd -literal -offset indent
# sysctl -w hw.crusoe.longrun=2
.Ed
.Pp
.Sh AUTHORS
.An -nosplit
LongRun support and this manual page were written by
.An Tamotsu HATTORI Aq athlete@kta.att.ne.jp
and
.An Mitsuru IWASAKI Aq iwasaki@FreeBSD.org .
.Sh HISTORY
Transmeta(TM) Crusoe(TM) LongRun(TM) support first appeared in
.Fx 5.0 .

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message

From owner-freebsd-arch Sat Jun 30 13:27:18 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from molly.straylight.com (molly.straylight.com [204.69.232.69])
    by hub.freebsd.org (Postfix) with ESMTP
    id E79F137B403; Sat, 30 Jun 2001 13:27:13 -0700 (PDT)
    (envelope-from jonathan@graehl.org)
Received: from case (case.straylight.com [64.172.254.109])
    by molly.straylight.com (8.11.4/8.11.4) with ESMTP id f5UKR7427646;
    Sat, 30 Jun 2001 13:27:08 -0700
From: "Jonathan Graehl"
To: "'Dag-Erling Smorgrav'"
Cc: ,
Subject: RE: New kqueue filter
Date: Sat, 30 Jun 2001 13:29:31 -0700
Message-ID: <000301c101a3$62b04220$6dfeac40@straylight.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook, Build 10.0.2605
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2462.0000
In-Reply-To:
Importance: Normal
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

Jonathan Lemon can speak to this better than I, but I believe this is
because kevents are supposed to be level and not edge-sensitive (that
is, "there is data waiting", not "data became available").
The filter would run the first time to place the event in the kqueue
tentatively, and then (some time later, when some other process could
have removed the event) when the user retrieves events with kevent, the
filters run again to verify that the condition still exists.

Perhaps it is possible to specify for the kqueue framework code whether
an event should be retested (that is, whether it is edge- or
level-sensitive). Admittedly, you still have to check for failure when
acting on the verified events, because there would be a brief period of
time between kevent and your action where a process could remove the
condition (e.g. read all the data from the socket buffer) before you
can.

Perhaps in the future, kqueue will be streamlined and special-cased for
optimum efficiency. For now, it is so far ahead of select/poll (and
async I/O with queued POSIX real-time signals) that nobody has bothered -
and perhaps kqueue is simply not used by that many of the mainstream
programs, which, with their focus on portability, neglect kqueue because
no equivalent facility is available on Linux/Solaris/whatever.

I haven't actually examined much kernel code at all, but I did read a
paper on the kqueue design and debugged a connect()
kqueue/socket-error-notification bug. Feel free to correct me ;)

http://www.google.com/search?q=cache:Mh5ixqePXvM:people.freebsd.org/~jlemon/kqueue.pdf+jlemon+kqueue+paper&hl=en

(text version of http://people.freebsd.org/~jlemon/kqueue.pdf)

(there may be a more recent paper, that was just my best guess through
google.com)

--
Jonathan Graehl
http://jonathan.graehl.org/

> The only problem I've experienced with this patch is that it
> seems the filter always runs twice (even when it succeeds the
> first time, though it only returns to userland the second
> time). This is possibly a bug in the kqueue framework code.
> > DES > -- > Dag-Erling Smorgrav - des@ofug.org > > To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Sat Jun 30 16:27:28 2001 Delivered-To: freebsd-arch@freebsd.org Received: from flood.ping.uio.no (flood.ping.uio.no [129.240.78.31]) by hub.freebsd.org (Postfix) with ESMTP id C3E6737B406; Sat, 30 Jun 2001 16:27:24 -0700 (PDT) (envelope-from des@ofug.org) Received: (from des@localhost) by flood.ping.uio.no (8.9.3/8.9.3) id BAA13888; Sun, 1 Jul 2001 01:27:20 +0200 (CEST) (envelope-from des@ofug.org) X-URL: http://www.ofug.org/~des/ X-Disclaimer: The views expressed in this message do not necessarily coincide with those of any organisation or company with which I am or have been affiliated. To: "Jonathan Graehl" Cc: , Subject: Re: New kqueue filter References: <000301c101a3$62b04220$6dfeac40@straylight.com> From: Dag-Erling Smorgrav Date: 01 Jul 2001 01:27:19 +0200 In-Reply-To: <000301c101a3$62b04220$6dfeac40@straylight.com> Message-ID: Lines: 16 User-Agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/20.7 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG "Jonathan Graehl" writes: > Jonathan Lemon can speak to this better than I, but I believe this is > because kevents are supposed to be level and not edge-sensitive (that > is, "there is data waiting", not "data became available"). The filter > would run the first time to place the event in the kqueue tentatively, > and then (some time later, when some other process could have removed > the event) when the user retrieves events with kevent, the filters run > again to verify that the condition still exists. 

No, the filter function isn't called at all until I actually
type something on the tty, then it runs (and succeeds) twice
in a row before the userland kqueue() call returns.

DES
--
Dag-Erling Smorgrav - des@ofug.org

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message

From owner-freebsd-arch Sat Jun 30 18:30:39 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from molly.straylight.com (molly.straylight.com [204.69.232.69])
    by hub.freebsd.org (Postfix) with ESMTP id 6EE6737B408
    for ; Sat, 30 Jun 2001 18:30:36 -0700 (PDT)
    (envelope-from jonathan@graehl.org)
Received: from case (case.straylight.com [64.172.254.109])
    by molly.straylight.com (8.11.4/8.11.4) with ESMTP id f611UY429372;
    Sat, 30 Jun 2001 18:30:34 -0700
From: "Jonathan Graehl"
To: "'Dag-Erling Smorgrav'"
Cc:
Subject: RE: New kqueue filter
Date: Sat, 30 Jun 2001 18:32:56 -0700
Message-ID: <000001c101cd$c25728e0$6dfeac40@straylight.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook, Build 10.0.2605
In-Reply-To:
Importance: Normal
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2462.0000
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

That's what I meant -

1) condition is satisfied, event is added to kqueue active list
2) Some more stuff happens, after which we might not want the event to
   be returned to us any more (e.g. the data is read by some other
   process)
3) user process is woken up in kevent; kernel rescans active list to
   remove no-longer-active events

See http://people.freebsd.org/~jlemon/kqueue_slides/sld024.htm and
http://people.freebsd.org/~jlemon/kqueue_slides/sld025.htm

I think the second call is debatable as it doesn't necessarily save you
from races - but who knows; in some cases it may
offer better correctness or performance. > -----Original Message----- > From: owner-freebsd-arch@FreeBSD.ORG > [mailto:owner-freebsd-arch@FreeBSD.ORG] On Behalf Of > Dag-Erling Smorgrav > Sent: Saturday, June 30, 2001 4:27 PM > To: Jonathan Graehl > Cc: arch@FreeBSD.ORG; jlemon@FreeBSD.ORG > Subject: Re: New kqueue filter > > > "Jonathan Graehl" writes: > > Jonathan Lemon can speak to this better than I, but I > believe this is > > because kevents are supposed to be level and not > edge-sensitive (that > > is, "there is data waiting", not "data became available"). > The filter > > would run the first time to place the event in the kqueue > tentatively, > > and then (some time later, when some other process could > have removed > > the event) when the user retrieves events with kevent, the > filters run > > again to verify that the condition still exists. > > No, the filter function isn't called at all until I actually > type something on the tty, then it runs (and succeeds) twice > in a row before the userland kqueue() call returns. > > DES > -- > Dag-Erling Smorgrav - des@ofug.org > > To Unsubscribe: send mail to majordomo@FreeBSD.org > with "unsubscribe freebsd-arch" in the body of the message > To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message
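
Editorial sketch of steps 1) to 3) from the message above. This is a
toy model in plain C, not the FreeBSD kqueue code: every name in it
(toy_source, toy_note, toy_filter, toy_activate, toy_scan, toy_demo) is
hypothetical and made up for illustration. It only shows why a
level-sensitive filter would plausibly run twice for an event that is
delivered: once when the condition first becomes true, and once more
when the active list is rescanned before returning to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a kernel object such as a tty or socket buffer. */
struct toy_source {
    int bytes_ready;            /* data currently waiting to be read */
};

/* Hypothetical stand-in for one registered event. */
struct toy_note {
    struct toy_source *src;
    bool queued;                /* is the note on the active list? */
    int filter_calls;           /* how many times the filter has run */
};

/* The filter reports whether the condition holds right now (level, not edge). */
bool toy_filter(struct toy_note *n)
{
    n->filter_calls++;
    return n->src->bytes_ready > 0;
}

/* Step 1: activity on the source runs the filter and queues the note. */
void toy_activate(struct toy_note *n)
{
    if (toy_filter(n))
        n->queued = true;
}

/*
 * Step 3: on retrieval, re-run the filter for every queued note and drop
 * the ones whose condition no longer holds. Returns the number delivered.
 */
size_t toy_scan(struct toy_note **notes, size_t count)
{
    size_t delivered = 0;

    for (size_t i = 0; i < count; i++) {
        struct toy_note *n = notes[i];

        if (!n->queued)
            continue;
        if (toy_filter(n))
            delivered++;
        else
            n->queued = false;  /* stale: condition went away in step 2 */
    }
    return delivered;
}

/* Walks through steps 1-3 for two sources; returns the number delivered. */
int toy_demo(void)
{
    struct toy_source tty = { .bytes_ready = 4 };
    struct toy_source sock = { .bytes_ready = 8 };
    struct toy_note a = { .src = &tty }, b = { .src = &sock };
    struct toy_note *notes[] = { &a, &b };

    toy_activate(&a);                   /* step 1 */
    toy_activate(&b);
    assert(a.queued && b.queued);

    sock.bytes_ready = 0;               /* step 2: someone else drains it */

    size_t n = toy_scan(notes, 2);      /* step 3 */
    printf("delivered=%zu tty_filter_calls=%d sock_filter_calls=%d\n",
        n, a.filter_calls, b.filter_calls);
    return (int)n;
}
```

Calling toy_demo() from a main() delivers only the tty note, and each
note's filter has run twice by then. The rescan in step 3 does not close
the race described earlier in the thread: the source can still be
drained between toy_scan() returning and the caller acting on the
result.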