From owner-freebsd-arch@FreeBSD.ORG Sun Sep 17 21:04:35 2006 Return-Path: X-Original-To: freebsd-arch@FreeBSD.org Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1FE4616A415; Sun, 17 Sep 2006 21:04:35 +0000 (UTC) (envelope-from jmg@hydrogen.funkthat.com) Received: from hydrogen.funkthat.com (gate.funkthat.com [69.17.45.168]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6252643D70; Sun, 17 Sep 2006 21:04:27 +0000 (GMT) (envelope-from jmg@hydrogen.funkthat.com) Received: from hydrogen.funkthat.com (45tjhvivp3m82nrj@localhost.funkthat.com [127.0.0.1]) by hydrogen.funkthat.com (8.13.6/8.13.3) with ESMTP id k8HL4QBu090411; Sun, 17 Sep 2006 14:04:27 -0700 (PDT) (envelope-from jmg@hydrogen.funkthat.com) Received: (from jmg@localhost) by hydrogen.funkthat.com (8.13.6/8.13.3/Submit) id k8HL4QG3090410; Sun, 17 Sep 2006 14:04:26 -0700 (PDT) (envelope-from jmg) Date: Sun, 17 Sep 2006 14:04:26 -0700 From: John-Mark Gurney To: freebsd-arch@FreeBSD.org Message-ID: <20060917210426.GI9421@funkthat.com> Mail-Followup-To: freebsd-arch@FreeBSD.org, freebsd-current@FreeBSD.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.1i X-Operating-System: FreeBSD 5.4-RELEASE-p6 i386 X-PGP-Fingerprint: B7 EC EF F8 AE ED A7 31 96 7A 22 B3 D8 56 36 F4 X-Files: The truth is out there X-URL: http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html Cc: freebsd-current@FreeBSD.org Subject: kqueue disable on delivery... X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: John-Mark Gurney List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 17 Sep 2006 21:04:35 -0000 I have implemented a couple additional features to kqueue. 
These allow kqueue to be a multithreaded event delivery system that can guarantee that the event will only be active in one thread at any time.

The first is EV_DOD, aka disable on delivery. When the event is delivered to userland, the knote is marked disabled so we don't have to go through the expense of reallocating the knote each time. (Reallocation of the knote is also lock intensive, and disabling is cheap.)

Even though this means that the event will only ever be active in one thread at a time (when you're done handling the event, you reenable it), removing the event from the queue outside the event handler (say, from a timeout handler for the connection) poses a problem. If you simply close the socket, the event disappears, but then there is a race between another event being created with the same socket and notification of the handler that you want the event to stop.

In order to handle that situation, I have come up w/ EV_FORCEOS, aka FORCE ONE_SHOT. EV_ONESHOT events have the advantage that once queued, they don't care if they have been activated or not; they will be returned the next round. This means that the timeout handler can safely set EV_FORCEOS on the handler, and whether it's _DISABLED (handler running and will reenable it) or _ENABLED, it will get dispatched, allowing the handler to detect the EV_FORCEOS flag and tear down the connection.

I have expanded a custom web server to make use of these features and so far it's been stable. I believe most of the bugs are concurrency bugs from going from a single-threaded kqueue design to allowing multiple threads access to some of the data structures.

The patch is available at: http://people.FreeBSD.org/~jmg/kqueue.dod.patch

As part of the patch, I added stricter checking of the flags, so that unknown flags will return EINVAL.

Comments? Suggestions?

-- 
John-Mark Gurney Voice: +1 415 225 5579
"All that I will do, has been done, All that I have, has not."
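The delivery rules described in the message above can be sketched as a small user-space model. This is only an illustration of the described semantics, not code from the patch: the M_* flag values, struct model_knote, and the model_* functions are hypothetical names made up for the sketch, and it assumes that an enabled event is handed out when it is either active or marked EV_FORCEOS. It does not model removal of the knote after a forced delivery.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag bits for this model only (hypothetical values, not from the patch). */
#define M_ENABLED (1u << 0) /* event may be handed to a thread */
#define M_ACTIVE  (1u << 1) /* the watched condition has fired */
#define M_DOD     (1u << 2) /* stands in for EV_DOD */
#define M_FORCEOS (1u << 3) /* stands in for EV_FORCEOS */

struct model_knote {
	unsigned flags;
};

/*
 * Model of one thread asking for the event.  Returns true and reports the
 * flags seen at delivery time if the event is handed out.
 */
static bool
model_deliver(struct model_knote *kn, unsigned *seen)
{
	bool ready = (kn->flags & (M_ACTIVE | M_FORCEOS)) != 0;

	if ((kn->flags & M_ENABLED) == 0 || !ready)
		return false;
	*seen = kn->flags;
	kn->flags &= ~M_ACTIVE;
	if (kn->flags & M_DOD)
		kn->flags &= ~M_ENABLED; /* disable on delivery */
	return true;
}

/* The handler is done with the event and turns it back on. */
static void
model_reenable(struct model_knote *kn)
{
	kn->flags |= M_ENABLED;
}

/* A timeout handler asks for one more delivery, active or not. */
static void
model_force_oneshot(struct model_knote *kn)
{
	kn->flags |= M_FORCEOS;
}
```

In this model a second call to model_deliver() while the first handler is still running returns false, and a forced event is only handed out after model_reenable(), so the handler that sees M_FORCEOS is the only one touching the connection.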
From owner-freebsd-arch@FreeBSD.ORG Wed Sep 20 06:43:00 2006 Return-Path: X-Original-To: freebsd-arch@FreeBSD.org Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 571A716A415 for ; Wed, 20 Sep 2006 06:43:00 +0000 (UTC) (envelope-from jmg@hydrogen.funkthat.com) Received: from hydrogen.funkthat.com (gate.funkthat.com [69.17.45.168]) by mx1.FreeBSD.org (Postfix) with ESMTP id 77A9443D6B for ; Wed, 20 Sep 2006 06:42:59 +0000 (GMT) (envelope-from jmg@hydrogen.funkthat.com) Received: from hydrogen.funkthat.com (rnjwfhehgl2tmk2i@localhost.funkthat.com [127.0.0.1]) by hydrogen.funkthat.com (8.13.6/8.13.3) with ESMTP id k8K6gw2O052298 for ; Tue, 19 Sep 2006 23:42:59 -0700 (PDT) (envelope-from jmg@hydrogen.funkthat.com) Received: (from jmg@localhost) by hydrogen.funkthat.com (8.13.6/8.13.3/Submit) id k8K6gwNJ052297 for freebsd-arch@FreeBSD.org; Tue, 19 Sep 2006 23:42:58 -0700 (PDT) (envelope-from jmg) Date: Tue, 19 Sep 2006 23:42:58 -0700 From: John-Mark Gurney To: freebsd-arch@FreeBSD.org Message-ID: <20060920064258.GE23915@funkthat.com> Mail-Followup-To: freebsd-arch@FreeBSD.org Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.1i X-Operating-System: FreeBSD 5.4-RELEASE-p6 i386 X-PGP-Fingerprint: B7 EC EF F8 AE ED A7 31 96 7A 22 B3 D8 56 36 F4 X-Files: The truth is out there X-URL: http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html Cc: Subject: PCI VPD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: John-Mark Gurney List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 20 Sep 2006 06:43:00 -0000 I have made a patch that exposes VPD information to devices so that they don't need to have their own implementation. 
The patch is available at: http://people.FreeBSD.org/~jmg/vpd.sk.patch It also makes sk use the VPD interface instead of hard coding the VPD cap register and rolling its own code... This deals w/ two of the VPD capable devices that I have, one that has valid data (an sk card), and my DViCO FusionHDTV5 Lite card that has invalid VPD data... I haven't implemented writing to VPD data yet as I don't have a card that is capable. Comments? -- John-Mark Gurney Voice: +1 415 225 5579 "All that I will do, has been done, All that I have, has not." 
From owner-freebsd-arch@FreeBSD.ORG Fri Sep 22 10:31:11 2006 Return-Path: X-Original-To: arch@FreeBSD.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0AB4F16A47E for ; Fri, 22 Sep 2006 10:31:11 +0000 (UTC) (envelope-from rink@rink.nu) Received: from mx0.rink.nu (thunderstone.rink.nu [80.112.228.34]) by mx1.FreeBSD.org (Postfix) with ESMTP id 5CC5B43D60 for ; Fri, 22 Sep 2006 10:31:04 +0000 (GMT) (envelope-from rink@rink.nu) Received: from localhost (localhost [127.0.0.1]) by mx0.rink.nu (Postfix) with ESMTP id 109261706D; Fri, 22 Sep 2006 12:31:14 +0200 (CEST) X-Virus-Scanned: amavisd-new at rink.nu Received: from mx0.rink.nu ([127.0.0.1]) by localhost (thunderstone.rink.nu [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id mnGUSF0bwP+e; Fri, 22 Sep 2006 12:31:10 +0200 (CEST) Received: by mx0.rink.nu (Postfix, from userid 1000) id 929B11704E; Fri, 22 Sep 2006 12:31:10 +0200 (CEST) Date: Fri, 22 Sep 2006 12:31:10 +0200 From: Rink Springer To: arch@FreeBSD.org Message-ID: 
<20060922103110.GA4266@rink.nu> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="d6Gm4EdcadzBjdND" Content-Disposition: inline User-Agent: Mutt/1.5.11 Cc: roel@qsp.nl Subject: NFS+SUIDDIR problem X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Sep 2006 10:31:11 -0000 --d6Gm4EdcadzBjdND Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable

Hi everyone,

At work, we are having problems migrating a local filesystem (that was mounted using -o suiddir) to an NFS server, where the filesystem is also mounted using -o suiddir. This is on a 6.1-STABLE machine.

If a file has been created using, say, uid1, ufs/ufs/ufs_vnops.c:ufs_makeinode() will transform this to uid2 whenever needed, as desired.

However, the NFS server code nfsserver/nfs_serv.c:nfsrv_access_withgiant() will check whether the vnode's attributes match those of the user credentials (cred->cr_uid == vattr.va_uid). As the UFS driver just transformed uid1 to uid2, the check above does not hold (as vattr.va_uid == uid2 but cred->cr_uid == uid1), and thus access is incorrectly denied.

We've devised a patch which allows any write on a MNT_SUIDDIR mounted filesystem, as long as the UID is within a certain range (settable using sysctls).

However, even though this prevents our problems, is there a better solution to this problem (e.g. having the vnode remember that it was chowned and checking that field)? Or would it be best to request our patch to be committed?

Thanks,

-- 
Rink P.W. Springer - http://rink.nu
"When will the internet move from 64Kb max .com domains to .exe domains which can use much more memory?" 
- Edwin Groothuis --d6Gm4EdcadzBjdND Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.4 (FreeBSD) iD8DBQFFE7tub3O60uztv/8RAsY8AKCpQp2+GDtWyrYRb2HEjHnC9VA1ogCghKT2 veOwFcZj4B4KRCtM35+ql/s= =r2H1 -----END PGP SIGNATURE----- --d6Gm4EdcadzBjdND-- From owner-freebsd-arch@FreeBSD.ORG Fri Sep 22 12:28:08 2006 Return-Path: X-Original-To: arch@FreeBSD.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5E66416A407; Fri, 22 Sep 2006 12:28:08 +0000 (UTC) (envelope-from jilles@stack.nl) Received: from mx1.stack.nl (meestal.stack.nl [131.155.140.141]) by mx1.FreeBSD.org (Postfix) with ESMTP id 50F5B43D6B; Fri, 22 Sep 2006 12:28:05 +0000 (GMT) (envelope-from jilles@stack.nl) Received: from snail.stack.nl (snail.stack.nl [IPv6:2001:610:1108:5010::131]) by mx1.stack.nl (Postfix) with ESMTP id 9B8D94AF8B; Fri, 22 Sep 2006 14:28:04 +0200 (CEST) Received: by snail.stack.nl (Postfix, from userid 1677) id 691AC2288E; Fri, 22 Sep 2006 14:28:04 +0200 (CEST) Date: Fri, 22 Sep 2006 14:28:04 +0200 From: Jilles Tjoelker To: Rink Springer Message-ID: <20060922122804.GA2871@stack.nl> References: <20060922103110.GA4266@rink.nu> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20060922103110.GA4266@rink.nu> X-Operating-System: FreeBSD 5.5-RELEASE-p1 i386 User-Agent: Mutt/1.5.13 (2006-08-11) Cc: arch@FreeBSD.org, roel@qsp.nl Subject: Re: NFS+SUIDDIR problem X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Sep 2006 12:28:08 -0000 On Fri, Sep 22, 2006 at 12:31:10PM +0200, Rink Springer wrote: > However, the NFS server code > nfsserver/nfs_serv.c:nfsrv_access_withgiant() will check whether the > 
> vnode's attributes match those of the user credentials (cred->cr_uid ==
> vattr.va_uid). As the UFS driver just transformed uid1 to uid2, the
> check above does not hold (as vattr.va_uid == uid2 but cred->cr_uid ==
> uid1), and thus access is incorrectly denied.

Actually, that's not exactly what it does. It first checks if the file permissions allow the access; then, if they do not, the file owner UID may get access anyway (to accommodate software that opens a file and then chmods it in a way that will deny access).

> We've devised a patch which allows any write on a MNT_SUIDDIR mounted
> filesystem, as long as the UID is within a certain range (settable using
> sysctl's).
> However, even though this prevents our problems, is there a better
> solution to this problem (eg. having the vnode remember that it was
> chowned and checking that field)?. Or would it be best to request our
> patch to be committed?

Having the vnode remember that it was chowned will break if the server reboots or the vnode is removed from memory before the write is done. The fundamental problem is that NFSv2 and NFSv3 do not have the concept of an open file.

One (dirty) way would be to add an ACL entry for the creator of a SUIDDIR file. There is no clean way to get rid of the entry later, however. Also, this requires enabling ACLs on the filesystem, which you may not want.
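The two-step check described above (permission bits first, then an owner override) can be illustrated with a small user-space model. This is a sketch under simplifying assumptions, not the code in nfsserver/nfs_serv.c: the model_* functions are hypothetical, group permissions are ignored, and UIDs are plain unsigned integers.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified write-permission check: owner bit or "other" bit, no groups. */
static bool
model_mode_allows_write(unsigned mode, unsigned file_uid, unsigned cred_uid)
{
	if (cred_uid == file_uid)
		return (mode & 0200) != 0;
	return (mode & 0002) != 0;
}

/*
 * Model of the order described in the reply: try the permission bits
 * first, and only if they deny access fall back to "the caller owns the
 * file".
 */
static bool
model_nfs_write_allowed(unsigned mode, unsigned file_uid, unsigned cred_uid)
{
	if (model_mode_allows_write(mode, file_uid, cred_uid))
		return true;
	return cred_uid == file_uid;
}
```

With SUIDDIR the creating user (uid1) is not the owner recorded on the file (uid2), so for a mode such as 0644 both steps fail in this model, which matches the reported denial; an owner who has chmod-ed the file to 0444 still passes through the override.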
-- Jilles Tjoelker From owner-freebsd-arch@FreeBSD.ORG Fri Sep 22 13:25:23 2006 Return-Path: X-Original-To: freebsd-arch@FreeBSD.org Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6B9D416A49E; Fri, 22 Sep 2006 13:25:23 +0000 (UTC) (envelope-from is@rambler-co.ru) Received: from yam.park.rambler.ru (yam.park.rambler.ru [81.19.64.116]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6982943DB9; Fri, 22 Sep 2006 13:25:19 +0000 (GMT) (envelope-from is@rambler-co.ru) Received: from is.park.rambler.ru (is.park.rambler.ru [81.19.64.102]) by yam.park.rambler.ru (8.13.6/8.13.3) with ESMTP id k8MDPH6W078617; Fri, 22 Sep 2006 17:25:17 +0400 (MSD) (envelope-from is@rambler-co.ru) Date: Fri, 22 Sep 2006 17:25:37 +0400 (MSD) From: Igor Sysoev X-X-Sender: is@is.park.rambler.ru To: John-Mark Gurney In-Reply-To: <20060917210426.GI9421@funkthat.com> Message-ID: <20060922171542.G17859@is.park.rambler.ru> References: <20060917210426.GI9421@funkthat.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-current@FreeBSD.org, freebsd-arch@FreeBSD.org Subject: Re: kqueue disable on delivery... X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Sep 2006 13:25:23 -0000 On Sun, 17 Sep 2006, John-Mark Gurney wrote: > I have implemented a couple additional features to kqueue. These allow > kqueue to be a multithreaded event delivery system that can guarantee > that the event will only be active in one thread at any time. > > The first is EV_DOD, aka disable on delivery. When the event will be > delivered to userland, the knote is marked disabled so we don't > have to go through the expense of reallocing the knote each time. 
> (Reallocation of the knote is also lock intensive, and disabling is
> cheap.)

In my opinion, this flag is too implementation-specific.

> Even though this means that the event will only ever be active in a
> thread at a time, (when you're done handling the event, you reenable
> it), removing the event from the queue outside the event handler (say
> a timeout handler for the connection) poses to be a problem. If you
> simply close the socket, the event disappears, but then there is a
> race between another event being created with the same socket, and
> notification of the handler that you want the event to stop.
>
> In order to handle that situation, I have come up w/ EV_FORCEOS, aka
> FORCE ONE_SHOT. EV_ONESHOT events have the advantage that once queued,
> they don't care if they have been activated or not, they will be returned
> the next round. This means that the timeout handler can safely set
> EV_FORCEOS on the handler, and either if it's _DISABLED (handler running
> and will reenable it), or it's _ENABLED, it will get dispatched, allowing
> the handler to detect the EV_FORCEOS flag and teardown the connection.

I think it should be an EVFILT_USER event, allowing

    EV_SET(&kev, fd, EVFILT_USER, 0, 0, 0, udata);

and the event should automatically set the EV_ONESHOT flag internally.
Igor Sysoev http://sysoev.ru/en/ From owner-freebsd-arch@FreeBSD.ORG Fri Sep 22 16:58:51 2006 Return-Path: X-Original-To: freebsd-arch@FreeBSD.org Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 180FF16A403; Fri, 22 Sep 2006 16:58:51 +0000 (UTC) (envelope-from jmg@hydrogen.funkthat.com) Received: from hydrogen.funkthat.com (gate.funkthat.com [69.17.45.168]) by mx1.FreeBSD.org (Postfix) with ESMTP id A209D43D53; Fri, 22 Sep 2006 16:58:50 +0000 (GMT) (envelope-from jmg@hydrogen.funkthat.com) Received: from hydrogen.funkthat.com (eln3rdzzyu8mwswk@localhost.funkthat.com [127.0.0.1]) by hydrogen.funkthat.com (8.13.6/8.13.3) with ESMTP id k8MGwo0Z014737; Fri, 22 Sep 2006 09:58:50 -0700 (PDT) (envelope-from jmg@hydrogen.funkthat.com) Received: (from jmg@localhost) by hydrogen.funkthat.com (8.13.6/8.13.3/Submit) id k8MGwn8F014736; Fri, 22 Sep 2006 09:58:49 -0700 (PDT) (envelope-from jmg) Date: Fri, 22 Sep 2006 09:58:49 -0700 From: John-Mark Gurney To: Igor Sysoev Message-ID: <20060922165848.GS23915@funkthat.com> Mail-Followup-To: Igor Sysoev , freebsd-arch@FreeBSD.org, freebsd-current@FreeBSD.org References: <20060917210426.GI9421@funkthat.com> <20060922171542.G17859@is.park.rambler.ru> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20060922171542.G17859@is.park.rambler.ru> User-Agent: Mutt/1.4.2.1i X-Operating-System: FreeBSD 5.4-RELEASE-p6 i386 X-PGP-Fingerprint: B7 EC EF F8 AE ED A7 31 96 7A 22 B3 D8 56 36 F4 X-Files: The truth is out there X-URL: http://resnet.uoregon.edu/~gurney_j/ X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html Cc: freebsd-current@FreeBSD.org, freebsd-arch@FreeBSD.org Subject: Re: kqueue disable on delivery... 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: John-Mark Gurney List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 22 Sep 2006 16:58:51 -0000

Igor Sysoev wrote this message on Fri, Sep 22, 2006 at 17:25 +0400:
> On Sun, 17 Sep 2006, John-Mark Gurney wrote:
>
> >I have implemented a couple additional features to kqueue. These allow
> >kqueue to be a multithreaded event delivery system that can guarantee
> >that the event will only be active in one thread at any time.
> >
> >The first is EV_DOD, aka disable on delivery. When the event will be
> >delivered to userland, the knote is marked disabled so we don't
> >have to go through the expense of reallocing the knote each time.
> >(Reallocation of the knote is also lock intensive, and disabling is
> >cheap.)
>
> In my opinion, it's too implementation specific flag.

How else are you going to solve having multiple threads servicing the same queue at the same time? Also, Apple is planning on having a similar flag to EV_DOD, but I don't know what they are naming it.. I've tried for a while to find out, but haven't been able to...

> >Even though this means that the event will only ever be active in a
> >thread at a time, (when you're done handling the event, you reenable
> >it), removing the event from the queue outside the event handler (say
> >a timeout handler for the connection) poses to be a problem. If you
> >simply close the socket, the event disappears, but then there is a
> >race between another event being created with the same socket, and
> >notification of the handler that you want the event to stop.
> >
> >In order to handle that situation, I have come up w/ EV_FORCEOS, aka
> >FORCE ONE_SHOT. EV_ONESHOT events have the advantage that once queued,
> >they don't care if they have been activated or not, they will be returned
> >the next round.
> >This means that the timeout handler can safely set
> >EV_FORCEOS on the handler, and either if it's _DISABLED (handler running
> >and will reenable it), or it's _ENABLED, it will get dispatched, allowing
> >the handler to detect the EV_FORCEOS flag and teardown the connection.
>
> I think it should be EVFILT_USER event, allowing to
> EV_SET(&kev, fd, EVFILT_USER, 0, 0, 0, udata);
> and the event should automatically sets the EV_ONESHOT flag internally.

I'll agree EV_FORCEOS is open for discussion, but you did see how much code it adds, right? I was surprised at how small the patch was for the additional functionality.

What happens if you are in the process of tearing down udata when this happens, but you haven't gotten far enough to drop it? Then you'd have to deal w/ possible lock inversions between the timeout list and your object lock, and deal w/ flags on the object and ref counts.

With _DOD and _FORCEOS, you can continue to avoid special state flags, locks, and reference counting on the objects serviced by kqueue...

I wrote this code in anticipation of supporting sun4v boxes where it'd be useful to have 32 threads (or more) servicing a single kqueue...

-- 
John-Mark Gurney Voice: +1 415 225 5579
"All that I will do, has been done, All that I have, has not."
From owner-freebsd-arch@FreeBSD.ORG Sat Sep 23 07:40:22 2006 Return-Path: X-Original-To: freebsd-arch@FreeBSD.org Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 56F7016A4D1; Sat, 23 Sep 2006 07:40:22 +0000 (UTC) (envelope-from is@rambler-co.ru) Received: from yam.park.rambler.ru (yam.park.rambler.ru [81.19.64.116]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4B55943D64; Sat, 23 Sep 2006 07:40:15 +0000 (GMT) (envelope-from is@rambler-co.ru) Received: from is.park.rambler.ru (is.park.rambler.ru [81.19.64.102]) by yam.park.rambler.ru (8.13.6/8.13.3) with ESMTP id k8N7eDIm055198; Sat, 23 Sep 2006 11:40:13 +0400 (MSD) (envelope-from is@rambler-co.ru) Date: Sat, 23 Sep 2006 11:40:33 +0400 (MSD) From: Igor Sysoev X-X-Sender: is@is.park.rambler.ru To: John-Mark Gurney In-Reply-To: <20060922165848.GS23915@funkthat.com> Message-ID: <20060923105426.B20782@is.park.rambler.ru> References: <20060917210426.GI9421@funkthat.com> <20060922171542.G17859@is.park.rambler.ru> <20060922165848.GS23915@funkthat.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-current@FreeBSD.org, freebsd-arch@FreeBSD.org Subject: Re: kqueue disable on delivery... X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Sep 2006 07:40:23 -0000 On Fri, 22 Sep 2006, John-Mark Gurney wrote: > Igor Sysoev wrote this message on Fri, Sep 22, 2006 at 17:25 +0400: >> On Sun, 17 Sep 2006, John-Mark Gurney wrote: >> >>> I have implemented a couple additional features to kqueue. These allow >>> kqueue to be a multithreaded event delivery system that can guarantee >>> that the event will only be active in one thread at any time. >>> >>> The first is EV_DOD, aka disable on delivery. 
>>> When the event will be
>>> delivered to userland, the knote is marked disabled so we don't
>>> have to go through the expense of reallocing the knote each time.
>>> (Reallocation of the knote is also lock intensive, and disabling is
>>> cheap.)
>>
>> In my opinion, it's too implementation specific flag.
>
> How else are you going to solve having multiple threads servicing
> the same queue at the same time? Also, Apple is planning on having
> a similar flag to EV_DOD, but I don't know what they are naming it..
> I've tried for a while to find out, but haven't been able to...

As I understand it, EV_DOD or EV_CLEAR|EV_DOD is like a simple EV_ONESHOT, except that the filter is not deleted on delivery but disabled, which skips some in-kernel lock overhead. That's why I called it too implementation-specific.

Yes, EV_CLEAR|EV_DOD guarantees that the event will be active in only one thread at any time. But in my practice I have seen that it is necessary to guarantee that the socket (both events, EVFILT_READ and EVFILT_WRITE) will be active in only one thread at any time. That seems to be the reason why the heavily threaded Solaris 10 event ports use a oneshot-only model where a socket is deleted from the port on delivery.

>>> Even though this means that the event will only ever be active in a
>>> thread at a time, (when you're done handling the event, you reenable
>>> it), removing the event from the queue outside the event handler (say
>>> a timeout handler for the connection) poses to be a problem. If you
>>> simply close the socket, the event disappears, but then there is a
>>> race between another event being created with the same socket, and
>>> notification of the handler that you want the event to stop.
>>>
>>> In order to handle that situation, I have come up w/ EV_FORCEOS, aka
>>> FORCE ONE_SHOT. EV_ONESHOT events have the advantage that once queued,
>>> they don't care if they have been activated or not, they will be returned
>>> the next round.
>>> This means that the timeout handler can safely set
>>> EV_FORCEOS on the handler, and either if it's _DISABLED (handler running
>>> and will reenable it), or it's _ENABLED, it will get dispatched, allowing
>>> the handler to detect the EV_FORCEOS flag and teardown the connection.
>>
>> I think it should be EVFILT_USER event, allowing to
>> EV_SET(&kev, fd, EVFILT_USER, 0, 0, 0, udata);
>> and the event should automatically sets the EV_ONESHOT flag internally.
>
> I'll agree EV_FORCEOS is open for discussion, but you did see how much
> code it adds right? I was surprised at how small the patch was for the
> additional functionality..

Yes, EV_FORCEOS is a small patch. However, EVFILT_USER is more generic (by the way, Solaris 10 event ports allow sending a user-specific PORT_SOURCE_USER notification).

Two years ago I was implementing threads for my server nginx on FreeBSD 4.x, using rfork(). In the absence of EVFILT_USER I built condition variables using kill() and EVFILT_SIGNAL, and this user-level code may panic the kernel.

> What happens if you are in the process of tearing down udata when
> this happens, but you haven't gotten far enough to drop it? Then
> you'd have to deal w/ possible lock inversions between the timeout
> list and your object lock, deal w/ flags on the object and ref counts..
>
> With _DOD and _FORCEOS, you are able to continue to not require special
> state flags, locks nor reference counting on your objects serviced by
> kqueue...
>
> I wrote this code in anticipation of supporting sun4v boxes where it'd
> be useful to have 32 threads (or more) servicing a single kqueue...

You still need user locks to guarantee that the socket will be active in only one thread at any time. In proxy mode you still need locks to guarantee that the two sockets will be active in only one thread. If you assemble your response from several proxied servers, then you need locks to guarantee that all these sockets will be active in only one thread.
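One way to picture the per-connection guarantee asked for above is a claim flag shared by every socket that belongs to the same connection. The following is a minimal sketch using C11 atomics rather than a full lock; struct model_conn and the model_conn_* functions are hypothetical names for illustration, not part of nginx, kqueue, or the patch under discussion.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* One proxied connection: both sockets share a single claim flag. */
struct model_conn {
	atomic_flag busy;	/* set while some thread handles the connection */
	int client_fd;
	int upstream_fd;
};

static void
model_conn_init(struct model_conn *c, int client_fd, int upstream_fd)
{
	atomic_flag_clear(&c->busy);
	c->client_fd = client_fd;
	c->upstream_fd = upstream_fd;
}

/* Returns true if the calling thread now owns the whole connection. */
static bool
model_conn_try_claim(struct model_conn *c)
{
	return !atomic_flag_test_and_set(&c->busy);
}

/* Called by the owning thread when it has finished with the connection. */
static void
model_conn_release(struct model_conn *c)
{
	atomic_flag_clear(&c->busy);
}
```

A thread that receives an event for either descriptor would call model_conn_try_claim() first. What to do when the claim fails (for example, recording the event for the current owner to pick up) is left out of this sketch, and is part of the bookkeeping the discussion is about.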
Igor Sysoev http://sysoev.ru/en/ From owner-freebsd-arch@FreeBSD.ORG Sat Sep 23 09:26:19 2006 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C9F5E16A415; Sat, 23 Sep 2006 09:26:19 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42]) by mx1.FreeBSD.org (Postfix) with ESMTP id C5A8443D5C; Sat, 23 Sep 2006 09:26:17 +0000 (GMT) (envelope-from rwatson@FreeBSD.org) Received: from fledge.watson.org (fledge.watson.org [209.31.154.41]) by cyrus.watson.org (Postfix) with ESMTP id 45A8446DA8; Sat, 23 Sep 2006 05:26:17 -0400 (EDT) Date: Sat, 23 Sep 2006 10:26:17 +0100 (BST) From: Robert Watson X-X-Sender: robert@fledge.watson.org To: Max Laier In-Reply-To: <200609140253.06818.max@love2party.net> Message-ID: <20060923102438.N6562@fledge.watson.org> References: <20060913150912.J1823@fledge.watson.org> <200609140253.06818.max@love2party.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: trustedbsd-discuss@trustedbsd.org, freebsd-arch@freebsd.org Subject: Re: New in-kernel privilege API: priv(9) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Sep 2006 09:26:19 -0000 On Thu, 14 Sep 2006, Max Laier wrote: > Right now, prison_priv_check() is looking rather scary to me. If something > else wants to decide on finer granularity, alright, but in my opinion it's > easier (more obvious) to keep the "normal" information in the .h file where > the privileges are defined and described - as we are aiming for > centralization of the decision and information. 
On top of that the caller > could mask off ALLOW_IN_JAIL if they think it's not appropriate in a special > use case of the privilege. The attached version of the kern_jail.c diff removes all the extra commented out privileges that aren't granted, and were largely there as development scaffolding to make sure I considered all privileges. Does this seem a bit less scary? Robert N M Watson Computer Laboratory University of Cambridge --- //depot/projects/trustedbsd/base/sys/kern/kern_jail.c 2006/09/18 08:37:28 +++ //depot/projects/trustedbsd/priv/sys/kern/kern_jail.c 2006/09/19 08:03:32 @@ -8,7 +8,7 @@ */ #include -__FBSDID("$FreeBSD: src/sys/kern/kern_jail.c,v 1.52 2006/09/17 20:00:35 rwatson Exp $"); +__FBSDID("$FreeBSD: src/sys/kern/kern_jail.c,v 1.51 2005/09/28 00:30:56 csjp Exp $"); #include "opt_mac.h" @@ -20,6 +20,7 @@ #include #include #include +#include #include #include #include @@ -204,7 +205,7 @@ * a process root from one prison, but attached to the jail * of another. */ - error = suser(td); + error = priv_check(td, PRIV_JAIL_ATTACH); if (error) return (error); @@ -522,6 +523,172 @@ } } +/* + * Check with permission for a specific privilege is granted within jail. We + * have a specific list of accepted privileges; the rest are denied. + */ +int +prison_priv_check(struct ucred *cred, enum priv priv) +{ + + if (!(jailed(cred))) + return (0); + + switch (priv) { + + /* + * Allow ktrace privileges for root in jail. + */ + case PRIV_KTRACE: + + /* + * Allow jailed processes to configure audit identity and + * submit audit records (login, etc). In the future we may + * want to further refine the relationship between audit and + * jail. + */ + case PRIV_AUDIT_GETAUDIT: + case PRIV_AUDIT_SETAUDIT: + case PRIV_AUDIT_SUBMIT: + + /* + * Allow jailed processes to manipulate process UNIX + * credentials in any way they see fit. 
+	 */
+	case PRIV_CRED_SETUID:
+	case PRIV_CRED_SETEUID:
+	case PRIV_CRED_SETGID:
+	case PRIV_CRED_SETEGID:
+	case PRIV_CRED_SETGROUPS:
+	case PRIV_CRED_SETREUID:
+	case PRIV_CRED_SETREGID:
+	case PRIV_CRED_SETRESUID:
+	case PRIV_CRED_SETRESGID:
+
+	/*
+	 * Jail implements visibility constraints already, so allow
+	 * jailed root to override uid/gid-based constraints.
+	 */
+	case PRIV_SEEOTHERGIDS:
+	case PRIV_SEEOTHERUIDS:
+
+	/*
+	 * Jail implements inter-process debugging limits already, so
+	 * allow jailed root various debugging privileges.
+	 */
+	case PRIV_DEBUG_DIFFCRED:
+	case PRIV_DEBUG_SUGID:
+	case PRIV_DEBUG_UNPRIV:
+
+	/*
+	 * Allow jail to set various resource limits and login
+	 * properties, and for now, exceed process resource limits.
+	 */
+	case PRIV_PROC_LIMIT:
+	case PRIV_PROC_SETLOGIN:
+	case PRIV_PROC_SETRLIMIT:
+
+	/*
+	 * The following privileges should be granted to jail once
+	 * implemented.
+	 */
+	/* case PRIV_IPC_READ: */
+	/* case PRIV_IPC_WRITE: */
+	/* case PRIV_IPC_EXEC: */
+	/* case PRIV_IPC_ADMIN: */
+	/* case PRIV_IPC_MSGSIZE: */
+	/* case PRIV_MQ_ADMIN: */
+
+	/*
+	 * Jail implements its own inter-process limits, so allow
+	 * root processes in jail to change scheduling on other
+	 * processes in the same jail. Likewise for signalling.
+	 */
+	case PRIV_SCHED_DIFFCRED:
+	case PRIV_SIGNAL_DIFFCRED:
+	case PRIV_SIGNAL_SUGID:
+
+	/*
+	 * Allow jailed processes to write to sysctls marked as jail
+	 * writable.
+	 */
+	case PRIV_SYSCTL_WRITEJAIL:
+
+	/*
+	 * Allow root in jail to manage a variety of quota
+	 * properties. Some are a bit surprising and should be
+	 * reconsidered.
+	 */
+	case PRIV_UFS_GETQUOTA:
+	case PRIV_UFS_QUOTAOFF:		/* XXXRW: Slightly surprising. */
+	case PRIV_UFS_QUOTAON:		/* XXXRW: Slightly surprising. */
+	case PRIV_UFS_SETQUOTA:
+	case PRIV_UFS_SETUSE:		/* XXXRW: Slightly surprising. */
+
+	/*
+	 * Since Jail relies on chroot() to implement file system
+	 * protections, grant many VFS privileges to root in jail.
+	 * Be careful to exclude mount-related and NFS-related
+	 * privileges.
+	 */
+	case PRIV_VFS_READ:
+	case PRIV_VFS_WRITE:
+	case PRIV_VFS_ADMIN:
+	case PRIV_VFS_EXEC:
+	case PRIV_VFS_LOOKUP:
+	case PRIV_VFS_BLOCKRESERVE:	/* XXXRW: Slightly surprising. */
+	case PRIV_VFS_CHFLAGS_DEV:
+	case PRIV_VFS_CHOWN:
+	case PRIV_VFS_CHROOT:
+	case PRIV_VFS_CLEARSUGID:
+	case PRIV_VFS_FCHROOT:
+	case PRIV_VFS_LINK:
+	case PRIV_VFS_SETGID:
+	case PRIV_VFS_STICKYFILE:
+		return (0);
+
+	/*
+	 * Depending on the global setting, allow privilege of
+	 * setting system flags.
+	 */
+	case PRIV_VFS_SYSFLAGS:
+		if (jail_chflags_allowed)
+			return (0);
+		else
+			return (EPERM);
+
+	/*
+	 * Allow jailed root to bind reserved ports.
+	 */
+	case PRIV_NETINET_RESERVEDPORT:
+		return (0);
+
+	/*
+	 * Conditionally allow creating raw sockets in jail.
+	 */
+	case PRIV_NETINET_RAW:
+		if (jail_allow_raw_sockets)
+			return (0);
+		else
+			return (EPERM);
+
+	/*
+	 * Since jail implements its own visibility limits on netstat
+	 * sysctls, allow getcred. This allows identd to work in
+	 * jail.
+	 */
+	case PRIV_NETINET_GETCRED:
+		return (0);
+
+	default:
+		/*
+		 * In all remaining cases, deny the privilege request. This
+		 * includes almost all network privileges, many system
+		 * configuration privileges.
+		 */
+		return (EPERM);
+	}
+}
+
 static int
 sysctl_jail_list(SYSCTL_HANDLER_ARGS)
 {

From owner-freebsd-arch@FreeBSD.ORG Sat Sep 23 18:57:30 2006
Return-Path:
X-Original-To: freebsd-arch@FreeBSD.org
Delivered-To: freebsd-arch@FreeBSD.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 6672D16A403;
	Sat, 23 Sep 2006 18:57:30 +0000 (UTC)
	(envelope-from jmg@hydrogen.funkthat.com)
Received: from hydrogen.funkthat.com (gate.funkthat.com [69.17.45.168])
	by mx1.FreeBSD.org (Postfix) with ESMTP id E75C643D53;
	Sat, 23 Sep 2006 18:57:29 +0000 (GMT)
	(envelope-from jmg@hydrogen.funkthat.com)
Received: from hydrogen.funkthat.com (8racgo61w6f7sbmh@localhost.funkthat.com [127.0.0.1])
	by hydrogen.funkthat.com (8.13.6/8.13.3) with ESMTP id k8NIvTBU045693;
	Sat, 23 Sep 2006 11:57:29 -0700 (PDT)
	(envelope-from jmg@hydrogen.funkthat.com)
Received: (from jmg@localhost)
	by hydrogen.funkthat.com (8.13.6/8.13.3/Submit) id k8NIvRlf045692;
	Sat, 23 Sep 2006 11:57:27 -0700 (PDT)
	(envelope-from jmg)
Date: Sat, 23 Sep 2006 11:57:27 -0700
From: John-Mark Gurney
To: Igor Sysoev
Message-ID: <20060923185727.GW23915@funkthat.com>
Mail-Followup-To: Igor Sysoev , freebsd-current@FreeBSD.org,
	freebsd-arch@FreeBSD.org
References: <20060917210426.GI9421@funkthat.com>
	<20060922171542.G17859@is.park.rambler.ru>
	<20060922165848.GS23915@funkthat.com>
	<20060923105426.B20782@is.park.rambler.ru>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20060923105426.B20782@is.park.rambler.ru>
User-Agent: Mutt/1.4.2.1i
X-Operating-System: FreeBSD 5.4-RELEASE-p6 i386
X-PGP-Fingerprint: B7 EC EF F8 AE ED A7 31 96 7A 22 B3 D8 56 36 F4
X-Files: The truth is out there
X-URL: http://resnet.uoregon.edu/~gurney_j/
X-Resume: http://resnet.uoregon.edu/~gurney_j/resume.html
Cc: freebsd-current@FreeBSD.org, freebsd-arch@FreeBSD.org
Subject: Re: kqueue disable on delivery...
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
Reply-To: John-Mark Gurney
List-Id: Discussion related to FreeBSD architecture
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 23 Sep 2006 18:57:30 -0000

Igor Sysoev wrote this message on Sat, Sep 23, 2006 at 11:40 +0400:
> On Fri, 22 Sep 2006, John-Mark Gurney wrote:
>
> >Igor Sysoev wrote this message on Fri, Sep 22, 2006 at 17:25 +0400:
> >>On Sun, 17 Sep 2006, John-Mark Gurney wrote:
> >>
> >>>I have implemented a couple additional features to kqueue. These allow
> >>>kqueue to be a multithreaded event delivery system that can guarantee
> >>>that the event will only be active in one thread at any time.
> >>>
> >>>The first is EV_DOD, aka disable on delivery. When the event will be
> >>>delivered to userland, the knote is marked disabled so we don't
> >>>have to go through the expense of reallocing the knote each time.
> >>>(Reallocation of the knote is also lock intensive, and disabling is
> >>>cheap.)
> >>
> >>In my opinion, it's too implementation specific flag.
> >
> >How else are you doing to solve having multiple threads servicing
> >the same queue at the same time? Also, Apple is planing on having
> >a similar flag to EV_DOD, but I don't know what they are naming it..
> >I've tried for a while to find out, but haven't been able to...
>
> As I understand EV_DOD or EV_CLEAR|EV_DOD are like simple EV_ONESHOT,
> except the filter is not deleted on delivery, but is disabled skipping
> some in-kernel lock overhead. That's I'd named it too implementation
> specific.
>
> Yes, the EV_CLEAR|EV_DOD guarantees that the event will be active
> in one thread only at any time. But in my practice I saw there is
> necessity to guarantee that the socket (both events - EVFILT_READ
> and EVFILT_WRITE) will be active in one thread only at any time.
> It seems that is the reason why heavy threaded Solaris 10 event ports
> use the oneshot only model where a socket is deleted from port on delivery.

Only if you need to both read and write active on the socket at once...
In some/many servers, you only need one or the other, such as file
transfer servers like http and ftp...

> >>>Even though this means that the event will only ever be active in a
> >>>thread at a time, (when you're done handling the event, you reenable
> >>>it), removing the event from the queue outside the event handler (say
> >>>a timeout handler for the connection) poses to be a problem. If you
> >>>simply close the socket, the event disappears, but then there is a
> >>>race between another event being created with the same socket, and
> >>>notification of the handler that you want the event to stop.
> >>>
> >>>In order to handle that situation, I have come up w/ EV_FORCEOS, aka
> >>>FORCE ONE_SHOT. EV_ONESHOT events have the advantage that once queued,
> >>>they don't care if they have been activated or not, they will be returned
> >>>the next round. This means that the timeout handler can safely set
> >>>EV_FORCEOS on the handler, and either if it's _DISABLED (handler running
> >>>and will reenable it), or it's _ENABLED, it will get dispatched, allowing
> >>>the handler to detect the EV_FORCEOS flag and teardown the connection.
> >>
> >>I think it should be EVFILT_USER event, allowing to
> >>EV_SET(&kev, fd, EVFILT_USER, 0, 0, 0, udata);
> >>and the event should automatically sets the EV_ONESHOT flag internally.
> >
> >I'll agree EV_FORCEOS is open for discussion, but you did see how much
> >code it adds right? I was surprised at how small the patch was for the
> >additional functionality..
>
> Yes, EV_FORCEOS is small patch. However, EVFILT_USER is more generic
> (by the way, Solaris 10 event ports allow to send user-specific
> PORT_SOURCE_USER notification).

I agree EVFILT_USER would be a useful thing, but it is still different
from EV_FORCEOS... Would you like to contribute some the to EVFILT_USER?
I'll look at integrating it...

> Two years ago I was implementing threads for my server nginx
> on FreeBSD 4.x, using rfork(). In the absence of EVFILT_USER I made
> the condition variables using kill() and EV_SIGNAL and this user-level
> code may panic kernel.

Does it still?

> >What happens if you are in the process of tearing down udata when
> >this happens, but you haven't gotten far enough to drop it? Then
> >you'd have to deal w/ possible lock inversions between the timeout
> >list and your object lock, deal w/ flags on the object and ref counts..
> >
> >With _DOD and _FORCEOS, you are able to continue to not require special
> >state flags, locks nor reference counting on your objects serviced by
> >kqueue...
> >
> >I wrote this code in anticipation of supporting sun4v boxes where it'd
> >be useful to have 32 threads (or more) servicing a single kqueue...
>
> You still need user locks to guarantee that the socket will be active
> in one thread only at any time. In proxy mode you still need locks
> to guarantee that two sockets will be active in one thread only.
> If you assemble your response from several proxied servers, then
> you need locks to guarantee that all these sockets will be active
> in one thread only.

As we have just found out, our target servers have different designs...
Since mine is a simple http server, I will only ever read or write at
a time.. I use accept filters so that I don't have to do an EVFILT_READ
on the socket, and if I don't get a complete HTTP/1.x request, I reject
it.. So my server only ever sets EVFILT_WRITE for the sockets...

-- 
John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
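
The EV_DOD and EV_FORCEOS behaviour described in this thread can be
illustrated with a small user-space model of the flag transitions. This is
only a sketch under stated assumptions: it is not the kernel patch, it does
not call kqueue at all, and every name in it (struct knote_model, the M_*
flags, deliver(), reenable(), force_oneshot()) is hypothetical and chosen
for illustration. It assumes that a forced one-shot event is reported once
it is enabled, whether or not the underlying condition is ready, and is
then removed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits for this model only; not the values in the patch. */
enum {
	M_DISABLED = 1 << 0,	/* event is not eligible for delivery */
	M_DOD      = 1 << 1,	/* disable on delivery was requested */
	M_FORCEOS  = 1 << 2,	/* forced one-shot was requested */
	M_DELETED  = 1 << 3	/* event has been removed from the queue */
};

/* Hypothetical stand-in for a knote: flag state plus the raw condition. */
struct knote_model {
	unsigned flags;
	bool ready;		/* e.g. the socket is currently writable */
};

/*
 * Model one delivery attempt to a worker thread.  Returns true and stores
 * the flags seen by the handler in *reported if the event is handed out.
 */
bool
deliver(struct knote_model *kn, unsigned *reported)
{
	assert(kn != NULL && reported != NULL);

	/* A disabled or deleted event is never handed to a thread. */
	if (kn->flags & (M_DISABLED | M_DELETED))
		return (false);
	/* A forced one-shot is delivered even if the condition is idle. */
	if (!kn->ready && !(kn->flags & M_FORCEOS))
		return (false);

	*reported = kn->flags;
	if (kn->flags & M_FORCEOS)
		kn->flags |= M_DELETED;		/* one-shot: gone after this */
	else if (kn->flags & M_DOD)
		kn->flags |= M_DISABLED;	/* owned by one thread now */
	return (true);
}

/* The handler calls this when it has finished with the event. */
void
reenable(struct knote_model *kn)
{
	kn->flags &= ~(unsigned)M_DISABLED;
}

/* A timeout handler calls this to request teardown of the connection. */
void
force_oneshot(struct knote_model *kn)
{
	kn->flags |= M_FORCEOS;
}
```

In this model a timeout handler never touches the per-connection object
itself. It only calls force_oneshot(); the next successful deliver() after
the running handler calls reenable() reports M_FORCEOS, and the thread that
receives it is the only one holding the event, so it can tear the
connection down without extra locks or reference counts.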