From owner-freebsd-arch@FreeBSD.ORG Mon Jun 4 06:27:30 2007 Return-Path: X-Original-To: freebsd-arch@FreeBSD.org Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 4511116A46E for ; Mon, 4 Jun 2007 06:27:30 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail35.syd.optusnet.com.au (mail35.syd.optusnet.com.au [211.29.133.51]) by mx1.freebsd.org (Postfix) with ESMTP id D38F913C4AD for ; Mon, 4 Jun 2007 06:27:29 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from besplex.bde.org (c220-239-235-248.carlnfd3.nsw.optusnet.com.au [220.239.235.248]) by mail35.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id l546RIiB024194 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 4 Jun 2007 16:27:19 +1000 Date: Mon, 4 Jun 2007 16:27:20 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Jeff Roberson In-Reply-To: <20070601123530.B606@10.0.0.1> Message-ID: <20070604160036.N1084@besplex.bde.org> References: <20070529105856.L661@10.0.0.1> <200705291456.38515.jhb@freebsd.org> <20070529121653.P661@10.0.0.1> <20070530065423.H93410@delplex.bde.org> <20070529141342.D661@10.0.0.1> <20070530125553.G12128@besplex.bde.org> <20070529201255.X661@10.0.0.1> <20070529220936.W661@10.0.0.1> <20070530201618.T13220@besplex.bde.org> <20070530115752.F661@10.0.0.1> <20070531091419.S826@besplex.bde.org> <20070531010631.N661@10.0.0.1> <20070601154833.O4207@besplex.bde.org> <20070601014601.I799@10.0.0.1> <20070601200348.G6201@delplex.bde.org> <20070601123530.B606@10.0.0.1> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-arch@FreeBSD.org Subject: Re: Updated rusage patch X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Jun 2007 
06:27:30 -0000 On Fri, 1 Jun 2007, Jeff Roberson wrote: > Please grep for statclock in threadlock.diff. This removes time_lock from > statclock all together and protects the whole thing with thread_lock(). With > this change all cpus can execute statclock() concurrently with sched_smp.c. > This patch also has fixes for locking ruxagg() as well as asserts. It does > not yet protect the ru copying in exit(). I want to figure out the > synchronization issues with wait first. I don't want to get involved reviewing another large[r] patch. A bug turned up with the previously committed patches: the swapper process is now shown as having a runtime of 40-47 seconds after booting (and never changes after that), but I don't use swapping and this process has always been shown as having a runtime of 0 seconds before. The bug seems to be that proc0_post() doesn't know anything about the rusage fields in the thread struct. Until recently, it was only missing initialization of td_*ticks. Now it is missing initialization of td_runtime too, so the bug is more obvious. The rusage fields may or may not be garbage when proc0_post() runs, depending on the details of clock initialization, so they are supposed to be cleared there. The runtime before clearing used to be complete garbage since timecounters were used for the runtime and the dummy timecounter gave garbage while booting. Now 47 seconds on one of my machines is larger than the real time between init386() and proc0_post() by a factor of about 5. 
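To make the missing initialization concrete, here is a minimal user-space sketch of the kind of clearing described above. It is not the kernel code: struct mock_td_times and mock_clear_thread_times() are made-up names, the field types are assumptions, and only the field names (td_runtime and the u/s/i tick counters) come from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the time-accounting fields of struct thread.
 * Field types are assumed for illustration only.
 */
struct mock_td_times {
	uint64_t	td_runtime;	/* accumulated run time */
	unsigned int	td_uticks;	/* statclock ticks in user mode */
	unsigned int	td_sticks;	/* statclock ticks in system mode */
	unsigned int	td_iticks;	/* statclock ticks in interrupts */
};

/*
 * Clear whatever was accumulated while the clocks were still being
 * set up, so that accounting starts from zero.
 */
void
mock_clear_thread_times(struct mock_td_times *td)
{
	assert(td != NULL);

	td->td_runtime = 0;
	td->td_uticks = 0;
	td->td_sticks = 0;
	td->td_iticks = 0;
}
```

If only the tick counters are cleared and td_runtime is forgotten, the stale run time stays visible, which matches the symptom reported for the swapper process.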
Bruce From owner-freebsd-arch@FreeBSD.ORG Mon Jun 4 16:41:16 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 6224316A400 for ; Mon, 4 Jun 2007 16:41:16 +0000 (UTC) (envelope-from rdivacky@vlk.vlakno.cz) Received: from vlakno.cz (vlk.vlakno.cz [62.168.28.247]) by mx1.freebsd.org (Postfix) with ESMTP id 2026513C4B0 for ; Mon, 4 Jun 2007 16:41:16 +0000 (UTC) (envelope-from rdivacky@vlk.vlakno.cz) Received: from localhost (localhost [127.0.0.1]) by vlakno.cz (Postfix) with ESMTP id 7B4468BDAC9 for ; Mon, 4 Jun 2007 18:24:31 +0200 (CEST) X-Virus-Scanned: amavisd-new at vlakno.cz Received: from vlakno.cz ([127.0.0.1]) by localhost (vlk.vlakno.cz [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id raO7-wUP1wVa for ; Mon, 4 Jun 2007 18:24:30 +0200 (CEST) Received: from vlk.vlakno.cz (localhost [127.0.0.1]) by vlakno.cz (Postfix) with ESMTP id 873BB8BDA8F for ; Mon, 4 Jun 2007 18:24:30 +0200 (CEST) Received: (from rdivacky@localhost) by vlk.vlakno.cz (8.13.8/8.13.8/Submit) id l54GOUkZ077224 for arch@freebsd.org; Mon, 4 Jun 2007 18:24:30 +0200 (CEST) (envelope-from rdivacky) Date: Mon, 4 Jun 2007 18:24:30 +0200 From: Roman Divacky To: arch@freebsd.org Message-ID: <20070604162430.GA76813@freebsd.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4.2.3i Cc: Subject: *at family of syscalls in FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Jun 2007 16:41:16 -0000 Hi, Starting with Linux 2.6.16, the so-called *at syscalls are available, for example openat(), linkat(), etc. Those syscalls are used to avoid races in threaded programs and to implement a per-thread CWD.
In other words, they are useful. That's why Linux implemented them.

I am currently implementing those in our Linuxulator. As suggested by Robert Watson, I implemented general kern_fooat() functions and wrapped the Linux syscalls around those kern_fooat() functions. It works OK. But I want to introduce native *at syscalls for FreeBSD binaries. Hence I am here to discuss the API.

My suggestion: use the Linux API with some slight naming changes:

syscalls: openat(), mkdirat(), mknodat(), chownat(), utimesat(), statat(), unlinkat(), renameat(), linkat(), symlinkat(), readlinkat(), chmodat(), accessat().

example of a syscall:

	int openat(int dirfd, char *path, int flags, int mode);

i.e. exactly the same API as Linux has, with the exception of naming them sanely: for example, instead of fchownat() we have chownat(), because it does not operate on an fd but on a path. I am not sure about compatibility, but we can always introduce a weak reference like fchownat() -> chownat().

I want to have a special AT_FDCWD define (-100) for the "cwd" argument, just like Linux.

I am currently finishing my Linuxulator-side work and I'd like to see this in 7.0R, so please comment on the API and the idea of introducing those syscalls in FreeBSD.
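A minimal sketch of how a caller might use such an interface, assuming an openat() and AT_FDCWD as declared in <fcntl.h> on systems that already provide them. The helper names open_in_dir() and open_in_cwd() are made up for this example, and the numeric value of AT_FDCWD is whatever the system header defines.

```c
#define _POSIX_C_SOURCE 200809L	/* ask the headers for openat() */

#include <sys/types.h>

#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: open "name" relative to directory "dir" without
 * changing the process-wide current working directory.
 * Returns a file descriptor, or -1 with errno set.
 */
int
open_in_dir(const char *dir, const char *name, int flags, mode_t mode)
{
	int dirfd, fd, saved_errno;

	assert(dir != NULL && name != NULL);

	dirfd = open(dir, O_RDONLY);
	if (dirfd == -1)
		return (-1);

	/* The path is resolved relative to dirfd, not to the CWD. */
	fd = openat(dirfd, name, flags, mode);

	/* Keep the errno from openat() across close(). */
	saved_errno = errno;
	(void)close(dirfd);
	errno = saved_errno;
	return (fd);
}

/* With AT_FDCWD as the directory, openat() resolves like open(). */
int
open_in_cwd(const char *name, int flags, mode_t mode)
{
	assert(name != NULL);
	return (openat(AT_FDCWD, name, flags, mode));
}
```

Because the directory is named by a descriptor, another thread calling chdir() between the two steps cannot change which file gets opened, which is the race mentioned above.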
thank you Roman Divacky From owner-freebsd-arch@FreeBSD.ORG Mon Jun 4 16:55:02 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 542ED16A400; Mon, 4 Jun 2007 16:55:02 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from mail.ntplx.net (mail.ntplx.net [204.213.176.10]) by mx1.freebsd.org (Postfix) with ESMTP id 1098413C448; Mon, 4 Jun 2007 16:55:01 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from sea.ntplx.net (sea.ntplx.net [204.213.176.11]) by mail.ntplx.net (8.14.1/8.14.1/NETPLEX) with ESMTP id l54Gt0VC003267; Mon, 4 Jun 2007 12:55:00 -0400 (EDT) X-Virus-Scanned: by AMaViS and Clam AntiVirus (mail.ntplx.net) X-Greylist: Message whitelisted by DRAC access database, not delayed by milter-greylist-3.0 (mail.ntplx.net [204.213.176.10]); Mon, 04 Jun 2007 12:55:00 -0400 (EDT) Date: Mon, 4 Jun 2007 12:55:00 -0400 (EDT) From: Daniel Eischen X-X-Sender: eischen@sea.ntplx.net To: Roman Divacky In-Reply-To: <20070604162430.GA76813@freebsd.org> Message-ID: References: <20070604162430.GA76813@freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: arch@freebsd.org Subject: Re: *at family of syscalls in FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Daniel Eischen List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Jun 2007 16:55:02 -0000 On Mon, 4 Jun 2007, Roman Divacky wrote: > Hi, > > Starting from Linux 2.6.16, there is available so called *at syscalls. For example > openat(), linkat() etc. Those syscalls are used to avoid races in threaded programs > and to implement per-thread CWD. in other words they are usefull. Thats why Linux > implemented them. > > I am currently implementing those in our Linuxulator. 
As suggested by Robert Watson > I implemented general kern_fooat() functions and wrapped them around those kern_fooat() > functions. It works ok and everything. But I want to introduce native *at syscalls > for FreeBSD binaries. Hence I am here to discuss the API. > > My suggestion: > > use Linux API with some slight naming changes: These are (unless Linux added some new interfaces) defined by POSIX. The API and behavior should be conformant with POSIX. See the POSIX spec for more info: http://www.opengroup.org/onlinepubs/009695399/toc.htm -- DE From owner-freebsd-arch@FreeBSD.ORG Mon Jun 4 17:04:45 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id EE79016A468; Mon, 4 Jun 2007 17:04:44 +0000 (UTC) (envelope-from SRS0+ae66afe8e8203ad6c976+1380+infradead.org+hch@pentafluge.srs.infradead.org) Received: from pentafluge.infradead.org (pentafluge.infradead.org [213.146.154.40]) by mx1.freebsd.org (Postfix) with ESMTP id B135513C469; Mon, 4 Jun 2007 17:04:44 +0000 (UTC) (envelope-from SRS0+ae66afe8e8203ad6c976+1380+infradead.org+hch@pentafluge.srs.infradead.org) Received: from hch by pentafluge.infradead.org with local (Exim 4.63 #1 (Red Hat Linux)) id 1HvFht-000067-Pq; Mon, 04 Jun 2007 17:47:01 +0100 Date: Mon, 4 Jun 2007 17:47:01 +0100 From: Christoph Hellwig To: Roman Divacky Message-ID: <20070604164701.GA32750@infradead.org> References: <20070604162430.GA76813@freebsd.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070604162430.GA76813@freebsd.org> User-Agent: Mutt/1.4.2.2i X-SRS-Rewrite: SMTP reverse-path rewritten from by pentafluge.infradead.org See http://www.infradead.org/rpr.html Cc: arch@freebsd.org Subject: Re: *at family of syscalls in FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to 
FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Jun 2007 17:04:45 -0000 On Mon, Jun 04, 2007 at 06:24:30PM +0200, Roman Divacky wrote: > ie. exactly the same API as Linux have with the exception of naming them sanely, ie. instead > for example fchownat() we have chownat() because it does not operate on a fd but on a path. > I am not sure about compatibility but we can always introduce a weak reference like fchownat() -> > chownat(). FYI: those names originate on Solaris, and are proposed as an addition to POSIX/SUS. It's probably a better idea to keep them the same on all operating systems instead of creating artificial differences. From owner-freebsd-arch@FreeBSD.ORG Mon Jun 4 17:27:24 2007 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 2E2F516A46B; Mon, 4 Jun 2007 17:27:24 +0000 (UTC) (envelope-from wollman@hergotha.csail.mit.edu) Received: from hergotha.csail.mit.edu (hergotha.csail.mit.edu [66.92.79.170]) by mx1.freebsd.org (Postfix) with ESMTP id D747113C489; Mon, 4 Jun 2007 17:27:23 +0000 (UTC) (envelope-from wollman@hergotha.csail.mit.edu) Received: from hergotha.csail.mit.edu (localhost [127.0.0.1]) by hergotha.csail.mit.edu (8.13.8/8.13.8) with ESMTP id l54GlNov022244; Mon, 4 Jun 2007 12:47:23 -0400 (EDT) (envelope-from wollman@hergotha.csail.mit.edu) Received: (from wollman@localhost) by hergotha.csail.mit.edu (8.13.8/8.13.8/Submit) id l54GlNx8022243; Mon, 4 Jun 2007 12:47:23 -0400 (EDT) (envelope-from wollman) Date: Mon, 4 Jun 2007 12:47:23 -0400 (EDT) From: Garrett Wollman Message-Id: <200706041647.l54GlNx8022243@hergotha.csail.mit.edu> To: rdivacky@freebsd.org X-Newsgroups: mit.lcs.mail.freebsd-arch In-Reply-To: <20070604162430.GA76813@freebsd.org> References: <20070604162430.GA76813@freebsd.org> Organization: None X-Greylist: 
Sender IP whitelisted, not delayed by milter-greylist-3.0 (hergotha.csail.mit.edu [127.0.0.1]); Mon, 04 Jun 2007 12:47:24 -0400 (EDT) X-Spam-Status: No, score=-0.0 required=5.0 tests=SPF_HELO_PASS,SPF_PASS autolearn=disabled version=3.1.8 X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on hergotha.csail.mit.edu Cc: freebsd-arch@freebsd.org Subject: Re: *at family of syscalls in FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Jun 2007 17:27:24 -0000 Roman Divacky writes: >My suggestion: > >use Linux API with some slight naming changes: Please don't. These interfaces are currently being standardized in POSIX. Please follow the POSIX draft. -GAWollman -- Garrett A. Wollman | The real tragedy of human existence is not that we are wollman@csail.mit.edu| nasty by nature, but that a cruel structural asymmetry Opinions not those | grants to rare events of meanness such power to shape of MIT or CSAIL. | our history. - S.J. 
Gould, Ten Thousand Acts of Kindness From owner-freebsd-arch@FreeBSD.ORG Mon Jun 4 17:29:39 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8F7C816A41F for ; Mon, 4 Jun 2007 17:29:39 +0000 (UTC) (envelope-from mashtizadeh@gmail.com) Received: from nz-out-0506.google.com (nz-out-0506.google.com [64.233.162.227]) by mx1.freebsd.org (Postfix) with ESMTP id 3B9D513C483 for ; Mon, 4 Jun 2007 17:29:39 +0000 (UTC) (envelope-from mashtizadeh@gmail.com) Received: by nz-out-0506.google.com with SMTP id 14so955989nzn for ; Mon, 04 Jun 2007 10:29:38 -0700 (PDT) DKIM-Signature: a=rsa-sha1; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:references; b=VwMCRfhjXXQHZB++pztyo+W+60gy5FEKHT2BKPsMMPk5dNgt0o9PNhrlilcJGp5Qg8A42m5TyRMM/f1+MloycmvqubMREldxjb6IRIb8PQbkRcdEr9DcNh+kcUlsmSnrlnSPHqfDrX9XYcYjAQ0QYh2g14L0cHl9sAYHJItvqnA= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:references; b=U2tl/g0UU3dZW20gLuLIdmlkspt0EodDUeO0Dx3JNfh6mYDyxj0KRiC8VihQ6Bea3Y+rWkabuUoyJS/3HJPsTReJy91unos5CeuLNEm+ahWQUMt9EjyMv2Ugjrb4Hi0Pt1RVxmaL9y7w2kK8mQY7SmDyPbRSbwO74oiHoG8jJis= Received: by 10.142.77.11 with SMTP id z11mr229947wfa.1180976458296; Mon, 04 Jun 2007 10:00:58 -0700 (PDT) Received: by 10.142.251.8 with HTTP; Mon, 4 Jun 2007 10:00:58 -0700 (PDT) Message-ID: <440b3e930706041000y4d472d41n7c409c6e912b57da@mail.gmail.com> Date: Mon, 4 Jun 2007 13:00:58 -0400 From: "Ali Mashtizadeh" To: "Daniel Eischen" In-Reply-To: MIME-Version: 1.0 References: <20070604162430.GA76813@freebsd.org> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: base64 Content-Disposition: inline X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: 
arch@freebsd.org, Roman Divacky Subject: Re: *at family of syscalls in FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Jun 2007 17:29:39 -0000
Hey,

Its not defined by POSIX. I like the naming changes compared to the linux versions.

-- 
Ali Mashtizadeh
علی مشتی زاده

On 6/4/07, Daniel Eischen <deischen@freebsd.org> wrote:
>
> On Mon, 4 Jun 2007, Roman Divacky wrote:
>
> > Hi,
> >
> > Starting from Linux 2.6.16, there is available so called *at syscalls. For example
> > openat(), linkat() etc. Those syscalls are used to avoid races in threaded programs
> > and to implement per-thread CWD. in other words they are usefull. Thats why Linux
> > implemented them.
> >
> > I am currently implementing those in our Linuxulator. As suggested by Robert Watson
> > I implemented general kern_fooat() functions and wrapped them around those kern_fooat()
> > functions. It works ok and everything. But I want to introduce native *at syscalls
> > for FreeBSD binaries. Hence I am here to discuss the API.
> >
> > My suggestion:
> >
> > use Linux API with some slight naming changes:
>
> These are (unless Linux added some new interfaces) defined by POSIX.
> The API and behavior should be conformant with POSIX.  See the
> POSIX spec for more info:
>
>    http://www.opengroup.org/onlinepubs/009695399/toc.htm
>
> --
> DE
> _______________________________________________
> freebsd-arch@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-arch
> To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"
>
From owner-freebsd-arch@FreeBSD.ORG Mon Jun 4 18:34:20 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 71D5A16A46B; Mon, 4 Jun 2007 18:34:20 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from mail.ntplx.net (mail.ntplx.net [204.213.176.10]) by mx1.freebsd.org (Postfix) with ESMTP id 1BF3C13C4DD; Mon, 4 Jun 2007 18:34:19 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from sea.ntplx.net (sea.ntplx.net [204.213.176.11]) by mail.ntplx.net (8.14.1/8.14.1/NETPLEX) with ESMTP id l54IYIR1019392; Mon, 4 Jun 2007 14:34:18 -0400 (EDT) X-Virus-Scanned: by AMaViS and Clam AntiVirus (mail.ntplx.net) X-Greylist: Message whitelisted by DRAC access database, not delayed by milter-greylist-3.0 (mail.ntplx.net [204.213.176.10]); Mon, 04 Jun 2007 14:34:18 -0400 (EDT) Date: Mon, 4 Jun 2007 14:34:18 -0400 (EDT) From: Daniel Eischen X-X-Sender: eischen@sea.ntplx.net To: Ali Mashtizadeh In-Reply-To: <440b3e930706041000y4d472d41n7c409c6e912b57da@mail.gmail.com> Message-ID: References: <20070604162430.GA76813@freebsd.org> <440b3e930706041000y4d472d41n7c409c6e912b57da@mail.gmail.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: arch@freebsd.org, Roman Divacky Subject: Re: *at family of syscalls in FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Daniel Eischen List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Jun 2007 18:34:20 -0000 On 
Mon, 4 Jun 2007, Ali Mashtizadeh wrote: > Hey, > > Its not defined by POSIX. I like the naming changes compared to the linux > versions. Actually, it's there, but it's a draft. You gotta go find it yourself and register at opengroup.org. But the point is we should be using the draft POSIX interfaces. -- DE From owner-freebsd-arch@FreeBSD.ORG Tue Jun 5 05:33:04 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 4FD8916A41F; Tue, 5 Jun 2007 05:33:04 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from webaccess-cl.virtdom.com (webaccess-cl.virtdom.com [216.240.101.25]) by mx1.freebsd.org (Postfix) with ESMTP id 0CC1513C44B; Tue, 5 Jun 2007 05:33:03 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from [192.168.1.101] (c-71-231-138-78.hsd1.or.comcast.net [71.231.138.78]) (authenticated bits=0) by webaccess-cl.virtdom.com (8.13.6/8.13.6) with ESMTP id l555X0jf091176 (version=TLSv1/SSLv3 cipher=DHE-DSS-AES256-SHA bits=256 verify=NO); Tue, 5 Jun 2007 01:33:01 -0400 (EDT) (envelope-from jroberson@chesapeake.net) Date: Mon, 4 Jun 2007 22:32:46 -0700 (PDT) From: Jeff Roberson X-X-Sender: jroberson@10.0.0.1 To: arch@freebsd.org Message-ID: <20070604220649.E606@10.0.0.1> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: kmacy@freebsd.org, benno@freebsd.org, marius@freebsd.org, marcl@freebsd.org, jake@freebsd.org, tmm@freebsd.org, cognet@freebsd.org, grehan@freebsd.org Subject: New cpu_switch() and cpu_throw(). 
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Jun 2007 05:33:04 -0000

For every architecture we need to support new features in cpu_switch() and cpu_throw() before they can support per-cpu schedlock. I'll describe those below. I'm soliciting help or advice in implementing these on platforms other than x86 and amd64, especially on ia64 where things are implemented in C!

I checked in the new version of cpu_switch() for amd64 today after threadlock went in. Basically, we have to release a thread's lock when it's switched out and acquire a lock when it's switched in.

The release must happen after we're totally done with the stack and vmspace of the thread to be switched out. On amd64 this meant after we clear the active bits for tlb shootdown. The release actually makes use of a new 'mtx' argument to cpu_switch() and sets the td_lock pointer to this argument rather than unlocking a real lock. td_lock has previously been set to the blocked lock, which is always blocked. Threads spinning in thread_lock() will notice the td_lock pointer change and acquire the new lock. So this is simple, just a non-atomic store with a pointer passed as an argument. On amd64:

	movq	%rdx, TD_LOCK(%rdi)	/* Release the old thread */

The acquire part is slightly more complicated and involves a little loop. We don't actually have to spin trying to lock the thread. We just spin until it's no longer set to the blocked lock. The switching thread already owns the per-cpu scheduler lock for the current cpu. If we're switching into a thread that is set to the blocked_lock, another cpu is about to set it to our current cpu's lock via the mtx argument mentioned above. On amd64 we have:

	/* Wait for the new thread to become unblocked */
	movq	$blocked_lock, %rdx
1:
	movq	TD_LOCK(%rsi),%rcx
	cmpq	%rcx, %rdx
	je	1b

So these two are actually quite simple. You can see the full patch for cpu_switch.S as the first file modified in:

http://people.freebsd.org/~jeff/threadlock.diff

For cpu_throw() we have to actually complete a real unlock of a spinlock. What happens here, although this isn't in cvs yet, is that thread_exit() will set the thread's lock pointer to be the per-process spinlock. This spinlock must be unlocked so that process resources can't be reclaimed by wait while a thread is executing cpu_throw(). This code on amd64 is (from memory rather than a patch):

	movq	$MTX_UNOWNED, %rdx
	movq	TD_LOCK(%rsi), %rsi
	xchgq	%rdx, MTX_LOCK(%rsi)

I'm hoping to have at least the cpu_throw() part done for every architecture for 7.0. This will enable me to simplify thread_exit() and not have a lot of per-scheduler/architecture workarounds. Without the cpu_switch() parts sched_4bsd will still work on an architecture.

I have a per-cpu spinlock version of ULE which may replace ULE or exist alongside it as sched_smp. This will only work on architectures that implement the new cpu_throw() and cpu_switch().

Consider this an official call for help with the architectures you maintain. Please also let me know if you maintain an arch that you don't mind having temporarily broken until you implement this.
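For anyone porting this where the switch code is written in C, the three operations above can be sketched roughly as follows. This is a user-space mock using C11 atomics, not kernel code: the mock_* names and struct layouts are invented, MOCK_MTX_UNOWNED is a placeholder value rather than the kernel's MTX_UNOWNED, and the release/acquire orderings are a conservative assumption (on amd64 a plain store is enough, as noted above).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder for "nobody owns this lock"; not the kernel's value. */
#define	MOCK_MTX_UNOWNED	((uintptr_t)0)

/* Hypothetical stand-in for struct mtx. */
struct mock_mtx {
	_Atomic uintptr_t mtx_lock;	/* owner cookie or MOCK_MTX_UNOWNED */
};

/* Hypothetical stand-in for the part of struct thread used here. */
struct mock_thread {
	struct mock_mtx *_Atomic td_lock;	/* lock protecting the thread */
};

/* Only its address matters: it marks a thread as "in transit". */
struct mock_mtx mock_blocked_lock;

/*
 * Switch-out side: once the old thread's stack and vmspace are no
 * longer in use, publish the lock passed in as 'mtx'.
 */
void
mock_switch_release(struct mock_thread *oldtd, struct mock_mtx *mtx)
{
	assert(oldtd != NULL && mtx != NULL);
	atomic_store_explicit(&oldtd->td_lock, mtx, memory_order_release);
}

/*
 * Switch-in side: do not take any lock, just wait until another cpu
 * has replaced the blocked lock with a real one.
 */
void
mock_switch_acquire(struct mock_thread *newtd)
{
	assert(newtd != NULL);
	while (atomic_load_explicit(&newtd->td_lock,
	    memory_order_acquire) == &mock_blocked_lock)
		;	/* spin */
}

/*
 * cpu_throw() side: really unlock the spin lock that td_lock points
 * at, mirroring the xchg in the assembly above.
 */
void
mock_throw_unlock(struct mock_thread *td)
{
	struct mock_mtx *m;

	assert(td != NULL);
	m = atomic_load_explicit(&td->td_lock, memory_order_relaxed);
	(void)atomic_exchange_explicit(&m->mtx_lock, MOCK_MTX_UNOWNED,
	    memory_order_release);
}
```

The point of the pointer handoff is that the switch-out side never has to unlock anything: changing td_lock is the release, and any cpu spinning on the thread simply re-reads the pointer.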
Thanks, Jeff From owner-freebsd-arch@FreeBSD.ORG Tue Jun 5 09:59:31 2007 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 3BC4A16A41F for ; Tue, 5 Jun 2007 09:59:31 +0000 (UTC) (envelope-from asmrookie@gmail.com) Received: from ik-out-1112.google.com (ik-out-1112.google.com [66.249.90.180]) by mx1.freebsd.org (Postfix) with ESMTP id BDBB213C46A for ; Tue, 5 Jun 2007 09:59:30 +0000 (UTC) (envelope-from asmrookie@gmail.com) Received: by ik-out-1112.google.com with SMTP id c21so1121824ika for ; Tue, 05 Jun 2007 02:59:29 -0700 (PDT) DKIM-Signature: a=rsa-sha1; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:reply-to:user-agent:mime-version:to:cc:subject:references:in-reply-to:content-type:content-transfer-encoding:sender; b=OAetP5kzXlL7T+LNEF5tuLXn/thMkOLP+qJqeNoE9Vr/5tcFhmOvuPeeCok6O/dOo4sqL9jiYjhG6/s/Aj1huPsA4g4KM1XwJ0WIS+BIXIfvyVJZ2mt6BNTo6oGGyhflzFNIeOkAKPJOyFEzshrK9ufZk9y3s5iztiG6bUsTgxU= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:reply-to:user-agent:mime-version:to:cc:subject:references:in-reply-to:content-type:content-transfer-encoding:sender; b=jXCZkmYKs13laTGM7KK5UutsZCj+tEwguuEku+HkiPb5E0/9t8vG8TF2rRzP2gyBCLT1RYUSrqKxu+EpOAETuqKs9qwVoXNdLXdfyTTXkU2rWgH13ZuPHUHBKGQ6P/q0Pde+a//788nmFaleUzCeEb6S8uH1gEt/FT/WgxKADmA= Received: by 10.78.147.6 with SMTP id u6mr2338456hud.1181035814525; Tue, 05 Jun 2007 02:30:14 -0700 (PDT) Received: from ?172.31.5.25? 
( [89.97.252.178]) by mx.google.com with ESMTP id 2sm991943nfv.2007.06.05.02.30.13; Tue, 05 Jun 2007 02:30:14 -0700 (PDT) Message-ID: <46652D17.5090903@FreeBSD.org> Date: Tue, 05 Jun 2007 11:29:59 +0200 From: Attilio Rao User-Agent: Thunderbird 1.5 (X11/20060526) MIME-Version: 1.0 To: Bruce Evans References: <20070529105856.L661@10.0.0.1> <200705291456.38515.jhb@freebsd.org> <20070529121653.P661@10.0.0.1> <20070530065423.H93410@delplex.bde.org> <20070529141342.D661@10.0.0.1> <20070530125553.G12128@besplex.bde.org> <20070529201255.X661@10.0.0.1> <20070529220936.W661@10.0.0.1> <20070530201618.T13220@besplex.bde.org> <20070530115752.F661@10.0.0.1> <20070531091419.S826@besplex.bde.org> <20070531010631.N661@10.0.0.1> <20070601154833.O4207@besplex.bde.org> <20070601014601.I799@10.0.0.1> <20070601200348.G6201@delplex.bde.org> <20070601123530.B606@10.0.0.1> <20070604160036.N1084@besplex.bde.org> In-Reply-To: <20070604160036.N1084@besplex.bde.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Sender: Attilio Rao Cc: freebsd-arch@FreeBSD.org Subject: Re: Updated rusage patch X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: attilio@FreeBSD.org List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Jun 2007 09:59:31 -0000 Bruce Evans wrote: > On Fri, 1 Jun 2007, Jeff Roberson wrote: > >> Please grep for statclock in threadlock.diff. This removes time_lock >> from statclock all together and protects the whole thing with >> thread_lock(). With this change all cpus can execute statclock() >> concurrently with sched_smp.c. This patch also has fixes for locking >> ruxagg() as well as asserts. It does not yet protect the ru copying >> in exit(). I want to figure out the synchronization issues with wait >> first. > > I don't want to get involved reviewing another large[r] patch. 
> > A bug turned up with the previously committed patches: the swapper > process is now shown as having a runtime of 40-47 seconds after > booting (and never changes after that), but I don't use swapping and > this process has always been shown as having a runtime of 0 seconds > before. > > The bug seems to be that proc0_post() doesn't know anything about the > rusage fields in the thread struct. Until recently, it was only missing > initialization of td_*ticks. Now it is missing initialization of > td_runtime too, so the bug is more obvious. Yes, I always wondered why proc0_post() doesn't initialize [s,i,u]ticks too. However, could you please give a look and a try to this patch: http://users.gufi.org/~rookie/works/patches/schedlock/proc_post.diff and see if it solves your problem. Thanks, Attilio From owner-freebsd-arch@FreeBSD.ORG Tue Jun 5 12:45:36 2007 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 95E9F16A421; Tue, 5 Jun 2007 12:45:36 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail02.syd.optusnet.com.au (mail02.syd.optusnet.com.au [211.29.132.183]) by mx1.freebsd.org (Postfix) with ESMTP id 27CDA13C44B; Tue, 5 Jun 2007 12:45:35 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from c220-239-235-248.carlnfd3.nsw.optusnet.com.au (c220-239-235-248.carlnfd3.nsw.optusnet.com.au [220.239.235.248]) by mail02.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id l55CjOYZ023069 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 5 Jun 2007 22:45:25 +1000 Date: Tue, 5 Jun 2007 22:45:26 +1000 (EST) From: Bruce Evans X-X-Sender: bde@delplex.bde.org To: Attilio Rao In-Reply-To: <46652D17.5090903@FreeBSD.org> Message-ID: <20070605214404.X47001@delplex.bde.org> References: <20070529105856.L661@10.0.0.1> <200705291456.38515.jhb@freebsd.org> <20070529121653.P661@10.0.0.1> 
<20070530065423.H93410@delplex.bde.org> <20070529141342.D661@10.0.0.1> <20070530125553.G12128@besplex.bde.org> <20070529201255.X661@10.0.0.1> <20070529220936.W661@10.0.0.1> <20070530201618.T13220@besplex.bde.org> <20070530115752.F661@10.0.0.1> <20070531091419.S826@besplex.bde.org> <20070531010631.N661@10.0.0.1> <20070601154833.O4207@besplex.bde.org> <20070601014601.I799@10.0.0.1> <20070601200348.G6201@delplex.bde.org> <20070601123530.B606@10.0.0.1> <20070604160036.N1084@besplex.bde.org> <46652D17.5090903@FreeBSD.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: freebsd-arch@freebsd.org Subject: Re: Updated rusage patch X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Jun 2007 12:45:36 -0000 On Tue, 5 Jun 2007, Attilio Rao wrote: > Bruce Evans wrote: >> ... >> The bug seems to be that proc0_post() doesn't know anything about the >> rusage fields in the thread struct. Until recently, it was only missing >> initialization of td_*ticks. Now it is missing initialization of >> td_runtime too, so the bug is more obvious. > > Yes, I always wondered why proc0_post() doesn't initialize [s,i,u]ticks too. > However, could you please give a look and a try to this patch: > http://users.gufi.org/~rookie/works/patches/schedlock/proc_post.diff > > and see if it solves your problem. This can probably be fixed more simply by calling rufetch() to reset the time state in threads as a side effect. Do this before resetting the state in the process. Another minor problem is that proc0_post() isn't the right place to reset the time state except for proc0. It hacks on some of the time state for all processes and thus for all CPUs, but only resets switchtime and switchticks for the current CPU. 
Resetting of switchtime and switchticks for startup of other CPUs now seems to be in sched_throw(). I think this is not properly synchronous with the resetting of the rest of the state, but the errors from this are tiny. Bruce From owner-freebsd-arch@FreeBSD.ORG Tue Jun 5 14:30:58 2007 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id A15D816A468; Tue, 5 Jun 2007 14:30:58 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from server.baldwin.cx (66-23-211-162.clients.speedfactory.net [66.23.211.162]) by mx1.freebsd.org (Postfix) with ESMTP id ECB3213C447; Tue, 5 Jun 2007 14:30:57 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from localhost.corp.yahoo.com (john@localhost [127.0.0.1]) (authenticated bits=0) by server.baldwin.cx (8.13.8/8.13.8) with ESMTP id l55EUsTY028249; Tue, 5 Jun 2007 10:30:55 -0400 (EDT) (envelope-from jhb@freebsd.org) From: John Baldwin To: freebsd-arch@freebsd.org Date: Tue, 5 Jun 2007 10:12:17 -0400 User-Agent: KMail/1.9.6 References: <20070604220649.E606@10.0.0.1> In-Reply-To: <20070604220649.E606@10.0.0.1> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200706051012.18864.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (server.baldwin.cx [127.0.0.1]); Tue, 05 Jun 2007 10:30:55 -0400 (EDT) X-Virus-Scanned: ClamAV 0.88.3/3360/Tue Jun 5 00:32:46 2007 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=4.2 tests=ALL_TRUSTED,BAYES_00 autolearn=ham version=3.1.3 X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx X-Mailman-Approved-At: Tue, 05 Jun 2007 14:33:00 +0000 Cc: kmacy@freebsd.org, benno@freebsd.org, marius@freebsd.org, marcl@freebsd.org, arch@freebsd.org, jake@freebsd.org, 
tmm@freebsd.org, cognet@freebsd.org, grehan@freebsd.org Subject: Re: New cpu_switch() and cpu_throw(). X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Jun 2007 14:30:58 -0000 On Tuesday 05 June 2007 01:32:46 am Jeff Roberson wrote: > For every architecture we need to support a new features in cpu_switch() > and cpu_throw() before they can support per-cpu schedlock. I'll describe > those below. I'm soliciting help or advice in implementing these on > platforms other than x86, and amd64, especially on ia64 where things are > implemented in C! > > I checked in the new version of cpu_switch() for amd64 today after > threadlock went in. Basically, we have to release a thread's lock when > it's switched out and acquire a lock when it's switched in. > > The release must happen after we're totally done with the stack and > vmspace of the thread to be switched out. On amd64 this meant after we > clear the active bits for tlb shootdown. The release actually makes use > of a new 'mtx' argument to cpu_switch() and sets the td_lock pointer to > this argument rather than unlocking a real lock. td_lock has previously > been set to the blocked lock, which is always blocked. Threads > spinning in thread_lock() will notice the td_lock pointer change and > acquire the new lock. So this is simple, just a non-atomic store with a > pointer passed as an argument. On amd64: > > movq %rdx, TD_LOCK(%rdi) /* Release the old thread */ > > The acquire part is slightly more complicated and involves a little loop. > We don't actually have to spin trying to lock the thread. We just spin > until it's no longer set to the blocked lock. The switching thread > already owns the per-cpu scheduler lock for the current cpu. 
If we're > switching into a thread that is set to the blocked_lock another cpu is > about to set it to our current cpu's lock via the mtx argument mentioned > above. On amd64 we have: > > /* Wait for the new thread to become unblocked */ > movq $blocked_lock, %rdx > 1: > movq TD_LOCK(%rsi),%rcx > cmpq %rcx, %rdx > je 1b If this is to handle a thread migrating from one CPU to the next (and there's no interlock to control migration, otherwise you wouldn't have to spin here) then you will need memory barriers on the first write (i.e. the first write above should be an atomic_store_rel()) and the equivalent of an _acq barrier here. -- John Baldwin
From owner-freebsd-arch@FreeBSD.ORG Tue Jun 5 18:51:37 2007 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 1127016A46E; Tue, 5 Jun 2007 18:51:37 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from webaccess-cl.virtdom.com (webaccess-cl.virtdom.com [216.240.101.25]) by mx1.freebsd.org (Postfix) with ESMTP id AF5AC13C46A; Tue, 5 Jun 2007 18:51:36 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from [192.168.1.101] (c-71-231-138-78.hsd1.or.comcast.net [71.231.138.78]) (authenticated bits=0) by webaccess-cl.virtdom.com (8.13.6/8.13.6) with ESMTP id l55IpXTY070007 (version=TLSv1/SSLv3 cipher=DHE-DSS-AES256-SHA bits=256 verify=NO); Tue, 5 Jun 2007 14:51:34 -0400 (EDT) (envelope-from jroberson@chesapeake.net) Date: Tue, 5 Jun 2007 11:51:18 -0700 (PDT) From: Jeff Roberson X-X-Sender: jroberson@10.0.0.1 To:
John Baldwin In-Reply-To: <200706051012.18864.jhb@freebsd.org> Message-ID: <20070605114745.I606@10.0.0.1> References: <20070604220649.E606@10.0.0.1> <200706051012.18864.jhb@freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed X-Mailman-Approved-At: Tue, 05 Jun 2007 18:57:35 +0000 Cc: marcel@freebsd.org, kmacy@freebsd.org, benno@freebsd.org, marius@freebsd.org, arch@freebsd.org, jake@freebsd.org, freebsd-arch@freebsd.org, tmm@freebsd.org, cognet@freebsd.org, grehan@freebsd.org Subject: Re: New cpu_switch() and cpu_throw(). X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Jun 2007 18:51:37 -0000 On Tue, 5 Jun 2007, John Baldwin wrote: > On Tuesday 05 June 2007 01:32:46 am Jeff Roberson wrote: >> For every architecture we need to support a new features in cpu_switch() >> and cpu_throw() before they can support per-cpu schedlock. I'll describe >> those below. I'm soliciting help or advice in implementing these on >> platforms other than x86, and amd64, especially on ia64 where things are >> implemented in C! >> >> I checked in the new version of cpu_switch() for amd64 today after >> threadlock went in. Basically, we have to release a thread's lock when >> it's switched out and acquire a lock when it's switched in. >> >> The release must happen after we're totally done with the stack and >> vmspace of the thread to be switched out. On amd64 this meant after we >> clear the active bits for tlb shootdown. The release actually makes use >> of a new 'mtx' argument to cpu_switch() and sets the td_lock pointer to >> this argument rather than unlocking a real lock. td_lock has previously >> been set to the blocked lock, which is always blocked. Threads >> spinning in thread_lock() will notice the td_lock pointer change and >> acquire the new lock. 
So this is simple, just a non-atomic store with a >> pointer passed as an argument. On amd64: >> >> movq %rdx, TD_LOCK(%rdi) /* Release the old thread */ >> >> The acquire part is slightly more complicated and involves a little loop. >> We don't actually have to spin trying to lock the thread. We just spin >> until it's no longer set to the blocked lock. The switching thread >> already owns the per-cpu scheduler lock for the current cpu. If we're >> switching into a thread that is set to the blocked_lock another cpu is >> about to set it to our current cpu's lock via the mtx argument mentioned >> above. On amd64 we have: >> >> /* Wait for the new thread to become unblocked */ >> movq $blocked_lock, %rdx >> 1: >> movq TD_LOCK(%rsi),%rcx >> cmpq %rcx, %rdx >> je 1b > > If this is to handle a thread migrating from one CPU to the next (and there's > no interlock to control migration, otherwise you wouldn't have to spin here) > then you will need memory barriers on the first write (i.e. the first write > above should be an atomic_store_rel()) and the equivalent of an _acq barrier > here. So, thanks for pointing this out. Attilio also mentions that on x86 and amd64 we need a pause in the wait loop. As we discussed, we can just use sfence rather than atomics on amd64, however, x86 will need atomics since you can't rely on the presence of *fence. Other architectures will have to ensure memory ordering as appropriate. 
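As a rough illustration of the ordering requirement discussed above, here is a user-space C11 sketch. It is not the kernel code: struct fake_thread, struct fake_mtx, release_old_thread() and wait_for_unblocked() are names made up for this example, and C11 atomics stand in for atomic_store_rel() and the _acq barrier. The switch-out side publishes the new lock pointer with a release store; the switch-in side spins with acquire loads until td_lock no longer points at the blocked lock.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mtx; only its address matters here. */
struct fake_mtx { int dummy; };

/* Hypothetical stand-in for struct thread. */
struct fake_thread {
    _Atomic(struct fake_mtx *) td_lock; /* lock currently protecting the thread */
    int td_state;                       /* plain data written before the handoff */
};

/* Sentinel meaning "this thread is in transit between CPUs". */
static struct fake_mtx blocked_lock;

/*
 * Switch-out side: the release store guarantees that every write made
 * before it (stack, td_state, ...) is visible to a CPU that later
 * observes the new pointer with an acquire load.
 */
static void
release_old_thread(struct fake_thread *td, struct fake_mtx *mtx)
{
    atomic_store_explicit(&td->td_lock, mtx, memory_order_release);
}

/*
 * Switch-in side: spin until td_lock stops pointing at blocked_lock,
 * then return the lock that now owns the thread.
 */
static struct fake_mtx *
wait_for_unblocked(struct fake_thread *td)
{
    struct fake_mtx *m;

    while ((m = atomic_load_explicit(&td->td_lock,
        memory_order_acquire)) == &blocked_lock)
        ;   /* a real x86 wait loop would issue "pause" here */
    return (m);
}
```

The point of the pairing is that the plain store and plain load in the quoted assembly only hand over the pointer; the release/acquire pair also hands over everything written before it. How that maps to fences or locked instructions is left to each architecture.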
Jeff > > -- > John Baldwin >
From owner-freebsd-arch@FreeBSD.ORG Wed Jun 6 01:30:59 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id E100216A41F; Wed, 6 Jun 2007 01:30:59 +0000 (UTC) (envelope-from eric.lemar@isilon.com) Received: from seaxch07.isilon.com (seaxch07.isilon.com [70.103.106.46]) by mx1.freebsd.org (Postfix) with ESMTP id CB3ED13C46E; Wed, 6 Jun 2007 01:30:59 +0000 (UTC) (envelope-from eric.lemar@isilon.com) x-mimeole: Produced By Microsoft Exchange V6.5 Content-class: urn:content-classes:message MIME-Version: 1.0 Date: Tue, 5 Jun 2007 18:17:40 -0700 Message-ID: <896DB1FBFFD5A145833D9DA08CA12A85051A7F@seaxch07.desktop.isilon.com> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: *at family of syscalls in FreeBSD Thread-Index: Acemx1pfnYUkoPOWS7Gw0zmMSQe21gBER+OY References: <20070604162430.GA76813@freebsd.org> From: "Eric Lemar" To: "Roman Divacky" X-Mailman-Approved-At: Wed, 06 Jun 2007 02:48:43 +0000 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: arch@freebsd.org Subject: RE: *at family of syscalls in FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jun 2007 01:31:00 -0000 I'm definitely a fan of this API. Aside from the general thread-related utility of this API, it provides a reasonable API for accessing windows-style ADS streams (subfiles) on a filesystem that supports them and is becoming reasonably cross-platform. This lets you handle things like ADS hanging off directories in a comparatively sane manner. We've actually implemented a subset of these syscalls in-house (Isilon) for use with our filesystem, largely for the ADS-related functionality. Generally speaking, in our tree most of the traditional non-'at' syscalls are just small kernel wrappers around the 'at' interfaces. Overall ends up looking fairly clean and we've ended up using them even in places where we don't need the ADS functionality just because they are so convenient. If you're interested in implementing this API I'd be happy to talk about our implementation and see whether the relevant parts of our implementation would be useful for the general community. thanks, Eric Lemar -------------------------- Eric Lemar Software Development Engineer Isilon Systems elemar@isilon.com ________________________________ From: owner-freebsd-arch@freebsd.org on behalf of Roman Divacky Sent: Mon 6/4/2007 9:24 AM To: arch@freebsd.org Subject: *at family of syscalls in FreeBSD Hi, Starting from Linux 2.6.16, there is available so called *at syscalls. For example openat(), linkat() etc.
Those syscalls are used to avoid races in threaded programs and to implement a per-thread CWD. In other words, they are useful. That's why Linux implemented them. I am currently implementing them in our Linuxulator. As suggested by Robert Watson, I implemented general kern_fooat() functions and wrapped the Linuxulator syscalls around them. It works OK. But I want to introduce native *at syscalls for FreeBSD binaries, hence I am here to discuss the API. My suggestion: use the Linux API with some slight naming changes. Syscalls: openat(), mkdirat(), mknodat(), chownat(), utimesat(), statat(), unlinkat(), renameat(), linkat(), symlinkat(), readlinkat(), chmodat(), accessat(). Example of a syscall: int openat(int dirfd, char *path, int flags, int mode); i.e. exactly the same API as Linux has, with the exception of naming them sanely: for example, instead of fchownat() we have chownat(), because it does not operate on an fd but on a path. I am not sure about compatibility, but we can always introduce a weak reference like fchownat() -> chownat(). I want to have a special AT_FDCWD define (-100) for the "cwd" argument, just like Linux. I am currently finishing my Linuxulator-side work and I'd like to see this in 7.0R, so please comment on the API and the idea of introducing those syscalls in FreeBSD.
thank you Roman Divacky _______________________________________________ freebsd-arch@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-arch To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" From owner-freebsd-arch@FreeBSD.ORG Wed Jun 6 07:44:33 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 281E216A421 for ; Wed, 6 Jun 2007 07:44:33 +0000 (UTC) (envelope-from rdivacky@vlk.vlakno.cz) Received: from vlakno.cz (vlk.vlakno.cz [62.168.28.247]) by mx1.freebsd.org (Postfix) with ESMTP id D961513C484 for ; Wed, 6 Jun 2007 07:44:32 +0000 (UTC) (envelope-from rdivacky@vlk.vlakno.cz) Received: from localhost (localhost [127.0.0.1]) by vlakno.cz (Postfix) with ESMTP id 4455A8BDB7E; Wed, 6 Jun 2007 09:44:31 +0200 (CEST) X-Virus-Scanned: amavisd-new at vlakno.cz Received: from vlakno.cz ([127.0.0.1]) by localhost (vlk.vlakno.cz [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id vfxRay1T4z+U; Wed, 6 Jun 2007 09:44:30 +0200 (CEST) Received: from vlk.vlakno.cz (localhost [127.0.0.1]) by vlakno.cz (Postfix) with ESMTP id 2ADAC8BDB73; Wed, 6 Jun 2007 09:44:30 +0200 (CEST) Received: (from rdivacky@localhost) by vlk.vlakno.cz (8.13.8/8.13.8/Submit) id l567iTvF042064; Wed, 6 Jun 2007 09:44:29 +0200 (CEST) (envelope-from rdivacky) Date: Wed, 6 Jun 2007 09:44:29 +0200 From: Roman Divacky To: Eric Lemar Message-ID: <20070606074429.GA42032@freebsd.org> References: <20070604162430.GA76813@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A7F@seaxch07.desktop.isilon.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <896DB1FBFFD5A145833D9DA08CA12A85051A7F@seaxch07.desktop.isilon.com> User-Agent: Mutt/1.4.2.3i Cc: arch@freebsd.org Subject: Re: *at family of syscalls in FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 
Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jun 2007 07:44:33 -0000 On Tue, Jun 05, 2007 at 06:17:40PM -0700, Eric Lemar wrote: > I'm definitely a fan of this API. Aside from the general thread-related > utility of this API, it provides a reasonable API for accessing > windows-style ADS streams (subfiles) on a filesystem that supports them > and is becoming reasonably cross-platform. This lets you handle things > like ADS hanging off directories in a comparatively sane manner. > > We've actually implemented a subset of these syscalls in-house (Isilon) > for use with our filesystem, largely for the ADS-related functionality. > Generally speaking, in our tree most of the traditional non-'at' syscalls > are just small kernel wrappers around the 'at' interfaces. Overall ends > up looking fairly clean and we've ended up using them even in places > where we don't need the ADS functionality just because they are so > convenient. > > If you're interested in implementing this API I'd be happy to talk about > our implementation and see whether the relevant parts of our implementation > would be useful for the general community. My current patch is at: www.vlakno.cz/~rdivacky/linux_at.patch It does not implement the native FreeBSD syscalls, only the Linuxulator ones, but adding those is a matter of minutes. I asked for a review by pjd and/or rwatson, and hopefully this will get committed soon.
roman From owner-freebsd-arch@FreeBSD.ORG Wed Jun 6 08:03:51 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 5788A16A400; Wed, 6 Jun 2007 08:03:51 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from relay01.kiev.sovam.com (relay01.kiev.sovam.com [62.64.120.200]) by mx1.freebsd.org (Postfix) with ESMTP id E226D13C4C6; Wed, 6 Jun 2007 08:03:50 +0000 (UTC) (envelope-from kostikbel@gmail.com) Received: from [89.162.146.170] (helo=skuns.kiev.zoral.com.ua) by relay01.kiev.sovam.com with esmtps (TLSv1:AES256-SHA:256) (Exim 4.60) (envelope-from ) id 1HvqUe-0004Z1-EO; Wed, 06 Jun 2007 11:03:49 +0300 Received: from deviant.kiev.zoral.com.ua (root@deviant.kiev.zoral.com.ua [10.1.1.148]) by skuns.kiev.zoral.com.ua (8.14.1/8.14.1) with ESMTP id l5683TlZ052259 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 6 Jun 2007 11:03:29 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: from deviant.kiev.zoral.com.ua (kostik@localhost [127.0.0.1]) by deviant.kiev.zoral.com.ua (8.14.1/8.14.1) with ESMTP id l5683Tpg090925; Wed, 6 Jun 2007 11:03:29 +0300 (EEST) (envelope-from kostikbel@gmail.com) Received: (from kostik@localhost) by deviant.kiev.zoral.com.ua (8.14.1/8.14.1/Submit) id l5683TVJ090924; Wed, 6 Jun 2007 11:03:29 +0300 (EEST) (envelope-from kostikbel@gmail.com) X-Authentication-Warning: deviant.kiev.zoral.com.ua: kostik set sender to kostikbel@gmail.com using -f Date: Wed, 6 Jun 2007 11:03:28 +0300 From: Kostik Belousov To: Roman Divacky Message-ID: <20070606080328.GZ2268@deviant.kiev.zoral.com.ua> References: <20070604162430.GA76813@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A7F@seaxch07.desktop.isilon.com> <20070606074429.GA42032@freebsd.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="8f0Rp8CLSOFvlIcd" 
Content-Disposition: inline In-Reply-To: <20070606074429.GA42032@freebsd.org> User-Agent: Mutt/1.4.2.2i X-Virus-Scanned: ClamAV version 0.90.2, clamav-milter version 0.90.2 on skuns.kiev.zoral.com.ua X-Virus-Status: Clean X-Spam-Status: No, score=-1.4 required=5.0 tests=ALL_TRUSTED autolearn=failed version=3.1.8 X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on skuns.kiev.zoral.com.ua X-Scanner-Signature: 362d5034abbfed611c52a2913ea5ff69 X-DrWeb-checked: yes X-SpamTest-Envelope-From: kostikbel@gmail.com X-SpamTest-Group-ID: 00000000 X-SpamTest-Header: Not Detected X-SpamTest-Info: Profiles 1119 [June 05 2007] X-SpamTest-Info: helo_type=3 X-SpamTest-Method: none X-SpamTest-Rate: 0 X-SpamTest-Status: Not detected X-SpamTest-Status-Extended: not_detected X-SpamTest-Version: SMTP-Filter Version 3.0.0 [0255], KAS30/Release Cc: Eric Lemar , arch@freebsd.org Subject: Re: *at family of syscalls in FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jun 2007 08:03:51 -0000 --8f0Rp8CLSOFvlIcd Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Jun 06, 2007 at 09:44:29AM +0200, Roman Divacky wrote: > On Tue, Jun 05, 2007 at 06:17:40PM -0700, Eric Lemar wrote: > > I'm definitely a fan of this API. Aside from the general thread-related > > utility of this API, it provides a reasonable API for accessing > > windows-style ADS streams (subfiles) on a filesystem that supports them > > and is becoming reasonably cross-platform. This lets you handle things > > like ADS hanging off directories in a comparatively sane manner. > > > > We've actually implemented a subset of these syscalls in-house (Isilon) > > for use with our filesystem, largely for the ADS-related functionality.
> > Generally speaking, in our tree most of the traditional non-'at' syscalls > > are just small kernel wrappers around the 'at' interfaces. Overall ends > > up looking fairly clean and we've ended up using them even in places > > where we don't need the ADS functionality just because they are so > > convenient. > > > > If you're interested in implementing this API I'd be happy to talk about > > our implementation and see whether the relevant parts of our implementation > > would be useful for the general community. > > my current patch is at: www.vlakno.cz/~rdivacky/linux_at.patch > > it does not implement the native fbsd syscalls, only the linuxulator ones > but adding those is a matter of minutes. I asked for a review by pjd and/or > rwatson and hopefully this will get committed soon.. I think it would be very useful to look at the Isilon implementation, and possibly merge your patch with theirs. In particular, it could give insight into what the real uses for the API/KPI are.
--8f0Rp8CLSOFvlIcd Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (FreeBSD) iD8DBQFGZmpQC3+MBN1Mb4gRAr7vAJ9uMZ8xkaYgtGoSUb0s/nZ/e2N3sgCcCBsD Pv+Ch7hZhNSLbIArrYrPFZM= =YU8C -----END PGP SIGNATURE----- --8f0Rp8CLSOFvlIcd-- From owner-freebsd-arch@FreeBSD.ORG Wed Jun 6 16:46:58 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 7BC8F16A46C for ; Wed, 6 Jun 2007 16:46:58 +0000 (UTC) (envelope-from howard0su@gmail.com) Received: from py-out-1112.google.com (py-out-1112.google.com [64.233.166.176]) by mx1.freebsd.org (Postfix) with ESMTP id 1A05613C4B0 for ; Wed, 6 Jun 2007 16:46:57 +0000 (UTC) (envelope-from howard0su@gmail.com) Received: by py-out-1112.google.com with SMTP id a29so371614pyi for ; Wed, 06 Jun 2007 09:46:57 -0700 (PDT) DKIM-Signature: a=rsa-sha1; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:mime-version:content-type:content-transfer-encoding:content-disposition; b=oF0Pnot81oUmHqHJ9ovZDX5PG7IqiO12Nm+bAwC96ngtq/cYEAjgccXZxwS7DGGutEnBzVIk1DoPs3f6dhaesk+k3HRET844oaZz9OfL/9H4K1RMSNcqbckZLPEk+dpwPDaCHVw1nyrDtS22g5ptUwhXM1Wox0MrnGWPS00ZbXU= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:to:subject:cc:mime-version:content-type:content-transfer-encoding:content-disposition; b=haY3vRO7OYE2CHrVnuxFkF3FszEWi5AWrDboPTbYYXcQimthZToBqOyDu0Lb2OPee1anPi5r3wGbEJ6N9lSYygfcRwKDY12o3YYoDzwShEdx2MdudBjRBNhIJl9Qb1TFnCuhFGzQ/I4xwmplIypYqdqwezNuqRiW8KMA7cxoK7E= Received: by 10.35.71.1 with SMTP id y1mr1210488pyk.1181146690980; Wed, 06 Jun 2007 09:18:10 -0700 (PDT) Received: by 10.35.79.18 with HTTP; Wed, 6 Jun 2007 09:18:10 -0700 (PDT) Message-ID: Date: Thu, 7 Jun 2007 00:18:10 +0800 From: "Howard Su" To: arch@freebsd.org MIME-Version: 1.0 
Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline Cc: alc@freebsd.org Subject: help on lock around vm_page X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jun 2007 16:46:58 -0000

I would like some help from the VM gurus. I am trying to fix a panic in
tmpfs, and I really want to solve it so that tmpfs can be pushed into
-Current.

1. We allocate an object with vm_pager_alloc(OBJT_SWAP, ...) when a file is
   created.
2. The panic occurs while handling a write operation:
   a) find the first page we want to write
   b) call vm_page_grab to get the page from the object
   c) call sf_buf_alloc to map it into kernel_map
   d) use uiomove to move the data
   e) mark the page as dirty
   f) loop back to a) until all pages are handled

There is a race condition. While doing b), c) and e) we hold the
OBJ_LOCK/page_queue_lock, but for d) we have to drop the locks to call
uiomove. While uiomove runs, the page may be moved to the cache queue,
since at that time it is not yet dirty.

One solution is to allocate a page buffer. Before a), we uiomove the data
into the buffer, and in d) we replace uiomove with a bcopy. Then we can
hold the lock from b) through e). I suspect this will cause a performance
problem.

For the detailed code, please check:
http://perforce.freebsd.org/fileViewer.cgi?FSPC=//depot/user/howardsu/truss/sys/fs/tmpfs/tmpfs%5fvnops.c&REV=30
function: tmpfs_uio_xfer()

Any ideas on how to close this race condition?

PS: If you can review my use of the VM code, it will be appreciated.
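For illustration only, the buffered alternative described above (copy the user data first, then update the page entirely under the lock) might look like the following user-space C sketch. Everything in it is a hypothetical stand-in: struct fake_page, page_lock(), page_unlock(), fake_uiomove() and page_write_bounced() are invented names, not kernel interfaces, and the real tmpfs_uio_xfer() works with vm_page_grab, sf_buf_alloc and uiomove instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096	/* assumed page size for this sketch */

/* Hypothetical stand-in for a VM page; not the kernel's struct vm_page. */
struct fake_page {
	int locked;	/* models OBJ_LOCK/page_queue_lock being held */
	int dirty;	/* models the page's dirty state */
	unsigned char data[SKETCH_PAGE_SIZE];
};

static void
page_lock(struct fake_page *pg)
{
	assert(!pg->locked);
	pg->locked = 1;
}

static void
page_unlock(struct fake_page *pg)
{
	assert(pg->locked);
	pg->locked = 0;
}

/*
 * Stand-in for uiomove(): in the kernel the copy from user space may
 * fault and sleep, so it has to run without the locks held.
 */
static void
fake_uiomove(void *dst, const void *user_src, size_t len)
{
	memcpy(dst, user_src, len);
}

/*
 * Write len bytes at offset off into one page.
 * The user data is first copied into a private bounce buffer with no
 * lock held; the page is then filled (the bcopy step) and marked dirty
 * under a single lock hold. Returns 0 on success, -1 on a bad range.
 */
static int
page_write_bounced(struct fake_page *pg, size_t off, const void *user_src,
    size_t len)
{
	unsigned char bounce[SKETCH_PAGE_SIZE];

	if (off > SKETCH_PAGE_SIZE || len > SKETCH_PAGE_SIZE - off)
		return (-1);

	fake_uiomove(bounce, user_src, len);	/* unlocked copy-in */

	page_lock(pg);
	memcpy(pg->data + off, bounce, len);	/* bcopy instead of uiomove */
	pg->dirty = 1;				/* dirty before unlocking */
	page_unlock(pg);
	return (0);
}
```

The extra pass through the bounce buffer is the additional copy that raises the performance concern mentioned above.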
-- -Howard From owner-freebsd-arch@FreeBSD.ORG Wed Jun 6 18:04:46 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id EE84716A468 for ; Wed, 6 Jun 2007 18:04:46 +0000 (UTC) (envelope-from dougb@FreeBSD.org) Received: from mail2.fluidhosting.com (mx21.fluidhosting.com [204.14.89.4]) by mx1.freebsd.org (Postfix) with SMTP id 87B8713C44C for ; Wed, 6 Jun 2007 18:04:46 +0000 (UTC) (envelope-from dougb@FreeBSD.org) Received: (qmail 25462 invoked by uid 399); 6 Jun 2007 17:38:05 -0000 Received: from localhost (HELO lap.dougb.net) (dougb@dougbarton.us@127.0.0.1) by localhost with SMTP; 6 Jun 2007 17:38:05 -0000 X-Originating-IP: 127.0.0.1 Message-ID: <4666F0FB.8020101@FreeBSD.org> Date: Wed, 06 Jun 2007 10:38:03 -0700 From: Doug Barton Organization: http://www.FreeBSD.org/ User-Agent: Thunderbird 2.0.0.0 (X11/20070525) MIME-Version: 1.0 To: Roman Divacky References: <20070604162430.GA76813@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A7F@seaxch07.desktop.isilon.com> <20070606074429.GA42032@freebsd.org> In-Reply-To: <20070606074429.GA42032@freebsd.org> X-Enigmail-Version: 0.95.0 OpenPGP: id=D5B2F0FB Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: Eric Lemar , arch@freebsd.org Subject: Re: *at family of syscalls in FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jun 2007 18:04:47 -0000 Roman Divacky wrote: > my current patch is at: www.vlakno.cz/~rdivacky/linux_at.patch > > it does not implement the native fbsd syscalls, only the linuxulator ones > but adding those is a matter of minutes. I asked for a review by pjd and/or > rwatson and hopefully this will get commited soon.. 
My recollection of the last round of discussion was that you were asked to
implement these in our base, then wrap the linux versions. If it's trivial
to implement, it should probably be done that way first.

I also think you should take up the offer of collaboration with the Isilon
guys. Getting the benefit of their experience is a good thing. :)

Doug

-- 
This .signature sanitized for your protection

From owner-freebsd-arch@FreeBSD.ORG Wed Jun 6 21:18:26 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8976616A468; Wed, 6 Jun 2007 21:18:26 +0000 (UTC) (envelope-from eric.lemar@isilon.com) Received: from seaxch07.isilon.com (seaxch07.isilon.com [70.103.106.46]) by mx1.freebsd.org (Postfix) with ESMTP id 4A7AD13C447; Wed, 6 Jun 2007 21:18:26 +0000 (UTC) (envelope-from eric.lemar@isilon.com) x-mimeole: Produced By Microsoft Exchange V6.5 Content-class: urn:content-classes:message MIME-Version: 1.0 Date: Wed, 6 Jun 2007 14:18:26 -0700 Message-ID: <896DB1FBFFD5A145833D9DA08CA12A85051A83@seaxch07.desktop.isilon.com> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: *at family of syscalls in FreeBSD Thread-Index: AceoETWLuByOmf2mTHi6v5uudmU34gAYaTyP References: <20070604162430.GA76813@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A7F@seaxch07.desktop.isilon.com> <20070606074429.GA42032@freebsd.org> <20070606080328.GZ2268@deviant.kiev.zoral.com.ua> From: "Eric Lemar" To: "Kostik Belousov" , "Roman Divacky" Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: arch@freebsd.org Subject: RE: *at family of syscalls in FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date:
Wed, 06 Jun 2007 21:18:26 -0000

Ok, I'll get together a rough patch of what we are doing. Strict POSIX
conformance wasn't a top priority and it's currently in a FreeBSD 6ish
branch, so I'm sure it will require at least a small amount of work to be
useful. Unfortunately there's a bit of other code to untangle this from and
(as always) other stuff to keep me from spending time untangling, so it
will probably be a few days before I get a chance to put it together...

thanks,
Eric Lemar

________________________________
From: Kostik Belousov [mailto:kostikbel@gmail.com]
Sent: Wed 6/6/2007 1:03 AM
To: Roman Divacky
Cc: Eric Lemar; arch@freebsd.org
Subject: Re: *at family of syscalls in FreeBSD

On Wed, Jun 06, 2007 at 09:44:29AM +0200, Roman Divacky wrote:
> On Tue, Jun 05, 2007 at 06:17:40PM -0700, Eric Lemar wrote:
> > I'm definitely a fan of this API. Aside from the general thread-related
> > utility of this API, it provides a reasonable API for accessing
> > windows-style ADS streams (subfiles) on a filesystem that supports them
> > and is becoming reasonably cross-platform. This lets you handle things
> > like ADS hanging off directories in a comparatively sane manner.
> >
> > We've actually implemented a subset of these syscalls in-house (Isilon)
> > for use with our filesystem, largely for the ADS-related functionality.
> > Generally speaking, in our tree most of the traditional non-'at' syscalls
> > are just small kernel wrappers around the 'at' interfaces. Overall ends
> > up looking fairly clean and we've ended up using them even in places
> > where we don't need the ADS functionality just because they are so
> > convenient.
> >
> > If you're interested in implementing this API I'd be happy to talk about
> > our implementation and see whether the relevant parts of our implementation
> > would be useful for the general community.
>
> my current patch is at: www.vlakno.cz/~rdivacky/linux_at.patch
>
> it does not implement the native fbsd syscalls, only the linuxulator ones
> but adding those is a matter of minutes. I asked for a review by pjd and/or
> rwatson and hopefully this will get committed soon..

I think it would be very useful to look at Isilon implementation, and
possibly merge your and their patch. In particular, it could give an
insight of what real uses for the API/KPI are.

From owner-freebsd-arch@FreeBSD.ORG Wed Jun 6 22:31:11 2007 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 0FD8816A41F; Wed, 6 Jun 2007 22:31:11 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from webaccess-cl.virtdom.com (webaccess-cl.virtdom.com [216.240.101.25]) by mx1.freebsd.org (Postfix) with ESMTP id D9C2613C45B; Wed, 6 Jun 2007 22:31:10 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from [192.168.1.101] (c-71-231-138-78.hsd1.or.comcast.net [71.231.138.78]) (authenticated bits=0) by webaccess-cl.virtdom.com (8.13.6/8.13.6) with ESMTP id l56MV74Q046070 (version=TLSv1/SSLv3 cipher=DHE-DSS-AES256-SHA bits=256 verify=NO); Wed, 6 Jun 2007 18:31:09 -0400 (EDT) (envelope-from jroberson@chesapeake.net) Date: Wed, 6 Jun 2007 15:30:49 -0700 (PDT) From: Jeff Roberson X-X-Sender: jroberson@10.0.0.1 To: Bruce Evans In-Reply-To: <20070605214404.X47001@delplex.bde.org> Message-ID: <20070606152352.H606@10.0.0.1> References: <20070529105856.L661@10.0.0.1> <200705291456.38515.jhb@freebsd.org> <20070529121653.P661@10.0.0.1> <20070530065423.H93410@delplex.bde.org> <20070529141342.D661@10.0.0.1> <20070530125553.G12128@besplex.bde.org> <20070529201255.X661@10.0.0.1> <20070529220936.W661@10.0.0.1> <20070530201618.T13220@besplex.bde.org> <20070530115752.F661@10.0.0.1> <20070531091419.S826@besplex.bde.org> <20070531010631.N661@10.0.0.1>
<20070601154833.O4207@besplex.bde.org> <20070601014601.I799@10.0.0.1> <20070601200348.G6201@delplex.bde.org> <20070601123530.B606@10.0.0.1> <20070604160036.N1084@besplex.bde.org> <46652D17.5090903@FreeBSD.org> <20070605214404.X47001@delplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Attilio Rao , freebsd-arch@freebsd.org Subject: Re: Updated rusage patch X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 06 Jun 2007 22:31:11 -0000 On Tue, 5 Jun 2007, Bruce Evans wrote: > On Tue, 5 Jun 2007, Attilio Rao wrote: > >> Yes, I always wondered why proc0_post() doesn't initialize [s,i,u]ticks >> too. >> However, could you please give a look and a try to this patch: >> http://users.gufi.org/~rookie/works/patches/schedlock/proc_post.diff >> >> and see if it solves your problem. > > This can probably be fixed more simply by calling rufetch() to reset the > time state in threads as a side effect. Do this before resetting the > state in the process. Ok, I agree with bde here, just call rufetch and this will clear each thread, and then you can clear the rux in the proc. I'd like to make a list of the remaining problems with rusage and potential fixes. Then we can decide which ones myself and attilio will resolve immediately to clean up some of the effect of the sched lock changes. 1) The ruadd() in thread_exit() is not safe since we're accessing another thread's unlocked rusage structure. Potential solution is to allocate p_ru as part of the proc struct and add into there, which will be protected by the PROC_SLOCK, which bde seemed to like better anyway. 2) We may lose information between exit1() and thread_exit() due to the way p_ru is initialized before we're done exiting. 
There also seems to be a race where wait() operates on a process before it's done in thread_exit() which means wait may return rusage information without the child added in! The solution will be to fix this race, and then access p_ru directly in wait(). 3) There is no locking around rufetch() and calcru(). calcru() may apply new rux values to an old rusage, giving inaccurate results. The solution is to either require the proc slock around both calls, or provide a new routine which does the fetch and calc while grabbing the lock itself. 4) libkvm has had the rusage support if'd out because I broke it when I removed pstats.p_ru. Do we care about rusage in libkvm? Should we go to the trouble of traversing the list of threads and sum'ing it up? Can we export some subset of the information? Does anyone use this (other than bde ;). Jeff > > ANother minor problem is that proc0_post() isn't the right place to > reset the time state except for proc0. It hacks on some of the time > state for all processes and thus for all CPUs, but only resets switchtime > and switchticks for the current CPU. Resetting of switchtime and > switchticks for startup of other CPUs now seems to be in sched_throw(). > I think this is not properly synchronous with the resetting of the > rest of the state, but the errors from this are tiny. 
> > Bruce > From owner-freebsd-arch@FreeBSD.ORG Thu Jun 7 00:41:49 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 3B7EB16A46C for ; Thu, 7 Jun 2007 00:41:49 +0000 (UTC) (envelope-from howard0su@gmail.com) Received: from py-out-1112.google.com (py-out-1112.google.com [64.233.166.177]) by mx1.freebsd.org (Postfix) with ESMTP id B2AF213C455 for ; Thu, 7 Jun 2007 00:41:48 +0000 (UTC) (envelope-from howard0su@gmail.com) Received: by py-out-1112.google.com with SMTP id a29so588603pyi for ; Wed, 06 Jun 2007 17:41:48 -0700 (PDT) DKIM-Signature: a=rsa-sha1; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:references; b=GlMQOs/ZpG4BWihpDQYceptsVrTxgcaRilF2Rlun0guRK7W2WPOc2Q7sixOfKHIxraX3qQrfROyxTw7V2RYYjviTQiHwFwqsUoZa0DhVjo2V018Oi6Ip5vfhzLhpLnt/fOB7vPHQTp+kTq4hEHhvUh8dewgiOGEjWkz7InQ1A2k= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:to:subject:cc:in-reply-to:mime-version:content-type:references; b=uhxLbuGJFSUCTBRvbOkxYPq1BcXAny41XJORF11mAwtABF/Ps8whFt0nn49cl6mml9UOUOGZh1R7QdWJ3hC3vJ8kmmsSRoVZeHH/91c5EovA9xpuD7O/nxpYpvrmkxi0M2O7cYP9saMNiTTBB46sslp9JmhtaLxL67N1PuiMbEo= Received: by 10.35.71.1 with SMTP id y1mr1914257pyk.1181176908060; Wed, 06 Jun 2007 17:41:48 -0700 (PDT) Received: by 10.35.81.19 with HTTP; Wed, 6 Jun 2007 17:41:48 -0700 (PDT) Message-ID: Date: Thu, 7 Jun 2007 08:41:48 +0800 From: "Howard Su" To: arch@freebsd.org In-Reply-To: MIME-Version: 1.0 References: Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: alc@freebsd.org Subject: Re: help on lock around vm_page X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list 
List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Jun 2007 00:41:49 -0000

Some further reading gives me a different picture: the pages returned by
vm_page_grab should have the VPO_BUSY flag set, which will prevent anyone
else from moving the page to the cache. So the current approach should be
OK. Am I right? Maybe there are some edge cases, since I only hit this bug
in a low-memory situation.

-- 
-Howard

On 6/7/07, Howard Su wrote:
>
> I want some helps from VM guru. I try to fix a panic in tmpfs. In
> order to push tmpfs into -Current, I really want some help to solve
> this.
>
> 1. we allocate an object from vm_pager_alloc(OBJT_SWAP, ...) when create a
> file.
> 2. the panic is during handling write op:
> a) find the first page we want to write
> b) call vm_page_grab to get the page from object.
> c) call use sf_buf_alloc to map it into kernel_map
> d) use uiomove to move the data
> e) mark page as dirty
> f) loop to a until all pages are handled.
>
> there is a race condition. while doing b-c & e, we hold the
> OBJ_LOCK/page_queue_lock. when doing d, we have to drop the locks to
> call uiomove. When calling uio move, the page may moved to cache queue
> since in that time it is not dirty.
>
> There is a solution that we allocate a page buffer. Before a), we
> uiomove it to the buffer and replace uiomove with a bcopy in d). Then,
> we can hold lock in b - e. I feel this will cause performance problem.
>
> For the detailed code, please check:
>
> http://perforce.freebsd.org/fileViewer.cgi?FSPC=//depot/user/howardsu/truss/sys/fs/tmpfs/tmpfs%5fvnops.c&REV=30
>
> function: tmpfs_uio_xfer()
>
> Any idea to close this race condition?
>
> PS: If you can review my code about usage of vm, it will be appreciated.
> > -- > -Howard > From owner-freebsd-arch@FreeBSD.ORG Thu Jun 7 07:04:59 2007 Return-Path: X-Original-To: arch@FreeBSD.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 3FACE16A468 for ; Thu, 7 Jun 2007 07:04:59 +0000 (UTC) (envelope-from rdivacky@vlk.vlakno.cz) Received: from vlakno.cz (vlk.vlakno.cz [62.168.28.247]) by mx1.freebsd.org (Postfix) with ESMTP id E753C13C447 for ; Thu, 7 Jun 2007 07:04:58 +0000 (UTC) (envelope-from rdivacky@vlk.vlakno.cz) Received: from localhost (localhost [127.0.0.1]) by vlakno.cz (Postfix) with ESMTP id 5BD908BDC3F; Thu, 7 Jun 2007 09:04:57 +0200 (CEST) X-Virus-Scanned: amavisd-new at vlakno.cz Received: from vlakno.cz ([127.0.0.1]) by localhost (vlk.vlakno.cz [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id RUdjiTYB4njn; Thu, 7 Jun 2007 09:04:56 +0200 (CEST) Received: from vlk.vlakno.cz (localhost [127.0.0.1]) by vlakno.cz (Postfix) with ESMTP id 56E8E8BDC2D; Thu, 7 Jun 2007 09:04:56 +0200 (CEST) Received: (from rdivacky@localhost) by vlk.vlakno.cz (8.13.8/8.13.8/Submit) id l5774t7P071117; Thu, 7 Jun 2007 09:04:55 +0200 (CEST) (envelope-from rdivacky) Date: Thu, 7 Jun 2007 09:04:55 +0200 From: Roman Divacky To: Doug Barton Message-ID: <20070607070455.GA71012@freebsd.org> References: <20070604162430.GA76813@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A7F@seaxch07.desktop.isilon.com> <20070606074429.GA42032@freebsd.org> <4666F0FB.8020101@FreeBSD.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4666F0FB.8020101@FreeBSD.org> User-Agent: Mutt/1.4.2.3i Cc: Eric Lemar , arch@FreeBSD.org Subject: Re: *at family of syscalls in FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Jun 
2007 07:04:59 -0000

On Wed, Jun 06, 2007 at 10:38:03AM -0700, Doug Barton wrote:
> Roman Divacky wrote:
>
> > my current patch is at: www.vlakno.cz/~rdivacky/linux_at.patch
> >
> > it does not implement the native fbsd syscalls, only the linuxulator ones
> > but adding those is a matter of minutes. I asked for a review by pjd and/or
> > rwatson and hopefully this will get committed soon..
>
> My recollection of the last round of discussion was that you were
> asked to implement these in our base, then wrap the linux versions. If
> it's trivial to implement, it should probably be done that way first.

well.. my implementation does exactly that :) I changed the namei() routine
to check ni_startdir for a possible start directory other than the CWD and
made the kern_fooat() functions use it. The change is mostly self-contained
in vfs_syscalls (apart from prototypes in sys/sys/syscallsubr.h); with a
very slight change (5 lines?) to vfs_lookup.c + namei.h it's quite simple.

I can't comment on Isilon's version. My patch (the approach) has been OKed
by rwatson@ and pjd@. I'd really love to see this go into -current before
RELENG_7, so time is short.

Can someone from Isilon comment on their version so we can compare benefits
etc.?
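As a rough user-space illustration of the start-directory choice described above, the C sketch below decides where a lookup would begin. It is not taken from the patch: pick_start(), pick_start_cwd(), the enum and SKETCH_FDCWD are invented names, and the rule that an absolute path makes the directory fd irrelevant is assumed from the usual openat() convention rather than from anything namei() is stated to do here.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_FDCWD (-100)	/* hypothetical "start at the CWD" marker */

/* Where a lookup starts; names are invented for this sketch. */
enum lookup_start {
	START_ROOT,	/* absolute path: directory fd is not consulted */
	START_CWD,	/* relative path, no explicit start directory */
	START_DIRFD,	/* relative path, start at the given directory fd */
	START_BADFD	/* relative path with an unusable directory fd */
};

/* Decide the start directory for a lookup of path relative to dirfd. */
static enum lookup_start
pick_start(int dirfd, const char *path)
{
	if (path != NULL && path[0] == '/')
		return (START_ROOT);
	if (dirfd == SKETCH_FDCWD)
		return (START_CWD);
	if (dirfd < 0)
		return (START_BADFD);
	return (START_DIRFD);
}

/* Non-'at' variant: the same lookup with the CWD as start directory. */
static enum lookup_start
pick_start_cwd(const char *path)
{
	return (pick_start(SKETCH_FDCWD, path));
}
```

In this sketch the non-'at' call is only a thin wrapper, so the start-directory logic lives in one place.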
roman From owner-freebsd-arch@FreeBSD.ORG Thu Jun 7 19:32:50 2007 Return-Path: X-Original-To: freebsd-arch@FreeBSD.org Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 2E90E16A46C for ; Thu, 7 Jun 2007 19:32:50 +0000 (UTC) (envelope-from stas@FreeBSD.org) Received: from com1.ht-systems.ru (com1.ht-systems.ru [83.97.104.204]) by mx1.freebsd.org (Postfix) with ESMTP id DD21013C447 for ; Thu, 7 Jun 2007 19:32:49 +0000 (UTC) (envelope-from stas@FreeBSD.org) Received: from [85.21.245.235] (helo=phonon.SpringDaemons.com) by com1.ht-systems.ru with esmtpa (Exim 4.62) (envelope-from ) id 1HwLwW-000325-04; Thu, 07 Jun 2007 21:38:40 +0400 Received: from localhost (localhost [127.0.0.1]) by phonon.SpringDaemons.com (Postfix) with SMTP id DC22F1145D; Thu, 7 Jun 2007 21:36:55 +0400 (MSD) Date: Thu, 7 Jun 2007 21:36:50 +0400 From: Stanislav Sedov To: freebsd-arch@FreeBSD.org Message-Id: <20070607213650.c02130bf.stas@FreeBSD.org> Organization: The FreeBSD Project X-Mailer: carrier-pigeon X-Voice: +7 916 849 20 23 X-XMPP: ssedov@jabber.ru X-ICQ: 208105021 X-Yahoo: stanislav_sedov X-PGP-Fingerprint: F21E D6CC 5626 9609 6CE2 A385 2BF5 5993 EB26 9581 X-University: MEPhI Mime-Version: 1.0 Content-Type: multipart/signed; protocol="application/pgp-signature"; micalg="PGP-SHA1"; boundary="Signature=_Thu__7_Jun_2007_21_36_50_+0400_q=oaWEyO7sI7gu3K" X-Spam-Flag: SKIP X-Spam-Yversion: Spamooborona 1.6.0 Cc: freebsd-hackers@FreeBSD.org, timur@gnu.org Subject: setegid bug X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Jun 2007 19:32:50 -0000 --Signature=_Thu__7_Jun_2007_21_36_50_+0400_q=oaWEyO7sI7gu3K Content-Type: text/plain; charset=US-ASCII Content-Disposition: inline Content-Transfer-Encoding: 
quoted-printable

Hi!

Recently several FreeBSD samba users reported a scary problem with samba
(http://bugzilla.samba.org/?id=3990). Further research in cooperation with
Timur Bakeyev (timur) showed that we have a little problem with the setegid
implementation. In FreeBSD (and even in 4.4BSD-Lite2) the egid of a process
is merely groups[0], so when calling the setegid function we simply
overwrite the first of the supplementary groups. However, POSIX says that
neither the rgid nor any of the supplementary groups should be rewritten by
a setegid call.

There are some comments about optimizations which led to this scary
implementation, but I can't work out what these optimizations are. Our
first cvs revision of kern_prot.c already contains a similar implementation
with egid being effectively groups[0]. Perhaps some of the old-school
committers remember the initial intention behind making egid equal to
groups[0]? Or perhaps I have missed something?

Thanks a lot!

-- 
Stanislav Sedov ST4096-RIPE

--Signature=_Thu__7_Jun_2007_21_36_50_+0400_q=oaWEyO7sI7gu3K Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.3 (FreeBSD) iD8DBQFGaEI3K/VZk+smlYERAkwdAJ9Sp8lDY3Pq9ip1bx9M67GR+w+cPgCeI6EK S1nHdh1Q416bECsdbapzk70= =skA0 -----END PGP SIGNATURE----- --Signature=_Thu__7_Jun_2007_21_36_50_+0400_q=oaWEyO7sI7gu3K-- From owner-freebsd-arch@FreeBSD.ORG Thu Jun 7 19:36:03 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 4988016A421; Thu, 7 Jun 2007 19:36:03 +0000 (UTC) (envelope-from eric.lemar@isilon.com) Received: from seaxch07.isilon.com (seaxch07.isilon.com [70.103.106.46]) by mx1.freebsd.org (Postfix) with ESMTP id 2607913C45B; Thu, 7 Jun 2007 19:36:03 +0000 (UTC) (envelope-from eric.lemar@isilon.com) x-mimeole: Produced By Microsoft Exchange V6.5 Content-class: urn:content-classes:message MIME-Version: 1.0 Date: Thu, 7 Jun 2007 12:31:56
-0700 Message-ID: <896DB1FBFFD5A145833D9DA08CA12A85051A84@seaxch07.desktop.isilon.com> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: *at family of syscalls in FreeBSD Thread-Index: Aceo0inqRC+S5FhYQpCTOlQ4A3wpJwAaFiul References: <20070604162430.GA76813@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A7F@seaxch07.desktop.isilon.com> <20070606074429.GA42032@freebsd.org> <4666F0FB.8020101@FreeBSD.org> <20070607070455.GA71012@freebsd.org> From: "Eric Lemar" To: "Roman Divacky" Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: arch@freebsd.org Subject: RE: *at family of syscalls in FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Jun 2007 19:36:03 -0000

I certainly don't want to hold anything back from getting into the release.
I haven't had a chance to compare all the details of your patch and ours,
but I've spent a bit of time looking through yours, and I'd say (not
surprisingly) that what we've done is much more similar than not, and this
is certainly a nice API to have even aside from the linux compatibility
reasons.

At least conceptually, most of the differences are relatively minor
stylistic differences. We've done the same NDINIT/namei() changes. Rather
than have a set of kern_common_* functions, kern_open(), for instance, just
calls kern_openat() using AT_FDCWD. kern_openat() has all the actual
implementation. This lets us avoid adding a separate kern_common_open() and
the associated clutter with no real downside that I can see.

Basic pattern is:
*kern_open() - calls kern_openat() with AT_FDCWD
*kern_openat() - calls a function at_getwd() similar to your kern_get_at
*at_getwd() - In addition to your parameters, we also pass in the flags
 and path. The flags let us do an isilon specific VOP to get a vp for the
 subfile container if the user passed in O_XATTR (solaris uses this to
 access subfiles and I know linux has at least talked about if not
 implemented it). We include the path largely to avoid doing work if the
 path is absolute since the fd is supposed to be ignored in that case.
 Depending on how tightly you want to tat, you could argue whether it is
 valid to return an error due to an invalid fd if you pass in an absolute
 path (I haven't looked at draft posix or actual implementations to see
 what they do, but we just plain don't touch the fd at all in that case).

eric

________________________________
From: Roman Divacky [mailto:rdivacky@FreeBSD.org]
Sent: Thu 6/7/2007 12:04 AM
To: Doug Barton
Cc: Eric Lemar; arch@FreeBSD.org
Subject: Re: *at family of syscalls in FreeBSD

On Wed, Jun 06, 2007 at 10:38:03AM -0700, Doug Barton wrote:
> Roman Divacky wrote:
>
> > my current patch is at: www.vlakno.cz/~rdivacky/linux_at.patch
> >
> > it does not implement the native fbsd syscalls, only the linuxulator ones
> > but adding those is a matter of minutes. I asked for a review by pjd and/or
> > rwatson and hopefully this will get committed soon..
>
> My recollection of the last round of discussion was that you were
> asked to implement these in our base, then wrap the linux versions. If
> it's trivial to implement, it should probably be done that way first.

well.. my implementation does exactly that :) I changed namei() routine to
check ni_startdir for possible startdir other then CWD and made the
kern_fooat() use it. the change is mostly self-contained (minus prototypes
in sys/sys/syscallsubr.h) in vfs_syscalls. with a very slight change (5
lines?) to vfs_lookup.c + namei.h its quite simple.

I cant comment the Isilon's version. My patch (the approach) has been OKed
by rwatson@ and pjd@. I'd really love to see this go to -current before
RELENG_7 so time is rushing.
can someone from Isilon comment their version so we can compare benefits = etc.? roman From owner-freebsd-arch@FreeBSD.ORG Thu Jun 7 20:59:26 2007 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id C4F7B16A400; Thu, 7 Jun 2007 20:59:26 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from webaccess-cl.virtdom.com (webaccess-cl.virtdom.com [216.240.101.25]) by mx1.freebsd.org (Postfix) with ESMTP id 92F8913C484; Thu, 7 Jun 2007 20:59:26 +0000 (UTC) (envelope-from jroberson@chesapeake.net) Received: from [192.168.1.101] (c-71-231-138-78.hsd1.or.comcast.net [71.231.138.78]) (authenticated bits=0) by webaccess-cl.virtdom.com (8.13.6/8.13.6) with ESMTP id l57KxOW8042368 (version=TLSv1/SSLv3 cipher=DHE-DSS-AES256-SHA bits=256 verify=NO); Thu, 7 Jun 2007 16:59:25 -0400 (EDT) (envelope-from jroberson@chesapeake.net) Date: Thu, 7 Jun 2007 13:59:03 -0700 (PDT) From: Jeff Roberson X-X-Sender: jroberson@10.0.0.1 To: Bruce Evans In-Reply-To: <20070606152352.H606@10.0.0.1> Message-ID: <20070607135511.P606@10.0.0.1> References: <20070529105856.L661@10.0.0.1> <200705291456.38515.jhb@freebsd.org> <20070529121653.P661@10.0.0.1> <20070530065423.H93410@delplex.bde.org> <20070529141342.D661@10.0.0.1> <20070530125553.G12128@besplex.bde.org> <20070529201255.X661@10.0.0.1> <20070529220936.W661@10.0.0.1> <20070530201618.T13220@besplex.bde.org> <20070530115752.F661@10.0.0.1> <20070531091419.S826@besplex.bde.org> <20070531010631.N661@10.0.0.1> <20070601154833.O4207@besplex.bde.org> <20070601014601.I799@10.0.0.1> <20070601200348.G6201@delplex.bde.org> <20070601123530.B606@10.0.0.1> <20070604160036.N1084@besplex.bde.org> <46652D17.5090903@FreeBSD.org> <20070605214404.X47001@delplex.bde.org> <20070606152352.H606@10.0.0.1> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Attilio Rao , 
freebsd-arch@freebsd.org Subject: Re: Updated rusage patch X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Jun 2007 20:59:26 -0000

On Wed, 6 Jun 2007, Jeff Roberson wrote:

> On Tue, 5 Jun 2007, Bruce Evans wrote:
>
>>
>> This can probably be fixed more simply by calling rufetch() to reset the
>> time state in threads as a side effect. Do this before resetting the
>> state in the process.
>
> Ok, I agree with bde here, just call rufetch and this will clear each thread,
> and then you can clear the rux in the proc.
>
> I'd like to make a list of the remaining problems with rusage and potential
> fixes. Then we can decide which ones myself and attilio will resolve
> immediately to clean up some of the effect of the sched lock changes.
>
> 1) The ruadd() in thread_exit() is not safe since we're accessing another
> thread's unlocked rusage structure. Potential solution is to allocate p_ru
> as part of the proc struct and add into there, which will be protected by the
> PROC_SLOCK, which bde seemed to like better anyway.
>
> 2) We may lose information between exit1() and thread_exit() due to the way
> p_ru is initialized before we're done exiting. There also seems to be a race
> where wait() operates on a process before it's done in thread_exit() which
> means wait may return rusage information without the child added in! The
> solution will be to fix this race, and then access p_ru directly in wait().

The patch at http://people.freebsd.org/~jeff/rusage3.diff fixes points 1
and 2 as well as the p_runtime initialization problem. This moves the
collection of child rusage back into exit1() and changes the exiting
threads to accumulate their rusage into p_ru under protection of the
process spinlock.
This also removes the gross lock/unlock of proc slock (formerly sched_lock) from wait and implements something more sensible. Jeff > > 3) There is no locking around rufetch() and calcru(). calcru() may apply > new rux values to an old rusage, giving inaccurate results. The solution is > to either require the proc slock around both calls, or provide a new routine > which does the fetch and calc while grabbing the lock itself. > > 4) libkvm has had the rusage support if'd out because I broke it when I > removed pstats.p_ru. Do we care about rusage in libkvm? Should we go to the > trouble of traversing the list of threads and sum'ing it up? Can we export > some subset of the information? Does anyone use this (other than bde ;). > > Jeff > >> >> ANother minor problem is that proc0_post() isn't the right place to >> reset the time state except for proc0. It hacks on some of the time >> state for all processes and thus for all CPUs, but only resets switchtime >> and switchticks for the current CPU. Resetting of switchtime and >> switchticks for startup of other CPUs now seems to be in sched_throw(). >> I think this is not properly synchronous with the resetting of the >> rest of the state, but the errors from this are tiny. 
>> >> Bruce >> > _______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" > From owner-freebsd-arch@FreeBSD.ORG Thu Jun 7 21:03:17 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 5FB4116A400 for ; Thu, 7 Jun 2007 21:03:17 +0000 (UTC) (envelope-from rdivacky@vlk.vlakno.cz) Received: from vlakno.cz (vlk.vlakno.cz [62.168.28.247]) by mx1.freebsd.org (Postfix) with ESMTP id 19D0313C455 for ; Thu, 7 Jun 2007 21:03:16 +0000 (UTC) (envelope-from rdivacky@vlk.vlakno.cz) Received: from localhost (localhost [127.0.0.1]) by vlakno.cz (Postfix) with ESMTP id 42D128BDCA9; Thu, 7 Jun 2007 23:03:15 +0200 (CEST) X-Virus-Scanned: amavisd-new at vlakno.cz Received: from vlakno.cz ([127.0.0.1]) by localhost (vlk.vlakno.cz [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id yZP2x6XoKStS; Thu, 7 Jun 2007 23:03:14 +0200 (CEST) Received: from vlk.vlakno.cz (localhost [127.0.0.1]) by vlakno.cz (Postfix) with ESMTP id 385878BDCA0; Thu, 7 Jun 2007 23:03:14 +0200 (CEST) Received: (from rdivacky@localhost) by vlk.vlakno.cz (8.13.8/8.13.8/Submit) id l57L3DH2000765; Thu, 7 Jun 2007 23:03:13 +0200 (CEST) (envelope-from rdivacky) Date: Thu, 7 Jun 2007 23:03:13 +0200 From: Roman Divacky To: Eric Lemar Message-ID: <20070607210313.GA603@freebsd.org> References: <20070604162430.GA76813@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A7F@seaxch07.desktop.isilon.com> <20070606074429.GA42032@freebsd.org> <4666F0FB.8020101@FreeBSD.org> <20070607070455.GA71012@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A84@seaxch07.desktop.isilon.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: 
<896DB1FBFFD5A145833D9DA08CA12A85051A84@seaxch07.desktop.isilon.com> User-Agent: Mutt/1.4.2.3i Cc: arch@freebsd.org Subject: Re: *at family of syscalls in FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Jun 2007 21:03:17 -0000 On Thu, Jun 07, 2007 at 12:31:56PM -0700, Eric Lemar wrote: > I certainly don't want to hold anything from getting into the release. I > haven't had a chance to compare all the details of your patch and ours, > but I've spent a bit of time looking through yours, and I'd say > (not surprisingly) that what we've done is much more similar than not and > this is certainly a nice API to have even aside from the linux compatibility > reasons. > > At least conceptually, most of the differences are relatively minor > stylistic differences. We've done the same NDINIT/namei() changes. nice.... > Rather than have a set of kern_common_* functions, kern_open(), for > instance, just calls kern_openat() using AT_FDCWD. kern_openat() has all > the actual implementation. This lets us avoid adding a separate > kern_common_open() and the associated clutter with no real downside that I > can see. well, it's marginally faster :) and I had this OKed by rwatson and pjd. I don't have any strong opinion on this and the fact is that changing it from the model I use to the one you suggest is a few minutes' job.... I agree that consistency is a strong argument (in favour of your model). > Basic pattern is: > *kern_open() - calls kern_openat() with AT_FDCWD > *kern_openat() - calls a function at_getwd() similar to your kern_get_at > *at_getwd() - In addition to your parameters, we also pass in the flags > and path.
The flags let us do an isilon specific VOP to get a vp for > the subfile container if the user passed in O_XATTR (solaris uses this > to access subfiles and I know linux has at least talked about if not > implemented it). We include the path largely to avoid doing work if > the path is absolute since the fd is supposed to be ignored in that > case. Depending on how tightly you want to tat, you could > argue whether it is valid to return an error due to an invalid > fd if you pass in an absolute path (I haven't looked at draft posix > or actual implementations to see what they do, but we just > plain don't touch the fd at all in that case). now we need some strong opinion what to do. can anyone step up and tell "do this and that"? I am willing to adjust my patch with either the wrapping idea and/or the flags thing. I just need someone to tell me what is the preferred way. thnx roman From owner-freebsd-arch@FreeBSD.ORG Thu Jun 7 21:19:25 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 63D6716A46C; Thu, 7 Jun 2007 21:19:25 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from mail.ntplx.net (mail.ntplx.net [204.213.176.10]) by mx1.freebsd.org (Postfix) with ESMTP id 2619413C44B; Thu, 7 Jun 2007 21:19:25 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from sea.ntplx.net (sea.ntplx.net [204.213.176.11]) by mail.ntplx.net (8.14.1/8.14.1/NETPLEX) with ESMTP id l57LJN3j007796; Thu, 7 Jun 2007 17:19:24 -0400 (EDT) X-Virus-Scanned: by AMaViS and Clam AntiVirus (mail.ntplx.net) X-Greylist: Message whitelisted by DRAC access database, not delayed by milter-greylist-3.0 (mail.ntplx.net [204.213.176.10]); Thu, 07 Jun 2007 17:19:24 -0400 (EDT) Date: Thu, 7 Jun 2007 17:19:23 -0400 (EDT) From: Daniel Eischen X-X-Sender: eischen@sea.ntplx.net To: Roman Divacky In-Reply-To: <20070607210313.GA603@freebsd.org> 
Message-ID: References: <20070604162430.GA76813@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A7F@seaxch07.desktop.isilon.com> <20070606074429.GA42032@freebsd.org> <4666F0FB.8020101@FreeBSD.org> <20070607070455.GA71012@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A84@seaxch07.desktop.isilon.com> <20070607210313.GA603@freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Eric Lemar , arch@freebsd.org Subject: Re: *at family of syscalls in FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Daniel Eischen List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Jun 2007 21:19:25 -0000 On Thu, 7 Jun 2007, Roman Divacky wrote: > > now we need some strong opinion what to do. can anyone step up and tell "do this and > that"? I am willing to adjust my patch with either the wrapping idea and/or the flags thing. > > I just need someone to tell me what is the preferred way. Have you verified that these functions (the way you have named and implemented them) conform to the draft POSIX spec? 
-- DE From owner-freebsd-arch@FreeBSD.ORG Thu Jun 7 22:04:38 2007 Return-Path: X-Original-To: freebsd-arch@FreeBSD.org Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 20BC716A41F for ; Thu, 7 Jun 2007 22:04:38 +0000 (UTC) (envelope-from mwm-dated-1182116140.3f7bff@mired.org) Received: from mired.org (vpn.mired.org [66.92.153.74]) by mx1.freebsd.org (Postfix) with SMTP id A3EA413C448 for ; Thu, 7 Jun 2007 22:04:35 +0000 (UTC) (envelope-from mwm-dated-1182116140.3f7bff@mired.org) Received: (qmail 53259 invoked by uid 1001); 7 Jun 2007 21:35:40 -0000 Received: by bhuda.mired.org (tmda-sendmail, from uid 1001); Thu, 07 Jun 2007 17:35:39 -0400 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <18024.31275.733694.236655@bhuda.mired.org> Date: Thu, 7 Jun 2007 17:35:39 -0400 To: Stanislav Sedov In-Reply-To: <20070607213650.c02130bf.stas@FreeBSD.org> References: <20070607213650.c02130bf.stas@FreeBSD.org> X-Mailer: VM 7.19 under Emacs 21.3.1 X-Primary-Address: mwm@mired.org X-face: "5Mnwy%?j>IIV\)A=):rjWL~NB2aH[}Yq8Z=u~vJ`"(,&SiLvbbz2W`; h9L,Yg`+vb1>RG% *h+%X^n0EZd>TM8_IB;a8F?(Fb"lw'IgCoyM.[Lg#r\ X-Delivery-Agent: TMDA/1.1.11 (Ladyburn) From: Mike Meyer Cc: freebsd-hackers@FreeBSD.org, timur@gnu.org, freebsd-arch@FreeBSD.org Subject: Re: setegid bug X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 07 Jun 2007 22:04:38 -0000 In <20070607213650.c02130bf.stas@FreeBSD.org>, Stanislav Sedov typed: > Recently several FreeBSD samba users reported a scary problem with > samba (http://bugzilla.samba.org/?id=3990). Further research in > cooperation with Timur Bakeyev (timur) showed, that we have a little > problem with setegid implementation. 
In FreeBSD (and even in > 4.4BSD-Lite2) the egid of the process is merely groups[0], so by calling the > setegid function we simply overwrite the first of the supplementary groups. > However, POSIX says that neither the rgid nor any of the supplementary groups > should be rewritten in a setegid call. > > Perhaps some of the old-school committers remember the initial > intention of making egid equal to groups[0]? Or perhaps I have missed > something? The old school in this case is UC Berkeley. I found this behavior in 4.1BSD. Since it lets you violate ass-backwards group security settings (wherein you create a group "undesirables", and have files owned by that group with group bits 0 to keep them out) by removing yourself from that group, I reported it as a security bug to CSRG. Mike's response was that the security model was the bug, not this problem. I suspect it was done that way in the initial implementation, and nobody has ever felt that it should be fixed. http://www.mired.org/consulting.html Independent Network/Unix/Perforce consultant, email for more information.
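To make the behaviour described in this message concrete, here is a minimal userland model in C. It is only a sketch: `struct model_cred`, `bsd_setegid()`, `posix_setegid()` and `in_group_list()` are hypothetical names invented for illustration and are not the kernel's `struct ucred` or its real code paths. It contrasts a credential where the effective gid lives in groups[0] with one where the egid is a separate field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NGROUPS 16

/* Hypothetical credential model; NOT the kernel's struct ucred. */
struct model_cred {
    unsigned egid;                   /* used only by the POSIX-style model */
    unsigned groups[MODEL_NGROUPS];  /* group list */
    size_t   ngroups;                /* number of valid entries in groups[] */
};

/* 4.4BSD-style model: the effective gid *is* groups[0]. */
static void bsd_setegid(struct model_cred *cr, unsigned gid)
{
    assert(cr != NULL && cr->ngroups > 0);
    cr->groups[0] = gid;  /* whatever group was stored here is lost */
}

/* POSIX-style model: the egid is separate, the group list is untouched. */
static void posix_setegid(struct model_cred *cr, unsigned gid)
{
    assert(cr != NULL);
    cr->egid = gid;
}

/* Linear search of the group list, as a membership check would do. */
static bool in_group_list(const struct model_cred *cr, unsigned gid)
{
    for (size_t i = 0; i < cr->ngroups; i++)
        if (cr->groups[i] == gid)
            return true;
    return false;
}
```

In this model, a process whose list starts with gid 100 loses membership of 100 after `bsd_setegid(&cr, 200)`, which is exactly the "removing yourself from that group" effect mentioned above; with `posix_setegid()` the list still contains 100.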
From owner-freebsd-arch@FreeBSD.ORG Fri Jun 8 00:56:39 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9A73E16A469; Fri, 8 Jun 2007 00:56:39 +0000 (UTC) (envelope-from eric.lemar@isilon.com) Received: from seaxch07.isilon.com (seaxch07.isilon.com [70.103.106.46]) by mx1.freebsd.org (Postfix) with ESMTP id 7848C13C44B; Fri, 8 Jun 2007 00:56:39 +0000 (UTC) (envelope-from eric.lemar@isilon.com) x-mimeole: Produced By Microsoft Exchange V6.5 Content-class: urn:content-classes:message MIME-Version: 1.0 Date: Thu, 7 Jun 2007 17:56:39 -0700 Message-ID: <896DB1FBFFD5A145833D9DA08CA12A85051A87@seaxch07.desktop.isilon.com> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: *at family of syscalls in FreeBSD Thread-Index: AcepR0XogPvpUjf7QU+T8Td9MWl5QQAFvrjb References: <20070604162430.GA76813@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A7F@seaxch07.desktop.isilon.com> <20070606074429.GA42032@freebsd.org> <4666F0FB.8020101@FreeBSD.org> <20070607070455.GA71012@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A84@seaxch07.desktop.isilon.com> <20070607210313.GA603@freebsd.org> From: "Eric Lemar" To: "Roman Divacky" Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.5 Cc: arch@freebsd.org Subject: RE: *at family of syscalls in FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Jun 2007 00:56:39 -0000 Obviously I prefer the wrapping, but I'm just a tad biased :) Decided to do a little digging in POSIX-world since (unless others disagree) getting parameters/behavior right seemed a little more useful than preparing a patch of another very similar
implementation. Unfortunately I didn't come away that much more enlightened. openat() - Looks like POSIX mentions the use of O_XATTR but doesn't standardize it. On the other hand, it does say that it should fail with EBADF if the path isn't an absolute path AND the fd is invalid, so it seems like it might be safer to check for an absolute path and not try to access the fd/fail if the path is absolute. There are a number of functions such as fchownat(), chmodat(), fstatat(), linkat() that are sometimes described as taking a flag field mainly for SYMLINK_FOLLOW/NOFOLLOW or faccessat() that takes an AT_EACCESS to specify effective user/group id. Not clear to me that the question of which do/don't take flags is actually standard across existing implementations or necessarily stable in the standard. It's not even completely clear to me that the naming of some of these (an f prefix or not) is completely standardized. I haven't really been following this, so if anyone else has I'd be interested to know. None of these behaviors are particularly hard to change but its not immediately clear to me what the correct call is on all these at least as far as the end user API is concerned. unlinkat(), rmdirat() - POSIX doesn't seem to have rmdirat (yes, Isilon has this too). Looks like POSIX just overloads unlinkat() with a new flags parameter and an AT_REMOVEDIRAT flag for directories. Can't say that's my favorite API, but if that's where POSIX is going I don't know it's worth bucking the trend. Eric Lemar ________________________________ From: Roman Divacky [mailto:rdivacky@freebsd.org] Sent: Thu 6/7/2007 2:03 PM To: Eric Lemar Cc: arch@freebsd.org Subject: Re: *at family of syscalls in FreeBSD On Thu, Jun 07, 2007 at 12:31:56PM -0700, Eric Lemar wrote: > I certainly don't want to hold anything from getting into the release.
I > haven't had a chance to compare all the details of your patch and ours, > but I've spent a bit of time looking through yours, and I'd say > (not surprisingly) that what we've done is much more similar than not and > this is certainly a nice API to have even aside from the linux compatibility > reasons. > > At least conceptually, most of the differences are relatively minor > stylistic differences. We've done the same NDINIT/namei() changes. nice.... > Rather than have a set of kern_common_* functions, kern_open(), for > instance, just calls kern_openat() using AT_FDCWD. kern_openat() has all > the actual implementation. This lets us avoid adding a separate > kern_common_open() and the associated clutter with no real downside that I > can see. well, it's marginally faster :) and I had this OKed by rwatson and pjd. I don't have any strong opinion on this and the fact is that changing it from the model I use to the one you suggest is a few minutes' job.... I agree that consistency is a strong argument (in favour of your model). > Basic pattern is: > *kern_open() - calls kern_openat() with AT_FDCWD > *kern_openat() - calls a function at_getwd() similar to your kern_get_at > *at_getwd() - In addition to your parameters, we also pass in the flags > and path. The flags let us do an isilon specific VOP to get a vp for > the subfile container if the user passed in O_XATTR (solaris uses this > to access subfiles and I know linux has at least talked about if not > implemented it). We include the path largely to avoid doing work if > the path is absolute since the fd is supposed to be ignored in that > case. Depending on how tightly you want to tat, you could > argue whether it is valid to return an error due to an invalid > fd if you pass in an absolute path (I haven't looked at draft posix > or actual implementations to see what they do, but we just > plain don't touch the fd at all in that case). now we need some strong opinion what to do.
can anyone step up and tell "do this and that"? I am willing to adjust my patch with either the wrapping idea and/or the flags thing. I just need someone to tell me what is the preferred way. thnx roman From owner-freebsd-arch@FreeBSD.ORG Fri Jun 8 01:59:06 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id AD98316A41F; Fri, 8 Jun 2007 01:59:06 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from mail.ntplx.net (mail.ntplx.net [204.213.176.10]) by mx1.freebsd.org (Postfix) with ESMTP id 693F413C45E; Fri, 8 Jun 2007 01:59:06 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from sea.ntplx.net (sea.ntplx.net [204.213.176.11]) by mail.ntplx.net (8.14.1/8.14.1/NETPLEX) with ESMTP id l581x5ce027755; Thu, 7 Jun 2007 21:59:05 -0400 (EDT) X-Virus-Scanned: by AMaViS and Clam AntiVirus (mail.ntplx.net) X-Greylist: Message whitelisted by DRAC access database, not delayed by milter-greylist-3.0 (mail.ntplx.net [204.213.176.10]); Thu, 07 Jun 2007 21:59:05 -0400 (EDT) Date: Thu, 7 Jun 2007 21:59:04 -0400 (EDT) From: Daniel Eischen X-X-Sender: eischen@sea.ntplx.net To: Eric Lemar In-Reply-To: <896DB1FBFFD5A145833D9DA08CA12A85051A87@seaxch07.desktop.isilon.com> Message-ID: References: <20070604162430.GA76813@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A7F@seaxch07.desktop.isilon.com> <20070606074429.GA42032@freebsd.org> <4666F0FB.8020101@FreeBSD.org> <20070607070455.GA71012@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A84@seaxch07.desktop.isilon.com> <20070607210313.GA603@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A87@seaxch07.desktop.isilon.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: arch@freebsd.org, Roman Divacky Subject: RE: *at family of syscalls in FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Daniel
Eischen List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Jun 2007 01:59:06 -0000 On Thu, 7 Jun 2007, Eric Lemar wrote: > Obviously I prefer the wrapping, but I'm just a tad biased :) > > Decided to do a little digging in POSIX-world since (unless others disagree) > getting parameters/behavior right seemed a little more useful than preparing > a patch of another very similar implementation. Unfortunately I didn't come away > that much more enlightened. > > openat() - Looks like POSIX mentions the use of O_XATTR but doesn't > standardize it. On the other hand, it does say that it should fail with > EBADF if the path isn't an absolute path AND the fd is invalid, so it > seems like it might be safer to check for an absolute path and not try to > access the fd/fail if the path is absolute. > > There are a number of functions such as fchownat(), chmodat(), fstatat(), > linkat() that are sometimes described as taking a flag field mainly for > SYMLINK_FOLLOW/NOFOLLOW or faccessat() that takes an AT_EACCESS > to specify effective user/group id. Not clear to me that the question of which > do/don't take flags is actually standard across existing implementations or > necessarily stable in the standard. It's not even completely clear to me that > the naming of some of these (an f prefix or not) is completely standardized. > I haven't really been following this, so if anyone else has I'd be interested to know. > None of these behaviors are particularly hard to change but its not immediately > clear to me what the correct call is on all these at least as far as the end user > API is concerned. If we add these functions, we should add them as specified in the latest draft. I doubt the interfaces will change, but perhaps the behavior will change slightly. We _don't_ want to add interfaces that will most likely be incompatible with POSIX. By interfaces, I mean the API. 
The latest draft I'm looking at is draft 2, issue 7, 31 Oct 2006. You can download a PDF version of the system interfaces draft by registering and logging in here: http://www.opengroup.org/austin/ It looks like draft 3 will be released June 15, 2007 (in 10 days). > unlinkat(), rmdirat() - > POSIX doesn't seem to have rmdirat (yes, Isilon has > this too). Looks like POSIX just overloads unlinkat() with a new flags parameter > and an AT_REMOVEDIRAT flag for directories. Can't say that's my favorite API, > but if that's where POSIX is going I don't know it's worth bucking the trend. Yes, please let's stick to the POSIX API for our own (non-Linux) interfaces. -- DE From owner-freebsd-arch@FreeBSD.ORG Fri Jun 8 08:55:46 2007 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 7446016A400; Fri, 8 Jun 2007 08:55:46 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from mail04.syd.optusnet.com.au (mail04.syd.optusnet.com.au [211.29.132.185]) by mx1.freebsd.org (Postfix) with ESMTP id E623D13C480; Fri, 8 Jun 2007 08:55:45 +0000 (UTC) (envelope-from brde@optusnet.com.au) Received: from besplex.bde.org (c220-239-235-248.carlnfd3.nsw.optusnet.com.au [220.239.235.248]) by mail04.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id l588tXe3015091 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Fri, 8 Jun 2007 18:55:36 +1000 Date: Fri, 8 Jun 2007 18:55:36 +1000 (EST) From: Bruce Evans X-X-Sender: bde@besplex.bde.org To: Jeff Roberson In-Reply-To: <20070607135511.P606@10.0.0.1> Message-ID: <20070608185210.W12107@besplex.bde.org> References: <20070529105856.L661@10.0.0.1> <200705291456.38515.jhb@freebsd.org> <20070529121653.P661@10.0.0.1> <20070530065423.H93410@delplex.bde.org> <20070529141342.D661@10.0.0.1> <20070530125553.G12128@besplex.bde.org> <20070529201255.X661@10.0.0.1> <20070529220936.W661@10.0.0.1>
<20070530201618.T13220@besplex.bde.org> <20070530115752.F661@10.0.0.1> <20070531091419.S826@besplex.bde.org> <20070531010631.N661@10.0.0.1> <20070601154833.O4207@besplex.bde.org> <20070601014601.I799@10.0.0.1> <20070601200348.G6201@delplex.bde.org> <20070601123530.B606@10.0.0.1> <20070604160036.N1084@besplex.bde.org> <46652D17.5090903@FreeBSD.org> <20070605214404.X47001@delplex.bde.org> <20070606152352.H606@10.0.0.1> <20070607135511.P606@10.0.0.1> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Attilio Rao , freebsd-arch@freebsd.org Subject: Re: Updated rusage patch X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Jun 2007 08:55:46 -0000 On Thu, 7 Jun 2007, Jeff Roberson wrote: > On Wed, 6 Jun 2007, Jeff Roberson wrote: >> 2) We may lose information between exit1() and thread_exit() due to the >> way p_ru is initialized before we're done exiting. There also seems to be >> a race where wait() operates on a process before it's done in thread_exit() >> which means wait may return rusage information without the child added in! >> The solution will be to fix this race, and then access p_ru directly in >> wait(). I haven't looked at the patch or thought much about the problems yet. A very obvious problem turned up: after killing makeworld with ^C, the times reported are usually almost 0 (even after running 800+ seconds). The times are reasonable if makeworld completes or if a simple foreground process is killed with ^C. This might be (2). 
Bruce From owner-freebsd-arch@FreeBSD.ORG Fri Jun 8 14:10:26 2007 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id B001E16A468 for ; Fri, 8 Jun 2007 14:10:26 +0000 (UTC) (envelope-from asmrookie@gmail.com) Received: from ug-out-1314.google.com (ug-out-1314.google.com [66.249.92.168]) by mx1.freebsd.org (Postfix) with ESMTP id 1410A13C44B for ; Fri, 8 Jun 2007 14:10:25 +0000 (UTC) (envelope-from asmrookie@gmail.com) Received: by ug-out-1314.google.com with SMTP id u2so1095531uge for ; Fri, 08 Jun 2007 07:10:25 -0700 (PDT) DKIM-Signature: a=rsa-sha1; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:reply-to:user-agent:mime-version:to:cc:subject:references:in-reply-to:content-type:content-transfer-encoding:sender; b=t9kejLvD/PqRW621ODbNLIhQjf1rO/oxOou1mN3K25eE0XG6JzMfjZeOi4uzgqvgtPGCwZo9hkcGUnsTxbQF7TxqJWYIAPMdWu9mcE2JiSZLQiwfoGw0bHtItsPQAN1K8zHlTK8uD8TTB7ubd4EP6N/FFk6NBAE8xXELM6QHvLU= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:reply-to:user-agent:mime-version:to:cc:subject:references:in-reply-to:content-type:content-transfer-encoding:sender; b=jDF8b/6wpJ/oj2N1T4OScYZmeNMfpd/Y6vAy12b9olaOU9av1OgU8LZC1cWlxZcTxBH0YmkqligHfuOFyR5MqW59Y6OVt5mphv7tpc5EBP6OxfR/SrBEfBkqcDqe5LEVTPbxpE/vn1uRAYQw1Q4d9SAqizMYoLNeX1h+Jq7odps= Received: by 10.82.189.6 with SMTP id m6mr5447021buf.1181311824737; Fri, 08 Jun 2007 07:10:24 -0700 (PDT) Received: from ?172.31.5.25? 
( [89.97.252.178]) by mx.google.com with ESMTP id h6sm576509nfh.2007.06.08.07.10.23 (version=TLSv1/SSLv3 cipher=RC4-MD5); Fri, 08 Jun 2007 07:10:24 -0700 (PDT) Message-ID: <4669633D.9090809@FreeBSD.org> Date: Fri, 08 Jun 2007 16:10:05 +0200 From: Attilio Rao User-Agent: Thunderbird 1.5 (X11/20060526) MIME-Version: 1.0 To: Jeff Roberson References: <20070529105856.L661@10.0.0.1> <200705291456.38515.jhb@freebsd.org> <20070529121653.P661@10.0.0.1> <20070530065423.H93410@delplex.bde.org> <20070529141342.D661@10.0.0.1> <20070530125553.G12128@besplex.bde.org> <20070529201255.X661@10.0.0.1> <20070529220936.W661@10.0.0.1> <20070530201618.T13220@besplex.bde.org> <20070530115752.F661@10.0.0.1> <20070531091419.S826@besplex.bde.org> <20070531010631.N661@10.0.0.1> <20070601154833.O4207@besplex.bde.org> <20070601014601.I799@10.0.0.1> <20070601200348.G6201@delplex.bde.org> <20070601123530.B606@10.0.0.1> <20070604160036.N1084@besplex.bde.org> <46652D17.5090903@FreeBSD.org> <20070605214404.X47001@delplex.bde.org> <20070606152352.H606@10.0.0.1> <20070607135511.P606@10.0.0.1> In-Reply-To: <20070607135511.P606@10.0.0.1> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Sender: Attilio Rao Cc: freebsd-arch@freebsd.org Subject: Re: Updated rusage patch X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: attilio@FreeBSD.org List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Jun 2007 14:10:26 -0000 Jeff Roberson wrote: > On Wed, 6 Jun 2007, Jeff Roberson wrote: > >> On Tue, 5 Jun 2007, Bruce Evans wrote: >> >>> >>> This can probably be fixed more simply by calling rufetch() to reset the >>> time state in threads as a side effect. Do this before resetting the >>> state in the process. 
>> >> Ok, I agree with bde here, just call rufetch and this will clear each >> thread, and then you can clear the rux in the proc. >> >> I'd like to make a list of the remaining problems with rusage and >> potential fixes. Then we can decide which ones myself and attilio >> will resolve immediately to clean up some of the effect of the sched >> lock changes. >> >> 1) The ruadd() in thread_exit() is not safe since we're accessing >> another thread's unlocked rusage structure. Potential solution is to >> allocate p_ru as part of the proc struct and add into there, which >> will be protected by the PROC_SLOCK, which bde seemed to like better >> anyway. >> >> 2) We may lose information between exit1() and thread_exit() due to >> the way p_ru is initialized before we're done exiting. There also >> seems to be a race where wait() operates on a process before it's done >> in thread_exit() which means wait may return rusage information >> without the child added in! The solution will be to fix this race, >> and then access p_ru directly in wait(). > > The patch at http://people.freebsd.org/~jeff/rusage3.diff fixes points 1 > and 2 as well as the p_runtime initialization problem. This moves the > collection of child rusage back into exit1() and changes the exiting > threads to accumulate their rusage into p_ru under protection of the > process spinlock. This also removes the gross lock/unlock of proc slock > (formerly sched_lock) from wait and implements something more sensible. > > Jeff > >> >> 3) There is no locking around rufetch() and calcru(). calcru() may >> apply new rux values to an old rusage, giving inaccurate results. The >> solution is to either require the proc slock around both calls, or >> provide a new routine which does the fetch and calc while grabbing the >> lock itself. And this should fix (3): http://users.gufi.org/~rookie/works/patches/schedlock/rusage2.diff (it also reorders the rucollect() declaration so that the declarations are sorted by name).
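For readers following point (3), the "fetch and calc under one lock" idea can be sketched in userland C roughly as below. This is an illustration under stated assumptions, not the contents of rusage2.diff: `struct sketch_proc`, `struct sketch_rux`, `sketch_calcru_locked()` and `sketch_rufetchcalc()` are hypothetical names, a pthread mutex stands in for the per-process spinlock, and the proportional user/system split is a simplification of what the kernel's calcru code does.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the per-process rusage_ext data. */
struct sketch_rux {
    unsigned long long runtime_us;  /* total run time in microseconds */
    unsigned long long uticks;      /* statclock ticks seen in user mode */
    unsigned long long sticks;      /* statclock ticks seen in system mode */
};

struct sketch_proc {
    pthread_mutex_t   slock;  /* stands in for the per-process spinlock */
    struct sketch_rux rux;
};

struct sketch_times {
    unsigned long long user_us;
    unsigned long long sys_us;
};

/* Split runtime in proportion to the tick counts; caller holds p->slock. */
static void sketch_calcru_locked(const struct sketch_rux *rux,
                                 struct sketch_times *out)
{
    unsigned long long total = rux->uticks + rux->sticks;

    if (total == 0) {
        out->user_us = out->sys_us = 0;
        return;
    }
    out->user_us = rux->runtime_us * rux->uticks / total;
    out->sys_us  = rux->runtime_us - out->user_us;
}

/* Fetch and calculate in one critical section, so the tick counts and the
 * runtime are always read from the same consistent snapshot. */
static void sketch_rufetchcalc(struct sketch_proc *p, struct sketch_times *out)
{
    assert(p != NULL && out != NULL);
    pthread_mutex_lock(&p->slock);
    sketch_calcru_locked(&p->rux, out);
    pthread_mutex_unlock(&p->slock);
}
```

The design point is simply that the caller never sees the lock: a single routine takes it, reads the accumulated state and computes the result before releasing it, so new rux values cannot be applied to an old snapshot.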
A thought: Shouldn't we actually remove the copy to the rux object in calcru() (and rufetchcalc())? When sched_lock was there the copy was useful, since that lock had a lot of contention, but now that the per-proc spinlock is protecting those fields it is useless. Also consider that calcru1() has no locking inside (so you wouldn't expect particularly long execution times). Thanks, Attilio From owner-freebsd-arch@FreeBSD.ORG Fri Jun 8 16:13:26 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id DD68416A400 for ; Fri, 8 Jun 2007 16:13:26 +0000 (UTC) (envelope-from rdivacky@vlk.vlakno.cz) Received: from vlakno.cz (vlk.vlakno.cz [62.168.28.247]) by mx1.freebsd.org (Postfix) with ESMTP id 6335513C465 for ; Fri, 8 Jun 2007 16:13:25 +0000 (UTC) (envelope-from rdivacky@vlk.vlakno.cz) Received: from localhost (localhost [127.0.0.1]) by vlakno.cz (Postfix) with ESMTP id D56198BDCB9; Fri, 8 Jun 2007 18:13:24 +0200 (CEST) X-Virus-Scanned: amavisd-new at vlakno.cz Received: from vlakno.cz ([127.0.0.1]) by localhost (vlk.vlakno.cz [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ZHgnY1kKI12L; Fri, 8 Jun 2007 18:13:23 +0200 (CEST) Received: from vlk.vlakno.cz (localhost [127.0.0.1]) by vlakno.cz (Postfix) with ESMTP id 93CC68BD7CA; Fri, 8 Jun 2007 18:13:23 +0200 (CEST) Received: (from rdivacky@localhost) by vlk.vlakno.cz (8.13.8/8.13.8/Submit) id l58GDM8a027868; Fri, 8 Jun 2007 18:13:22 +0200 (CEST) (envelope-from rdivacky) Date: Fri, 8 Jun 2007 18:13:22 +0200 From: Roman Divacky To: Eric Lemar Message-ID: <20070608161322.GA27624@freebsd.org> References: <20070604162430.GA76813@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A7F@seaxch07.desktop.isilon.com> <20070606074429.GA42032@freebsd.org> <4666F0FB.8020101@FreeBSD.org> <20070607070455.GA71012@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A84@seaxch07.desktop.isilon.com>
<20070607210313.GA603@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A87@seaxch07.desktop.isilon.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <896DB1FBFFD5A145833D9DA08CA12A85051A87@seaxch07.desktop.isilon.com> User-Agent: Mutt/1.4.2.3i Cc: arch@freebsd.org Subject: Re: *at family of syscalls in FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Jun 2007 16:13:26 -0000 On Thu, Jun 07, 2007 at 05:56:39PM -0700, Eric Lemar wrote: > Obviously I prefer the wrapping, but I'm just a tad biased :) well.. unless I hear some strong voice to change what I have I am not changing it. it can always be changed in future. > Decided to do a little digging in POSIX-world since (unless others disagree) > getting parameters/behavior right seemed a little more useful than preparing > a patch of another very similar implementation. Unfortunately I didn't come away > that much more enlightened. > > openat() - Looks like POSIX mentions the use of O_XATTR but doesn't > standardize it. On the other hand, it does say that it should fail with > EBADF if the path isn't an absolute path AND the fd is invalid, so it > seems like it might be safer to check for an absolute path and not try to > access the fd/fail if the path is absolute. I don't understand now. you are saying that if the path is relative and the fd is invalid we should do what? we have to fail somehow... > There are a number of functions such as fchownat(), chmodat(), fstatat(), > linkat() that are sometimes described as taking a flag field mainly for > SYMLINK_FOLLOW/NOFOLLOW or faccessat() that takes an AT_EACCESS > to specify effective user/group id. 
Not clear to me that the question of which > do/don't take flags is actually standard across existing implementations or > necessarily stable in the standard. It's not even completely clear to me that > the naming of some of these (an f prefix or not) is completely standardized. > I haven't really been following this, so if anyone else has I'd be interested to know. > None of these behaviors are particularly hard to change but it's not immediately > clear to me what the correct call is on all these at least as far as the end user > API is concerned. Linux implements flags too. I think we want them. the current practice is that we have for example kern_chown and kern_lchown. I implemented it like this for a few syscalls (off the top of my head - [l]stat and [l]chown) > unlinkat(), rmdirat() - > POSIX doesn't seem to have rmdirat (yes, Isilon has > this too). Looks like POSIX just overloads unlinkat() with a new flags parameter > and an AT_REMOVEDIRAT flag for directories. Can't say that's my favorite API, > but if that's where POSIX is going I don't know it's worth bucking the trend. well, I think we are confusing two things here: the in-kernel API and the syscall API. I think it makes perfect sense to have kern_rmdirat() and in unlinkat() do something like if (flags & AT_REMOVEDIRAT) kern_rmdirat(...); else kern_unlinkat(...); note that I didn't implement ANY syscall, only the kern_fooat() functions. 
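The flag dispatch described above can be tried from userspace. What follows is a minimal sketch under stated assumptions, not the code from the patch: the flag that ended up in POSIX.1-2008 is spelled AT_REMOVEDIR (the message writes AT_REMOVEDIRAT), and pick_backend() and demo_unlinkat() are hypothetical names invented for illustration. pick_backend() only reports which backend (the thread's kern_unlinkat() or kern_rmdirat() idea) a flags word would select; demo_unlinkat() exercises the real unlinkat() call on a scratch directory under /tmp.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical: which in-kernel helper a flags word would select. */
enum remove_backend { BACKEND_UNLINK, BACKEND_RMDIR, BACKEND_EINVAL };

static enum remove_backend
pick_backend(int flags)
{
	/* Any bit other than AT_REMOVEDIR is rejected in this sketch. */
	if ((flags & ~AT_REMOVEDIR) != 0)
		return (BACKEND_EINVAL);
	return ((flags & AT_REMOVEDIR) != 0 ? BACKEND_RMDIR : BACKEND_UNLINK);
}

/*
 * Create a scratch directory holding one file and one subdirectory,
 * then remove both through unlinkat().  Returns 0 on success, -1 on
 * any failure.  The /tmp location is an assumption of this sketch.
 */
static int
demo_unlinkat(void)
{
	char tmpl[] = "/tmp/unlinkat_demo.XXXXXX";
	int dfd, fd, rc = -1;

	if (mkdtemp(tmpl) == NULL)
		return (-1);
	dfd = open(tmpl, O_RDONLY | O_DIRECTORY);
	if (dfd == -1)
		goto out;
	fd = openat(dfd, "file", O_CREAT | O_WRONLY, 0600);
	if (fd == -1)
		goto out_close;
	close(fd);
	if (mkdirat(dfd, "sub", 0700) == -1)
		goto out_close;

	assert(pick_backend(0) == BACKEND_UNLINK);
	/* Without AT_REMOVEDIR a directory must not be removed. */
	if (unlinkat(dfd, "sub", 0) == 0)
		goto out_close;
	/* Plain files take the unlink path. */
	if (unlinkat(dfd, "file", 0) == -1)
		goto out_close;
	/* Directories take the rmdir path. */
	if (unlinkat(dfd, "sub", AT_REMOVEDIR) == -1)
		goto out_close;
	rc = 0;
out_close:
	close(dfd);
out:
	(void)rmdir(tmpl);
	return (rc);
}
```

Calling demo_unlinkat() from a small main() and checking for a 0 return is enough to see both paths taken. The errno returned when a directory is unlinked without the flag differs between systems, so the sketch deliberately does not check it.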
From owner-freebsd-arch@FreeBSD.ORG Fri Jun 8 16:15:25 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id F26DA16A469 for ; Fri, 8 Jun 2007 16:15:25 +0000 (UTC) (envelope-from rdivacky@vlk.vlakno.cz) Received: from vlakno.cz (vlk.vlakno.cz [62.168.28.247]) by mx1.freebsd.org (Postfix) with ESMTP id A4E6D13C487 for ; Fri, 8 Jun 2007 16:15:25 +0000 (UTC) (envelope-from rdivacky@vlk.vlakno.cz) Received: from localhost (localhost [127.0.0.1]) by vlakno.cz (Postfix) with ESMTP id E3B868BDD4A; Fri, 8 Jun 2007 18:15:24 +0200 (CEST) X-Virus-Scanned: amavisd-new at vlakno.cz Received: from vlakno.cz ([127.0.0.1]) by localhost (vlk.vlakno.cz [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id inznBKhcco8m; Fri, 8 Jun 2007 18:15:23 +0200 (CEST) Received: from vlk.vlakno.cz (localhost [127.0.0.1]) by vlakno.cz (Postfix) with ESMTP id AF6EA8BD7CA; Fri, 8 Jun 2007 18:15:23 +0200 (CEST) Received: (from rdivacky@localhost) by vlk.vlakno.cz (8.13.8/8.13.8/Submit) id l58GFN8J027920; Fri, 8 Jun 2007 18:15:23 +0200 (CEST) (envelope-from rdivacky) Date: Fri, 8 Jun 2007 18:15:23 +0200 From: Roman Divacky To: Daniel Eischen Message-ID: <20070608161523.GB27624@freebsd.org> References: <20070604162430.GA76813@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A7F@seaxch07.desktop.isilon.com> <20070606074429.GA42032@freebsd.org> <4666F0FB.8020101@FreeBSD.org> <20070607070455.GA71012@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A84@seaxch07.desktop.isilon.com> <20070607210313.GA603@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A87@seaxch07.desktop.isilon.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i Cc: Eric Lemar , arch@freebsd.org Subject: Re: *at family of syscalls in FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list 
List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Jun 2007 16:15:26 -0000 > It looks like draft 3 will be released June 15, 2007 (in 10 days). is it ok to have this committed after June 15 (afaik the branching day)? maybe without the native syscalls but with the kern_fooat() backend. (ie. *at syscalls can be added for 7.1R with a few lines patch) roman From owner-freebsd-arch@FreeBSD.ORG Fri Jun 8 16:23:50 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id BA4A916A4E2; Fri, 8 Jun 2007 16:23:50 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from mail.ntplx.net (mail.ntplx.net [204.213.176.10]) by mx1.freebsd.org (Postfix) with ESMTP id 7A32A13C468; Fri, 8 Jun 2007 16:23:50 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from sea.ntplx.net (sea.ntplx.net [204.213.176.11]) by mail.ntplx.net (8.14.1/8.14.1/NETPLEX) with ESMTP id l58GNnpQ029527; Fri, 8 Jun 2007 12:23:49 -0400 (EDT) X-Virus-Scanned: by AMaViS and Clam AntiVirus (mail.ntplx.net) X-Greylist: Message whitelisted by DRAC access database, not delayed by milter-greylist-3.0 (mail.ntplx.net [204.213.176.10]); Fri, 08 Jun 2007 12:23:49 -0400 (EDT) Date: Fri, 8 Jun 2007 12:23:49 -0400 (EDT) From: Daniel Eischen X-X-Sender: eischen@sea.ntplx.net To: Roman Divacky In-Reply-To: <20070608161523.GB27624@freebsd.org> Message-ID: References: <20070604162430.GA76813@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A7F@seaxch07.desktop.isilon.com> <20070606074429.GA42032@freebsd.org> <4666F0FB.8020101@FreeBSD.org> <20070607070455.GA71012@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A84@seaxch07.desktop.isilon.com> <20070607210313.GA603@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A87@seaxch07.desktop.isilon.com> 
<20070608161523.GB27624@freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Eric Lemar , arch@freebsd.org Subject: Re: *at family of syscalls in FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Daniel Eischen List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Jun 2007 16:23:50 -0000 On Fri, 8 Jun 2007, Roman Divacky wrote: >> It looks like draft 3 will be released June 15, 2007 (in 10 days). > > is it ok to have this committed after June 15 (afaik the branching day)? maybe > without the native syscalls but with the kern_fooat() backend. (ie. *at syscalls > can be added for 7.1R with a few lines patch) I don't have any objection to adding Linux compat functionality. I just don't want us to add native functions that don't conform to POSIX; the API is mostly what I am concerned about. We can change the behavior slightly to conform with whatever POSIX dictates, but we shouldn't knowingly introduce non-conforming APIs (because once 7.0 is released, we would always have to support both the non-conforming APIs as well as adding and supporting the conforming APIs). 
-- DE From owner-freebsd-arch@FreeBSD.ORG Fri Jun 8 16:34:06 2007 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id AE76916A41F; Fri, 8 Jun 2007 16:34:06 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from server.baldwin.cx (66-23-211-162.clients.speedfactory.net [66.23.211.162]) by mx1.freebsd.org (Postfix) with ESMTP id 393C413C469; Fri, 8 Jun 2007 16:34:06 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from localhost.corp.yahoo.com (john@localhost [127.0.0.1]) (authenticated bits=0) by server.baldwin.cx (8.13.8/8.13.8) with ESMTP id l58GXu1m057898; Fri, 8 Jun 2007 12:33:56 -0400 (EDT) (envelope-from jhb@freebsd.org) From: John Baldwin To: freebsd-arch@freebsd.org Date: Fri, 8 Jun 2007 12:22:31 -0400 User-Agent: KMail/1.9.6 References: <20070529105856.L661@10.0.0.1> <20070606152352.H606@10.0.0.1> <20070607135511.P606@10.0.0.1> In-Reply-To: <20070607135511.P606@10.0.0.1> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200706081222.32457.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (server.baldwin.cx [127.0.0.1]); Fri, 08 Jun 2007 12:33:56 -0400 (EDT) X-Virus-Scanned: ClamAV 0.88.3/3380/Fri Jun 8 08:34:26 2007 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-4.4 required=4.2 tests=ALL_TRUSTED,BAYES_00 autolearn=ham version=3.1.3 X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx Cc: Attilio Rao Subject: Re: Updated rusage patch X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Jun 2007 16:34:06 -0000 On Thursday 07 June 2007 
04:59:03 pm Jeff Roberson wrote: > On Wed, 6 Jun 2007, Jeff Roberson wrote: > > > On Tue, 5 Jun 2007, Bruce Evans wrote: > > > >> > >> This can probably be fixed more simply by calling rufetch() to reset the > >> time state in threads as a side effect. Do this before resetting the > >> state in the process. > > > > Ok, I agree with bde here, just call rufetch and this will clear each thread, > > and then you can clear the rux in the proc. > > > > I'd like to make a list of the remaining problems with rusage and potential > > fixes. Then we can decide which ones Attilio and I will resolve > > immediately to clean up some of the effects of the sched lock changes. > > > > 1) The ruadd() in thread_exit() is not safe since we're accessing another > > thread's unlocked rusage structure. Potential solution is to allocate p_ru > > as part of the proc struct and add into there, which will be protected by the > > PROC_SLOCK, which bde seemed to like better anyway. > > > > 2) We may lose information between exit1() and thread_exit() due to the way > > p_ru is initialized before we're done exiting. There also seems to be a race > > where wait() operates on a process before it's done in thread_exit() which > > means wait may return rusage information without the child added in! The > > solution will be to fix this race, and then access p_ru directly in wait(). > > The patch at http://people.freebsd.org/~jeff/rusage3.diff fixes points 1 > and 2 as well as the p_runtime initialization problem. This moves the > collection of child rusage back into exit1() and changes the exiting > threads to accumulate their rusage into p_ru under protection of the > process spinlock. This also removes the gross lock/unlock of proc slock > (formerly sched_lock) from wait and implements something more sensible. I think the comment explaining the race still needs to be there for future code readers so they don't remove the locking. I also don't see what you gain by moving the lock earlier. 
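The accumulation scheme in the quoted patch summary (exiting threads add their rusage into a p_ru embedded in the proc, under the process spinlock, and wait() reads it under the same lock) can be sketched in userspace as below. This is an illustration only, not the contents of rusage3.diff: struct proc_sketch, struct rusage_sketch and every function here are hypothetical stand-ins, the three counters are a made-up subset of struct rusage, and a C11 atomic_flag spin loop stands in for PROC_SLOCK.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical, trimmed-down stand-in for struct rusage. */
struct rusage_sketch {
	long ru_utime_us;	/* user time, microseconds */
	long ru_stime_us;	/* system time, microseconds */
	long ru_nvcsw;		/* voluntary context switches */
};

/* Hypothetical stand-in for struct proc; p_ru is embedded, not a pointer. */
struct proc_sketch {
	atomic_flag p_slock;		/* plays the role of PROC_SLOCK */
	struct rusage_sketch p_ru;	/* sum over exited threads */
};

#define PROC_SKETCH_INIT { ATOMIC_FLAG_INIT, { 0, 0, 0 } }

static void
proc_slock(struct proc_sketch *p)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&p->p_slock,
	    memory_order_acquire))
		;
}

static void
proc_sunlock(struct proc_sketch *p)
{
	atomic_flag_clear_explicit(&p->p_slock, memory_order_release);
}

/* Field-by-field sum, in the spirit of ruadd(). */
static void
ruadd_sketch(struct rusage_sketch *dst, const struct rusage_sketch *src)
{
	dst->ru_utime_us += src->ru_utime_us;
	dst->ru_stime_us += src->ru_stime_us;
	dst->ru_nvcsw += src->ru_nvcsw;
}

/* An exiting thread folds its own counters into the process total. */
static void
thread_exit_sketch(struct proc_sketch *p, const struct rusage_sketch *td_ru)
{
	proc_slock(p);
	ruadd_sketch(&p->p_ru, td_ru);
	proc_sunlock(p);
}

/*
 * The waiter takes the same lock before copying p_ru.  Without it, the
 * copy could race with a thread that is still inside
 * thread_exit_sketch(), which is the race the thread asks to keep
 * documented in a comment.
 */
static struct rusage_sketch
wait_fetch_sketch(struct proc_sketch *p)
{
	struct rusage_sketch snap;

	proc_slock(p);
	snap = p->p_ru;
	proc_sunlock(p);
	return (snap);
}
```

A caller would initialize a struct proc_sketch with PROC_SKETCH_INIT, call thread_exit_sketch() once per exiting thread, and read the total with wait_fetch_sketch(). The sketch does not model the ordering between exit1() and thread_exit() discussed in point 2; it only shows that the writers and the reader share one lock.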
-- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Fri Jun 8 16:48:20 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 45B8416A421; Fri, 8 Jun 2007 16:48:20 +0000 (UTC) (envelope-from rdivacky@vlk.vlakno.cz) Received: from vlakno.cz (vlk.vlakno.cz [62.168.28.247]) by mx1.freebsd.org (Postfix) with ESMTP id EF02713C4C2; Fri, 8 Jun 2007 16:48:19 +0000 (UTC) (envelope-from rdivacky@vlk.vlakno.cz) Received: from localhost (localhost [127.0.0.1]) by vlakno.cz (Postfix) with ESMTP id D7A138BDD4A; Fri, 8 Jun 2007 18:48:18 +0200 (CEST) X-Virus-Scanned: amavisd-new at vlakno.cz Received: from vlakno.cz ([127.0.0.1]) by localhost (vlk.vlakno.cz [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id R4mZdMmdOKxB; Fri, 8 Jun 2007 18:48:17 +0200 (CEST) Received: from vlk.vlakno.cz (localhost [127.0.0.1]) by vlakno.cz (Postfix) with ESMTP id BC33A8BDCE4; Fri, 8 Jun 2007 18:48:17 +0200 (CEST) Received: (from rdivacky@localhost) by vlk.vlakno.cz (8.13.8/8.13.8/Submit) id l58GmHJb028636; Fri, 8 Jun 2007 18:48:17 +0200 (CEST) (envelope-from rdivacky) Date: Fri, 8 Jun 2007 18:48:17 +0200 From: Roman Divacky To: Daniel Eischen Message-ID: <20070608164817.GA28549@freebsd.org> References: <896DB1FBFFD5A145833D9DA08CA12A85051A7F@seaxch07.desktop.isilon.com> <20070606074429.GA42032@freebsd.org> <4666F0FB.8020101@FreeBSD.org> <20070607070455.GA71012@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A84@seaxch07.desktop.isilon.com> <20070607210313.GA603@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A87@seaxch07.desktop.isilon.com> <20070608161523.GB27624@freebsd.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i Cc: Eric Lemar , arch@freebsd.org Subject: Re: *at family of syscalls in FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: 
list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Jun 2007 16:48:20 -0000 On Fri, Jun 08, 2007 at 12:23:49PM -0400, Daniel Eischen wrote: > On Fri, 8 Jun 2007, Roman Divacky wrote: > > >>It looks like draft 3 will be released June 15, 2007 (in 10 days). > > > >is it ok to have this committed after June 15 (afaik the branching day)? > >maybe > >without the native syscalls but with the kern_fooat() backend. (ie. *at > >syscalls > >can be added for 7.1R with a few lines patch) > > I don't have any objection over adding linux compat functionality. the Linux compat code needs those kern_fooat() functions. > I just don't want us to add native functions that don't conform > to POSIX, mostly the API is what I am concerned about. We can > change the behavior slightly to conform with whatever POSIX > dictates, but we shouldn't knowingly introduce non-conforming > APIs (because once 7.0 is released, we would always have to > support both the non-conforming APIs as well as adding and > supporting the conforming APIs). I have NOT implemented a single bit of the native syscall API, and when I do it will be 100% POSIX API (minus bugs ;) ) we can commit this in two phases: phase I: kern_fooat() + linux stuff phase II: native fbsd syscalls I hope to resolve all the issues Eric raised over the weekend, and then it only needs a review(er) + a committer. I definitely want this in for 7.0R. does this sound good to you? 
roman From owner-freebsd-arch@FreeBSD.ORG Fri Jun 8 16:52:09 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9AFEA16A479; Fri, 8 Jun 2007 16:52:09 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from mail.ntplx.net (mail.ntplx.net [204.213.176.10]) by mx1.freebsd.org (Postfix) with ESMTP id 58FF613C4CB; Fri, 8 Jun 2007 16:52:09 +0000 (UTC) (envelope-from deischen@freebsd.org) Received: from sea.ntplx.net (sea.ntplx.net [204.213.176.11]) by mail.ntplx.net (8.14.1/8.14.1/NETPLEX) with ESMTP id l58Gq7WK018301; Fri, 8 Jun 2007 12:52:07 -0400 (EDT) X-Virus-Scanned: by AMaViS and Clam AntiVirus (mail.ntplx.net) X-Greylist: Message whitelisted by DRAC access database, not delayed by milter-greylist-3.0 (mail.ntplx.net [204.213.176.10]); Fri, 08 Jun 2007 12:52:08 -0400 (EDT) Date: Fri, 8 Jun 2007 12:52:07 -0400 (EDT) From: Daniel Eischen X-X-Sender: eischen@sea.ntplx.net To: Roman Divacky In-Reply-To: <20070608164817.GA28549@freebsd.org> Message-ID: References: <896DB1FBFFD5A145833D9DA08CA12A85051A7F@seaxch07.desktop.isilon.com> <20070606074429.GA42032@freebsd.org> <4666F0FB.8020101@FreeBSD.org> <20070607070455.GA71012@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A84@seaxch07.desktop.isilon.com> <20070607210313.GA603@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A87@seaxch07.desktop.isilon.com> <20070608161523.GB27624@freebsd.org> <20070608164817.GA28549@freebsd.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Eric Lemar , arch@freebsd.org Subject: Re: *at family of syscalls in FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: Daniel Eischen List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Jun 2007 16:52:09 -0000 On Fri, 8 
Jun 2007, Roman Divacky wrote: > On Fri, Jun 08, 2007 at 12:23:49PM -0400, Daniel Eischen wrote: >> I just don't want us to add native functions that don't conform >> to POSIX, mostly the API is what I am concerned about. We can >> change the behavior slightly to conform with whatever POSIX >> dictates, but we shouldn't knowingly introduce non-conforming >> APIs (because once 7.0 is released, we'll would always have to >> support both the non-conforming APIs as well as adding and >> supporting the conforming APIs). > > I have NOT implemented a single bit of native syscalls API and when I am > going to do it it will be 100% posix API (minus bugs ;) ) > > we can commit this in two phases: > > phase I: kern_fooat() + linux stuff > phase II: native fbsd syscalls > > I hope to resolve all the issues Eric raised over the weekend (hopefully) and > then it only needs a review(er) + a commiter > > I definitely want this in for 7.0R. > > does this sound good to you? No objection here :-) -- DE From owner-freebsd-arch@FreeBSD.ORG Fri Jun 8 17:26:36 2007 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 6767A16A400 for ; Fri, 8 Jun 2007 17:26:36 +0000 (UTC) (envelope-from asmrookie@gmail.com) Received: from ug-out-1314.google.com (ug-out-1314.google.com [66.249.92.168]) by mx1.freebsd.org (Postfix) with ESMTP id EE1F813C447 for ; Fri, 8 Jun 2007 17:26:35 +0000 (UTC) (envelope-from asmrookie@gmail.com) Received: by ug-out-1314.google.com with SMTP id u2so1143527uge for ; Fri, 08 Jun 2007 10:26:35 -0700 (PDT) DKIM-Signature: a=rsa-sha1; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:sender:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references:x-google-sender-auth; 
b=W0NXU3dBiLdm0R9Ccpb/92wmcukGxdLr/8PkT/1DkctUuI6FKiO0Lqf2Wq5IkN0baIlASUhJlCiVXnM+k2bMZCJ83WW5ZBOTpP3vZ+sHZtMnrri471ppl3c2vKpyjQsCA92qq3/1Ot+dGaDsp6yquBZsxd3LwtAYgM2FfiistV8= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:sender:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references:x-google-sender-auth; b=ZwsMLPTHYEI6nv4rb8hquedEycg0BwDxe0jaQqW+KTgZzYQA46IsO8a46ZvF07aXFxOxd6UpdffzxEsLbBWvOgWq5nhnPHhjvzV5DeCApvXQho3hgkwmEI649q6rytEUk8RaFyi/F9WLOabOzaqVYTu+cYJDf074us1XqFt/Ah4= Received: by 10.78.156.6 with SMTP id d6mr1349499hue.1181323595444; Fri, 08 Jun 2007 10:26:35 -0700 (PDT) Received: by 10.78.120.9 with HTTP; Fri, 8 Jun 2007 10:26:35 -0700 (PDT) Message-ID: <3bbf2fe10706081026l27bef70pd2d1d32c7e57d442@mail.gmail.com> Date: Fri, 8 Jun 2007 19:26:35 +0200 From: "Attilio Rao" Sender: asmrookie@gmail.com To: "Jeff Roberson" In-Reply-To: <20070607135511.P606@10.0.0.1> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <20070529105856.L661@10.0.0.1> <20070601154833.O4207@besplex.bde.org> <20070601014601.I799@10.0.0.1> <20070601200348.G6201@delplex.bde.org> <20070601123530.B606@10.0.0.1> <20070604160036.N1084@besplex.bde.org> <46652D17.5090903@FreeBSD.org> <20070605214404.X47001@delplex.bde.org> <20070606152352.H606@10.0.0.1> <20070607135511.P606@10.0.0.1> X-Google-Sender-Auth: 2f9a5fccf838ea54 Cc: freebsd-arch@freebsd.org Subject: Re: Updated rusage patch X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Jun 2007 17:26:36 -0000 2007/6/7, Jeff Roberson : > The patch at http://people.freebsd.org/~jeff/rusage3.diff fixes points 1 > and 2 as well as the p_runtime iniitialization problem. 
This moves the > collection of child rusage back into exit1() and changes the exiting > threads to accumulate their rusage into p_ru under protection of the > process spinlock. This also removes the gross lock/unlock of proc slock > (formerly sched_lock) from wait and implements something more sensible. I have a question: is it fair to assume that extra per-proc spinlock acquisitions/removals on the PRS_ZOMBIE state are orthogonal to this problem? They should belong to another 'fix', shouldn't they? Thanks, Attilio -- Peace can only be achieved by understanding - A. Einstein From owner-freebsd-arch@FreeBSD.ORG Fri Jun 8 17:55:01 2007 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 3936416A41F for ; Fri, 8 Jun 2007 17:55:01 +0000 (UTC) (envelope-from asmrookie@gmail.com) Received: from wr-out-0506.google.com (wr-out-0506.google.com [64.233.184.230]) by mx1.freebsd.org (Postfix) with ESMTP id D08B413C447 for ; Fri, 8 Jun 2007 17:55:00 +0000 (UTC) (envelope-from asmrookie@gmail.com) Received: by wr-out-0506.google.com with SMTP id 69so394482wra for ; Fri, 08 Jun 2007 10:55:00 -0700 (PDT) DKIM-Signature: a=rsa-sha1; c=relaxed/relaxed; d=gmail.com; s=beta; h=domainkey-signature:received:received:message-id:date:from:sender:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references:x-google-sender-auth; b=MtYpym+hDIlF3PKv/8MBI1mJuwk0V0v6KVmVyCflKIc/99mp34dU4gm+oo0ZrKPvGJfqfM99rDWhVXF35yP8UyBPcVSpU0y9ak20Rtq3dU87jdNyl5ivOt3xzjCki/CXP6bUoVlibaJYqXYg2kL83m1alTnAoLftXnPAngAl3fE= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=beta; h=received:message-id:date:from:sender:to:subject:cc:in-reply-to:mime-version:content-type:content-transfer-encoding:content-disposition:references:x-google-sender-auth; 
b=hoe4E06rbdATNjdReqTXdmwDFkIwuHs8a2vuayTQkKA85/dulG24YyPnM6xdbPzTjwoSWy3yvdwNrH5XW3PNdR3TMFHqx5d/0MZvuKLOEMV7rLC6fSTurKO7yhQHWiXvCrxivAjcHuZGeI6f6MMjz8dsttZM1WGyVeRTsk1bMag= Received: by 10.78.172.20 with SMTP id u20mr1364321hue.1181325299463; Fri, 08 Jun 2007 10:54:59 -0700 (PDT) Received: by 10.78.120.9 with HTTP; Fri, 8 Jun 2007 10:54:59 -0700 (PDT) Message-ID: <3bbf2fe10706081054ob030862u58c123814510398@mail.gmail.com> Date: Fri, 8 Jun 2007 19:54:59 +0200 From: "Attilio Rao" Sender: asmrookie@gmail.com To: "Jeff Roberson" In-Reply-To: <3bbf2fe10706081026l27bef70pd2d1d32c7e57d442@mail.gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Content-Disposition: inline References: <20070529105856.L661@10.0.0.1> <20070601014601.I799@10.0.0.1> <20070601200348.G6201@delplex.bde.org> <20070601123530.B606@10.0.0.1> <20070604160036.N1084@besplex.bde.org> <46652D17.5090903@FreeBSD.org> <20070605214404.X47001@delplex.bde.org> <20070606152352.H606@10.0.0.1> <20070607135511.P606@10.0.0.1> <3bbf2fe10706081026l27bef70pd2d1d32c7e57d442@mail.gmail.com> X-Google-Sender-Auth: 615fa2433c3aefd3 Cc: freebsd-arch@freebsd.org Subject: Re: Updated rusage patch X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Jun 2007 17:55:01 -0000 2007/6/8, Attilio Rao : > 2007/6/7, Jeff Roberson : > > The patch at http://people.freebsd.org/~jeff/rusage3.diff fixes points 1 > > and 2 as well as the p_runtime iniitialization problem. This moves the > > collection of child rusage back into exit1() and changes the exiting > > threads to accumulate their rusage into p_ru under protection of the > > process spinlock. This also removes the gross lock/unlock of proc slock > > (formerly sched_lock) from wait and implements something more sensible. 
> > I have a question: > is it fair to assume that extra per-proc spinlock > acquisitions/removals on the PRS_ZOMBIE state are orthogonal to this > problem? They should belong to another 'fix', shouldn't they? Mm, now I see that you could nicely protect PRS_ZOMBIE through PROC_LOCK since p_state is marked (j/c), no? (it is already acquired when checking for it). Attilio -- Peace can only be achieved by understanding - A. Einstein From owner-freebsd-arch@FreeBSD.ORG Fri Jun 8 15:54:47 2007 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id F004516A469 for ; Fri, 8 Jun 2007 15:54:47 +0000 (UTC) (envelope-from pfgshield-freebsd@yahoo.com) Received: from web32715.mail.mud.yahoo.com (web32715.mail.mud.yahoo.com [68.142.206.28]) by mx1.freebsd.org (Postfix) with SMTP id 9323613C447 for ; Fri, 8 Jun 2007 15:54:47 +0000 (UTC) (envelope-from pfgshield-freebsd@yahoo.com) Received: (qmail 72642 invoked by uid 60001); 8 Jun 2007 15:28:07 -0000 DomainKey-Signature: a=rsa-sha1; q=dns; c=nofws; s=s1024; d=yahoo.com; h=X-YMail-OSG:Received:Date:From:Reply-To:Subject:To:Cc:MIME-Version:Content-Type:Content-Transfer-Encoding:Message-ID; b=PV1gBvFrQE5/viE6vK9F9Zp1mr5nnc/TlMf0v1hM4Q3G3fyH4eTeI1UXvtOkEcjCBkrRY2Wlkm/YG+NQvUZiEnvC3KmP5gLhSQ7ojOGjltSfAx1xpPcZExVOZMTwbW7GwNPvjOfPivhf+al3ovP9qXGGgfJIXiSPklY/IHt6Dog=; X-YMail-OSG: XLxRDjcVM1leepkY5vpXbtRMFhBH9GjbsW9ozqmO67rhYiq.65c8ejPa9Dgk0Rt3GivifUPMpp737HfDP8Vp47bC6Z76bhEon3sJusPFWnTgKhxEgik- Received: from [200.118.173.177] by web32715.mail.mud.yahoo.com via HTTP; Fri, 08 Jun 2007 17:28:07 CEST Date: Fri, 8 Jun 2007 17:28:07 +0200 (CEST) From: To: freebsd-arch@freebsd.org MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit Message-ID: <215400.72345.qm@web32715.mail.mud.yahoo.com> X-Mailman-Approved-At: Fri, 08 Jun 2007 18:19:45 +0000 Cc: Daniel Eischen , 
Eric Lemar Subject: Re: *at family of syscalls in FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list Reply-To: pfgshield-freebsd@yahoo.com List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Jun 2007 15:54:48 -0000 Hello; Perhaps FreeBSD should take the same cautious approach of Solaris. Quoting Casper at: http://www.opensolaris.org/jive/thread.jspa?threadID=31708&tstart=0 " ... Not until it's clear there is a consensus about the functions. Some of these had already been added in Solaris 2.6, I think, where they originate. For now we have: fchownat fstatat futimesat unlinkat openat __accessat (which was added waiting for the final POSIX standard) renameat These functions are relatively straightforward to implement in the kernel and it is probably best to implement what is missing as private functions until the standard has gravitated toward consensus. There's some discussion about the 'f' argument." 
From owner-freebsd-arch@FreeBSD.ORG Fri Jun 8 21:12:06 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8719C16A46F for ; Fri, 8 Jun 2007 21:12:06 +0000 (UTC) (envelope-from des@des.no) Received: from tim.des.no (tim.des.no [194.63.250.121]) by mx1.freebsd.org (Postfix) with ESMTP id 444E413C44C for ; Fri, 8 Jun 2007 21:12:06 +0000 (UTC) (envelope-from des@des.no) Received: from tim.des.no (localhost [127.0.0.1]) by spam.des.no (Postfix) with ESMTP id EF9F520A5; Fri, 8 Jun 2007 22:56:27 +0200 (CEST) X-Spam-Tests: AWL X-Spam-Learn: disabled X-Spam-Score: 0.0/3.0 X-Spam-Checker-Version: SpamAssassin 3.2.0 (2007-05-01) on tim.des.no Received: from dwp.des.no (des.no [80.203.243.180]) by smtp.des.no (Postfix) with ESMTP id 827F320A4; Fri, 8 Jun 2007 22:56:27 +0200 (CEST) Received: by dwp.des.no (Postfix, from userid 1001) id 8343D57C1; Fri, 8 Jun 2007 22:56:38 +0200 (CEST) From: =?utf-8?Q?Dag-Erling_Sm=C3=B8rgrav?= To: Roman Divacky References: <20070604162430.GA76813@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A7F@seaxch07.desktop.isilon.com> <20070606074429.GA42032@freebsd.org> <4666F0FB.8020101@FreeBSD.org> <20070607070455.GA71012@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A84@seaxch07.desktop.isilon.com> <20070607210313.GA603@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A87@seaxch07.desktop.isilon.com> <20070608161322.GA27624@freebsd.org> Date: Fri, 08 Jun 2007 22:56:38 +0200 In-Reply-To: <20070608161322.GA27624@freebsd.org> (Roman Divacky's message of "Fri\, 8 Jun 2007 18\:13\:22 +0200") Message-ID: <86myzadv21.fsf@dwp.des.no> User-Agent: Gnus/5.110006 (No Gnus v0.6) Emacs/21.3 (berkeley-unix) MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Cc: Eric Lemar , arch@freebsd.org Subject: Re: *at family of 
syscalls in FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 08 Jun 2007 21:12:06 -0000 Roman Divacky writes: > On Thu, Jun 07, 2007 at 05:56:39PM -0700, Eric Lemar wrote: > > Obviously I prefer the wrapping, but I'm just a tad biased :) > well.. unless I hear some strong voice to change what I have I am not > changing it. it can always be changed in future. I would strongly urge you to consider following Eric's suggestion; it is conceptually far cleaner. DES -- Dag-Erling Smørgrav - des@des.no From owner-freebsd-arch@FreeBSD.ORG Sat Jun 9 10:36:13 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 7646C16A41F for ; Sat, 9 Jun 2007 10:36:13 +0000 (UTC) (envelope-from rdivacky@vlk.vlakno.cz) Received: from vlakno.cz (vlk.vlakno.cz [62.168.28.247]) by mx1.freebsd.org (Postfix) with ESMTP id 3016D13C447 for ; Sat, 9 Jun 2007 10:36:12 +0000 (UTC) (envelope-from rdivacky@vlk.vlakno.cz) Received: from localhost (localhost [127.0.0.1]) by vlakno.cz (Postfix) with ESMTP id 958898BDE41; Sat, 9 Jun 2007 12:36:11 +0200 (CEST) X-Virus-Scanned: amavisd-new at vlakno.cz Received: from vlakno.cz ([127.0.0.1]) by localhost (vlk.vlakno.cz [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id kJX+qFWiMU0o; Sat, 9 Jun 2007 12:36:10 +0200 (CEST) Received: from vlk.vlakno.cz (localhost [127.0.0.1]) by vlakno.cz (Postfix) with ESMTP id 8833D8BDE09; Sat, 9 Jun 2007 12:36:10 +0200 (CEST) Received: (from rdivacky@localhost) by vlk.vlakno.cz (8.13.8/8.13.8/Submit) id l59Aa9Xl052325; Sat, 9 Jun 2007 12:36:09 +0200 (CEST) (envelope-from rdivacky) Date: Sat, 9 Jun 2007 12:36:09 +0200 From: Roman Divacky To: Dag-Erling Smørgrav Message-ID: 
<20070609103609.GA52234@freebsd.org> References: <20070604162430.GA76813@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A7F@seaxch07.desktop.isilon.com> <20070606074429.GA42032@freebsd.org> <4666F0FB.8020101@FreeBSD.org> <20070607070455.GA71012@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A84@seaxch07.desktop.isilon.com> <20070607210313.GA603@freebsd.org> <896DB1FBFFD5A145833D9DA08CA12A85051A87@seaxch07.desktop.isilon.com> <20070608161322.GA27624@freebsd.org> <86myzadv21.fsf@dwp.des.no> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <86myzadv21.fsf@dwp.des.no> User-Agent: Mutt/1.4.2.3i Cc: Eric Lemar , arch@freebsd.org Subject: Re: *at family of syscalls in FreeBSD X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 09 Jun 2007 10:36:13 -0000 On Fri, Jun 08, 2007 at 10:56:38PM +0200, Dag-Erling Smørgrav wrote: > Roman Divacky writes: > > On Thu, Jun 07, 2007 at 05:56:39PM -0700, Eric Lemar wrote: > > > Obviously I prefer the wrapping, but I'm just a tad biased :) > > well.. unless I hear some strong voice to change what I have I am not > > changing it. it can always be changed in future. > > I would strongly urge you to consider following Eric's suggestion; it is > conceptually far cleaner. ok, that was the strong voice I wanted to hear :) the patch is here: www.vlakno.cz/~rdivacky/linux_at.patch changes: I consistently use the pattern of having kern_foo() { return kern_fooat(..., AT_FDCWD); } with kern_fooat() being the complete syscall implementation that calls kern_get_at(). There is no kern_common_foo() anymore. the only remaining comment from Eric is about the flags parameter, but I don't think we should implement it right now as we don't have any use for it, so the patch is ok as it is. 
we will add the flags parameter once we implement some functionality using it.

comments?

roman

From owner-freebsd-arch@FreeBSD.ORG Sat Jun 9 08:36:58 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 5B0FA16A400; Sat, 9 Jun 2007 08:36:58 +0000 (UTC) (envelope-from dillon@apollo.backplane.com) Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by mx1.freebsd.org (Postfix) with ESMTP id 358BF13C447; Sat, 9 Jun 2007 08:36:58 +0000 (UTC) (envelope-from dillon@apollo.backplane.com) Received: from apollo.backplane.com (localhost [127.0.0.1]) by apollo.backplane.com (8.13.8/8.13.7) with ESMTP id l598Pv2q020258; Sat, 9 Jun 2007 01:25:57 -0700 (PDT) Received: (from dillon@localhost) by apollo.backplane.com (8.13.8/8.13.4/Submit) id l598Pval020257; Sat, 9 Jun 2007 01:25:57 -0700 (PDT) Date: Sat, 9 Jun 2007 01:25:57 -0700 (PDT) From: Matthew Dillon Message-Id: <200706090825.l598Pval020257@apollo.backplane.com> To: Jeff Roberson References: <20070604220649.E606@10.0.0.1> X-Mailman-Approved-At: Sat, 09 Jun 2007 11:23:15 +0000 Cc: kmacy@freebsd.org, benno@freebsd.org, marius@freebsd.org, marcl@freebsd.org, arch@freebsd.org, jake@freebsd.org, tmm@freebsd.org, cognet@freebsd.org, grehan@freebsd.org Subject: Re: New cpu_switch() and cpu_throw(). X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 09 Jun 2007 08:36:58 -0000

I haven't read your code to see what you've done exactly but I had a similar issue in DragonFly and I solved it by having a bit of code that cleared a bit in the previous thread's flags AFTER having switched to the new thread. e.g.
the switch restore code for the new thread (%eax) is also responsible for cleaning up the old thread (%ebx):

	/*
	 * Clear TDF_RUNNING flag in old thread only after cleaning up
	 * %cr3. The target thread is already protected by being TDF_RUNQ
	 * so setting TDF_RUNNING isn't as big a deal.
	 */
	andl	$~TDF_RUNNING,TD_FLAGS(%ebx)
	orl	$TDF_RUNNING,TD_FLAGS(%eax)

	(from /usr/src/sys/platform/pc32/i386/swtch.s on DragonFly)

The exit code can then interlock on TDF_RUNNING without there being a race against the old thread's stack. The condition occurred so rarely I didn't even bother using a lock... I just test it in exit (well, really the reaper) and sleep for one tick in a loop if the race was detected.

The same feature is used for thread migration between cpus... the target cpu gets an IPI message with the thread being migrated, and spins waiting for TDF_RUNNING to clear, indicating that the thread has been completely switched out by the originating cpu.

FreeBSD uses a more complex arrangement so it might not be applicable, but it is still a nice trick.

-Matt
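The interlock described above can be sketched in portable C roughly as follows. This is an illustration under stated assumptions, not DragonFly's code: struct demo_thread, DEMO_TDF_RUNNING, and the demo_* functions are hypothetical names, C11 atomics stand in for the plain andl/orl on per-thread flags, and the "sleep one tick" step is a caller-supplied callback.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_TDF_RUNNING 0x1u	/* hypothetical stand-in for TDF_RUNNING */

struct demo_thread {
	atomic_uint flags;
};

/*
 * Run on behalf of the NEW thread once the switch is complete: only now
 * is the old thread marked as no longer running.  The release ordering
 * publishes everything done on the old thread's stack before the clear.
 */
static void
demo_switch_finish(struct demo_thread *oldtd, struct demo_thread *newtd)
{
	atomic_fetch_and_explicit(&oldtd->flags, ~DEMO_TDF_RUNNING,
	    memory_order_release);
	atomic_fetch_or_explicit(&newtd->flags, DEMO_TDF_RUNNING,
	    memory_order_relaxed);
}

/* True once the thread has been completely switched out. */
static bool
demo_switched_out(struct demo_thread *td)
{
	return ((atomic_load_explicit(&td->flags, memory_order_acquire) &
	    DEMO_TDF_RUNNING) == 0);
}

/*
 * Reaper side: poll the flag and wait one "tick" per iteration while the
 * old thread is still switching out.  Returns the number of waits taken.
 */
static unsigned
demo_reap_wait(struct demo_thread *td, void (*tick)(void *), void *arg)
{
	unsigned waits = 0;

	while (!demo_switched_out(td)) {
		if (tick != NULL)
			tick(arg);
		waits++;
	}
	return (waits);
}
```

A migration target could use demo_switched_out() the same way, spinning instead of sleeping; since the race is described as rare, the loop is expected to exit immediately in almost every case.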