From owner-freebsd-arch Sun Sep 15 13:43:39 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id AEE8637B400; Sun, 15 Sep 2002 13:43:38 -0700 (PDT) Received: from procyon.firepipe.net (procyon.firepipe.net [198.78.66.151]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4A76F43E6E; Sun, 15 Sep 2002 13:43:38 -0700 (PDT) (envelope-from will@csociety.org) Received: by procyon.firepipe.net (Postfix, from userid 1000) id 9706923DFD; Sun, 15 Sep 2002 13:43:16 -0700 (PDT) Date: Sun, 15 Sep 2002 13:43:16 -0700 From: Will Andrews To: Maxim Sobolev Cc: Will Andrews , Simon 'corecode' Schubert , Wes Peters , ports@FreeBSD.org, arch@FreeBSD.org Subject: Re: package tools into ports/ (was: Re: Bzipped?) Message-ID: <20020915204316.GT91593@procyon.firepipe.net> Mail-Followup-To: Maxim Sobolev , Will Andrews , Simon 'corecode' Schubert , Wes Peters , ports@FreeBSD.org, arch@FreeBSD.org References: <20020902103215.36ae8e3b.corecode@corecode.ath.cx> <20020902085654.GH2072@procyon.firepipe.net> <3D7445D3.DAA2C9B9@softweyr.com> <20020903100258.068fb3ab.corecode@corecode.ath.cx> <20020903121413.GN2072@procyon.firepipe.net> <20020903130237.GB8010@vega.vega.com> <20020903173228.GO2072@procyon.firepipe.net> <20020903221613.GC9384@vega.vega.com> <20020903221800.GE48750@procyon.firepipe.net> <3D7DFA6B.566DC85B@FreeBSD.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3D7DFA6B.566DC85B@FreeBSD.org> User-Agent: Mutt/1.4i Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Tue, Sep 10, 2002 at 04:58:03PM +0300, Maxim Sobolev wrote: > 1. 
Hypothetical pkg_install port should have sources in place, so that > it is available right after cvsup. Of course. > 2. Once installed, the port shouldn't register itself with > /var/db/pkg, because (a) we don't want a zillion versions of > pkg_install-xx.yy.zz cluttering /var/db/pkg and (b) no installed > package should depend on a specific pkg_install package - it's a ports- > only thing. I'd suggest to separate pkg_install version > checking/updating routine into special target in bsd.port.mk and put > this target even before pre-everything in _FETCH_SEQ. I don't think it is necessary to have this kind of hack. I will be working on this later this week and will post a patch to ports@ for consideration. Can you see about whether the patch you committed can be MFC'd for 4.7? re. -- wca To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Sep 16 12:45:13 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8ED5C37B400; Mon, 16 Sep 2002 12:45:09 -0700 (PDT) Received: from InterJet.dellroad.org (adsl-63-194-81-26.dsl.snfc21.pacbell.net [63.194.81.26]) by mx1.FreeBSD.org (Postfix) with ESMTP id CEF5A43E4A; Mon, 16 Sep 2002 12:45:08 -0700 (PDT) (envelope-from archie@dellroad.org) Received: from arch20m.dellroad.org (arch20m.dellroad.org [10.1.1.20]) by InterJet.dellroad.org (8.9.1a/8.9.1) with ESMTP id MAA09414; Mon, 16 Sep 2002 12:40:51 -0700 (PDT) Received: (from archie@localhost) by arch20m.dellroad.org (8.11.6/8.11.6) id g8GJdlj87372; Mon, 16 Sep 2002 12:39:47 -0700 (PDT) (envelope-from archie) From: Archie Cobbs Message-Id: <200209161939.g8GJdlj87372@arch20m.dellroad.org> Subject: Re: cvs commit: src/sys/kern kern_timeout.c In-Reply-To: <84439.1032204014@critter.freebsd.dk> "from Poul-Henning Kamp at Sep 16, 2002 09:20:14 pm" To: Poul-Henning Kamp Date: Mon, 16 Sep 2002 12:39:47 -0700 
(PDT) Cc: freebsd-arch@freebsd.org X-Mailer: ELM [version 2.4ME+ PL88 (25)] MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Poul-Henning Kamp writes: > >Would an option to timeout() like SPAWN_SEPARATE_THREAD be a practical > >solution for some of these cases? I.e., optionally spawn a separate > >thread to handle the timeout() event. > > > >This may be expensive, but there may also be some timeout events that > >are rare, slow and expensive enough themselves to warrant using it. > > I'm not sure, this (or anything else) is the way to go. > > I have wondered if periodic events should be handled differently, > or at least separately from one-shots, but that is also just an > idea. > > I think what we need more than anything, is somebody gathering more > hard data and analyzing more... Certainly true.. and it may be the case that all uses of timeout() in the kernel today are "quick" and non-blocking enough that it's never necessary to spawn a new thread. However, "in general" when you have an event handler situation like this, there is no necessary reason to expect the handlers to all be "quick" functions. So it may be that by expanding the ability of timeout() to handle the more general case, we make it easier to use for other tasks. Just a possibility that would have to be put to the test of course. IMHO, I think there should be a much more generic event handler mechanism in the kernel that supports timeout events, user-defined events, etc., optional thread spawning, and user locking semantics that automatically handles race conditions. This has proven very helpful to me in the past, resulting in much simpler code. I have an example of this kind of API in mind, see the pevent(3) man page in libpdel (ports:devel/libpdel). 
Follow-ups to freebsd-arch.. -Archie __________________________________________________________________________ Archie Cobbs * Packet Design * http://www.packetdesign.com To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Sep 17 3:31:16 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id DFA4937B401 for ; Tue, 17 Sep 2002 03:31:14 -0700 (PDT) Received: from gw.catspoiler.org (217-ip-163.nccn.net [209.79.217.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6C5D643E4A for ; Tue, 17 Sep 2002 03:31:14 -0700 (PDT) (envelope-from dl-freebsd@catspoiler.org) Received: from mousie.catspoiler.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.12.5/8.12.5) with ESMTP id g8HAV5wr015790; Tue, 17 Sep 2002 03:31:09 -0700 (PDT) (envelope-from dl-freebsd@catspoiler.org) Message-Id: <200209171031.g8HAV5wr015790@gw.catspoiler.org> Date: Tue, 17 Sep 2002 03:31:05 -0700 (PDT) From: Don Lewis Subject: VOP_INACTIVE() To: arch@FreeBSD.org Cc: Ian Dowse , Terry Lambert MIME-Version: 1.0 Content-Type: TEXT/plain; charset=us-ascii Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG VOP_INACTIVE() is currently called by vput() and vrele() after the vnode reference count has been decremented to zero. This is messy because VOP_INACTIVE() may hang onto a reference to the vnode for an extended period of time while it does cleanup I/O. Judging by the comments in vget() and vclean(), this architectural oddity appears to have adverse consequences in other parts of the code. To work around some of the brokenness, nfs_inactive() temporarily bumps the reference count, but unfortunately it calls vrele() recursively to decrement the reference count, which results in a panic. 
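The re-entry described above can be shown with a small userland toy. This is only a sketch: struct demo_vnode, demo_vrele() and demo_inactive() are made-up names, not kernel code, and the toy counts the re-entry instead of reproducing the actual panic.

```c
#include <assert.h>

/* Hypothetical stand-in for a vnode; not the kernel structure. */
struct demo_vnode {
	int usecount;		/* reference count */
	int in_inactive;	/* set while the inactive hook is running */
	int reentries;		/* how often the hook was entered recursively */
};

static void demo_inactive(struct demo_vnode *vp);

/* Drop a reference; the inactive hook runs once the count is zero. */
void
demo_vrele(struct demo_vnode *vp)
{
	assert(vp->usecount > 0);
	if (--vp->usecount == 0)
		demo_inactive(vp);
}

/* Models an inactive method that holds the vnode during cleanup I/O. */
static void
demo_inactive(struct demo_vnode *vp)
{
	if (vp->in_inactive) {
		/* The toy only records the recursive entry. */
		vp->reentries++;
		return;
	}
	vp->in_inactive = 1;
	vp->usecount++;		/* temporarily bump the count */
	/* ... cleanup I/O would happen here ... */
	demo_vrele(vp);		/* 1 -> 0 again, so the hook is re-entered */
	vp->in_inactive = 0;
}
```

Starting from a count of one, a single demo_vrele() call enters the hook twice, which is the recursion the paragraph above complains about.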
A few different fixes have been proposed. The first fix is to set a flag on the vnode to indicate that VOP_INACTIVE() is playing with the vnode, so the vnode should not be reused even though its reference count is zero. The second fix is to call VOP_INACTIVE() before the reference count is decremented. The main problem with this is that some of the things done in the various filesystem inactive methods may depend strongly on the reference count being zero. One example is the call to vrecycle() in ufs_inactive(). A third fix would be to split VOP_INACTIVE() into two parts, one which is called to do any I/O before the reference count is decremented, and the other which does any filesystem-specific cleanup after the reference count is decremented. Opinions? To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Sep 17 3:38:46 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D1AA337B401 for ; Tue, 17 Sep 2002 03:38:45 -0700 (PDT) Received: from frontend2.aha.ru (bird.zenon.net [213.189.198.215]) by mx1.FreeBSD.org (Postfix) with ESMTP id 99D9143E86 for ; Tue, 17 Sep 2002 03:38:44 -0700 (PDT) (envelope-from uitm@zenon.net) Received: from [195.2.83.132] (HELO backend2.aha.ru) by frontend2.aha.ru (CommuniGate Pro SMTP 4.0b6) with ESMTP id 143910567; Tue, 17 Sep 2002 14:38:43 +0400 Received: from uitm.zenon.net ([195.2.69.86] verified) by backend2.aha.ru (CommuniGate Pro SMTP 4.0b6) with ESMTP id 29854609; Tue, 17 Sep 2002 14:38:43 +0400 From: Andrey Alekseyev Message-Id: <200209171038.g8HAcgQ39991@uitm.zenon.net> Subject: Re: VOP_INACTIVE() In-Reply-To: <200209171031.g8HAV5wr015790@gw.catspoiler.org> from Don Lewis at "Sep 17, 2002 03:31:05 am" To: Don Lewis Date: Tue, 17 Sep 2002 14:38:42 +0400 (MSD) Cc: arch@FreeBSD.org, Ian Dowse , Terry Lambert X-Mailer: ELM [version 2.4ME+ PL61 (25)] 
MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG > The second fix is to call VOP_INACTIVE() before the reference count is > decremented. The main problem with this is that some of the things done Btw, Solaris does that in vn_rele() -- Andrey Alekseyev. Zenon N.S.P. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Sep 17 7:28:48 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 7A98937B401 for ; Tue, 17 Sep 2002 07:28:46 -0700 (PDT) Received: from albatross.prod.itd.earthlink.net (albatross.mail.pas.earthlink.net [207.217.120.120]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1598243E3B for ; Tue, 17 Sep 2002 07:28:46 -0700 (PDT) (envelope-from tlambert2@mindspring.com) Received: from pool0243.cvx22-bradley.dialup.earthlink.net ([209.179.198.243] helo=mindspring.com) by albatross.prod.itd.earthlink.net with esmtp (Exim 3.33 #1) id 17rJLC-0004BN-00; Tue, 17 Sep 2002 07:28:38 -0700 Message-ID: <3D873BD7.35610322@mindspring.com> Date: Tue, 17 Sep 2002 07:27:35 -0700 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Don Lewis Cc: arch@FreeBSD.org, Ian Dowse Subject: Re: VOP_INACTIVE() References: <200209171031.g8HAV5wr015790@gw.catspoiler.org> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Don Lewis wrote: > The first fix is to set a flag on the vnode to indicate that > VOP_INACTIVE() is playing with the vnode, 
so the vnode should not be > reused even though its reference count is zero. This is what used to happen; the flag was V_LOCK (or something like that). There were a couple of instances of inversion in its use, and it was hard to keep straight. > The second fix is to call VOP_INACTIVE() before the reference count is > decremented. The main problem with this is that some of the things done > in the various filesystem inactive methods may depend strongly on the > reference count being zero. One example is the call to vrecycle() in > ufs_inactive(). > > A third fix would be to split VOP_INACTIVE() into two parts, one which > is called to do any I/O before the reference count is decremented, and > the other which does any filesystem specific cleanup after the reference > count is decremented. > > Opinions? I do not like the third option. The logical thing to do would be to not assume anything about the reference count, and that VOP_INACTIVE() *means* VOP_INACTIVE(). This is a modification of the second option. Obviously, you do not want to drop the reference in the higher level code, until you are certain that the reference has been dropped by the lower level code. The real issue here is that the VOP_INACTIVE() is a "give the vnode back to the system" operation, and gives proxy ownership of the vnode to the VFS. This is right. What's wrong is that the vnode that comes out does not have a reference which is owned by the VFS. If this is corrected, you can call VOP_INACTIVE(), and expect that the reference count will be decremented, and that ownership of the vnode will *not* necessarily pass back to the system. The way this works is to set up a reference count management via procedure call, making this a procedural interface. This would be a system function (e.g.: "grab a vnode, get a vnode with a reference count of one back, dispose of a vnode of reference count 1, and have the 1->0 transition free the vnode back to the system"). 
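A minimal sketch of the procedural interface described above, as a userland toy. All names here (toy_vnode, toy_vnode_grab, toy_vnode_ref, toy_vnode_rele) are invented for illustration and are not FreeBSD functions; malloc()/free() stand in for the system-wide vnode pool.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical vnode stand-in; only what the sketch needs. */
struct toy_vnode {
	int refs;		/* references held, including the VFS's own */
	int *freed_flag;	/* optional hook: set to 1 when freed */
};

/* "Grab a vnode": hand back a vnode with a reference count of one. */
struct toy_vnode *
toy_vnode_grab(int *freed_flag)
{
	struct toy_vnode *vp = malloc(sizeof(*vp));

	if (vp == NULL)
		return (NULL);
	vp->refs = 1;
	vp->freed_flag = freed_flag;
	return (vp);
}

/* Take an additional reference (an open instance, a cache entry, ...). */
void
toy_vnode_ref(struct toy_vnode *vp)
{
	assert(vp->refs > 0);
	vp->refs++;
}

/*
 * Drop one reference.  Only the 1->0 transition gives the vnode back
 * to the allocator.  Returns 1 if the vnode was freed, 0 otherwise.
 */
int
toy_vnode_rele(struct toy_vnode *vp)
{
	assert(vp->refs > 0);
	if (--vp->refs > 0)
		return (0);
	if (vp->freed_flag != NULL)
		*vp->freed_flag = 1;
	free(vp);
	return (1);
}
```

In this arrangement nothing outside the three functions looks at the count directly, so callers cannot depend on it being zero at any particular moment.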
For this to really work, open instances need to be references... and directory cache instances *also* need to be references. This is easier, if you think of vnode ownership being with the file system, instead of with the OS, and that the OS-wide pool is really nothing more than a memory manager implementation detail for the vnode structure allocation and free. The TFS vnode references are a good example for this, if they are still around, because in the TFS code, the vnode *was* owned and managed by the VFS (the same would be true for any vnodes for binary SVR4 or other VFS modules that were run in a FreeBSD environment, FWIW). -- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Sep 17 15:31:37 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 15D8C37B401 for ; Tue, 17 Sep 2002 15:31:37 -0700 (PDT) Received: from gw.catspoiler.org (217-ip-163.nccn.net [209.79.217.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id 59E5C43E81 for ; Tue, 17 Sep 2002 15:31:36 -0700 (PDT) (envelope-from dl-freebsd@catspoiler.org) Received: from mousie.catspoiler.org (mousie.catspoiler.org [192.168.101.2]) by gw.catspoiler.org (8.12.5/8.12.5) with ESMTP id g8HMVLwr017242; Tue, 17 Sep 2002 15:31:26 -0700 (PDT) (envelope-from dl-freebsd@catspoiler.org) Message-Id: <200209172231.g8HMVLwr017242@gw.catspoiler.org> Date: Tue, 17 Sep 2002 15:31:21 -0700 (PDT) From: Don Lewis Subject: Re: VOP_INACTIVE() To: uitm@zenon.net Cc: dl-freebsd@catspoiler.org, arch@FreeBSD.ORG, iedowse@maths.tcd.ie, tlambert2@mindspring.com In-Reply-To: <200209171038.g8HAcgQ39991@uitm.zenon.net> MIME-Version: 1.0 Content-Type: TEXT/plain; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: 
FreeBSD.ORG On 17 Sep, Andrey Alekseyev wrote: >> The second fix is to call VOP_INACTIVE() before the reference count is >> decremented. The main problem with this is that some of the things done > > Btw, Solaris does that in vn_rele() I took a peek at NetBSD and they go to the opposite extreme. Vput() and vrele() call VOP_INACTIVE() after the vnode has been put on the free list. It looks like they rely on the vnode lock being held to prevent bad stuff from happening. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Sep 18 6:36: 4 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C088937B401 for ; Wed, 18 Sep 2002 06:36:03 -0700 (PDT) Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id A33DC43E81 for ; Wed, 18 Sep 2002 06:36:02 -0700 (PDT) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.6/8.12.6) with ESMTP id g8IDa1tF001970 for ; Wed, 18 Sep 2002 15:36:01 +0200 (CEST) (envelope-from phk@critter.freebsd.dk) To: arch@freebsd.org Subject: What is "LIBMCHAIN" and why is it in the tree ? From: Poul-Henning Kamp Date: Wed, 18 Sep 2002 15:36:01 +0200 Message-ID: <1969.1032356161@critter.freebsd.dk> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. 
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Sep 18 7:33:37 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3267437B401; Wed, 18 Sep 2002 07:33:37 -0700 (PDT) Received: from salmon.maths.tcd.ie (salmon.maths.tcd.ie [134.226.81.11]) by mx1.FreeBSD.org (Postfix) with SMTP id 8B51F43E6E; Wed, 18 Sep 2002 07:33:34 -0700 (PDT) (envelope-from iedowse@maths.tcd.ie) Received: from walton.maths.tcd.ie by salmon.maths.tcd.ie with SMTP id ; 18 Sep 2002 15:33:32 +0100 (BST) To: Poul-Henning Kamp Cc: arch@freebsd.org Subject: Re: What is "LIBMCHAIN" and why is it in the tree ? In-Reply-To: Your message of "Wed, 18 Sep 2002 15:36:01 +0200." <1969.1032356161@critter.freebsd.dk> Date: Wed, 18 Sep 2002 15:33:32 +0100 From: Ian Dowse Message-ID: <200209181533.aa71116@salmon.maths.tcd.ie> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG In message <1969.1032356161@critter.freebsd.dk>, Poul-Henning Kamp writes: > What is "LIBMCHAIN" and why is it in the tree ? As the cvs history will tell you, it is a set of routines for building and parsing mbuf chains. It is useful for processing requests and replies in vaguely RPC-like protocols, and it's in the tree because nwfs and smbfs use it (I think it was made an optional component to avoid the small extra bloat in kernels that don't use these). The NFS code could probably also benefit from using it. 
Ian To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Sep 18 8: 7:20 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 7AA6437B401 for ; Wed, 18 Sep 2002 08:07:13 -0700 (PDT) Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8827D43E6A for ; Wed, 18 Sep 2002 08:07:12 -0700 (PDT) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.6/8.12.6) with ESMTP id g8IF7AtF003186 for ; Wed, 18 Sep 2002 17:07:11 +0200 (CEST) (envelope-from phk@critter.freebsd.dk) To: arch@freebsd.org Subject: Trivial mbuf patch for review. From: Poul-Henning Kamp Date: Wed, 18 Sep 2002 17:07:10 +0200 Message-ID: <3185.1032361630@critter.freebsd.dk> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG This patch is a no-op which replaces local mbuf-chain counting loops with calls to m_length() and in one case m_fixhdr(). 
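For readers who do not have the two helpers in their head, here is a rough userland sketch of what they compute. It assumes a cut-down stand-in structure (struct toy_mbuf, with m_pktlen standing in for m_pkthdr.len); the toy_ names are made up and this is not the kernel source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mbuf: only the fields used here. */
struct toy_mbuf {
	struct toy_mbuf *m_next;	/* next buffer in the chain */
	int m_len;			/* bytes of data in this buffer */
	int m_pktlen;			/* stand-in for m_pkthdr.len */
};

/*
 * Sum m_len over the chain.  If last is non-NULL, also hand back a
 * pointer to the final buffer (NULL for an empty chain).
 */
unsigned int
toy_m_length(struct toy_mbuf *m0, struct toy_mbuf **last)
{
	struct toy_mbuf *m;
	unsigned int len = 0;

	for (m = m0; m != NULL; m = m->m_next) {
		len += m->m_len;
		if (m->m_next == NULL)
			break;		/* keep m pointing at the last buffer */
	}
	if (last != NULL)
		*last = m;
	return (len);
}

/* Recompute the packet-header length from the chain itself. */
unsigned int
toy_m_fixhdr(struct toy_mbuf *m0)
{
	assert(m0 != NULL);
	m0->m_pktlen = (int)toy_m_length(m0, NULL);
	return ((unsigned int)m0->m_pktlen);
}
```

Callers that only want the byte count pass NULL as the second argument; callers that go on to append to the chain pass the address of a pointer and get the tail back from the same walk.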
Index: kern/uipc_socket2.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/uipc_socket2.c,v
retrieving revision 1.103
diff -u -r1.103 uipc_socket2.c
--- kern/uipc_socket2.c	16 Aug 2002 18:41:48 -0000	1.103
+++ kern/uipc_socket2.c	18 Sep 2002 14:08:34 -0000
@@ -498,11 +498,11 @@
 #ifdef SOCKBUF_DEBUG
 void
 sbcheck(sb)
-	register struct sockbuf *sb;
+	struct sockbuf *sb;
 {
-	register struct mbuf *m;
-	register struct mbuf *n = 0;
-	register u_long len = 0, mbcnt = 0;
+	struct mbuf *m;
+	struct mbuf *n = 0;
+	u_long len = 0, mbcnt = 0;
 
 	for (m = sb->sb_mb; m; m = n) {
 		n = m->m_nextpkt;
@@ -610,22 +610,18 @@
  */
 int
 sbappendaddr(sb, asa, m0, control)
-	register struct sockbuf *sb;
+	struct sockbuf *sb;
 	struct sockaddr *asa;
 	struct mbuf *m0, *control;
 {
-	register struct mbuf *m, *n;
+	struct mbuf *m, *n;
 	int space = asa->sa_len;
 
 	if (m0 && (m0->m_flags & M_PKTHDR) == 0)
 		panic("sbappendaddr");
 	if (m0)
 		space += m0->m_pkthdr.len;
-	for (n = control; n; n = n->m_next) {
-		space += n->m_len;
-		if (n->m_next == 0)	/* keep pointer to last control buf */
-			break;
-	}
+	space += m_length(control, &n);
 	if (space > sbspace(sb))
 		return (0);
 	if (asa->sa_len > MLEN)
@@ -657,19 +653,12 @@
 	struct sockbuf *sb;
 	struct mbuf *control, *m0;
 {
-	register struct mbuf *m, *n;
-	int space = 0;
+	struct mbuf *m, *n;
+	int space;
 
 	if (control == 0)
 		panic("sbappendcontrol");
-	for (m = control; ; m = m->m_next) {
-		space += m->m_len;
-		if (m->m_next == 0)
-			break;
-	}
-	n = m;			/* save pointer to last control buffer */
-	for (m = m0; m; m = m->m_next)
-		space += m->m_len;
+	space = m_length(control, &n) + m_length(m0, NULL);
 	if (space > sbspace(sb))
 		return (0);
 	n->m_next = m0;			/* concatenate data to control */
Index: net/bpf.c
===================================================================
RCS file: /home/ncvs/src/sys/net/bpf.c,v
retrieving revision 1.94
diff -u -r1.94 bpf.c
--- net/bpf.c	31 Jul 2002 16:11:32 -0000	1.94
+++ net/bpf.c	18 Sep 2002 14:18:31 -0000
@@ -1123,11 +1123,8 @@
 	struct bpf_if *bp = ifp->if_bpf;
 	struct bpf_d *d;
 	u_int pktlen, slen;
-	struct mbuf *m0;
 
-	pktlen = 0;
-	for (m0 = m; m0 != 0; m0 = m0->m_next)
-		pktlen += m0->m_len;
+	pktlen = m_length(m, NULL);
 
 	BPFIF_LOCK(bp);
 	for (d = bp->bif_dlist; d != 0; d = d->bd_next) {
Index: net/if_ppp.c
===================================================================
RCS file: /home/ncvs/src/sys/net/if_ppp.c,v
retrieving revision 1.83
diff -u -r1.83 if_ppp.c
--- net/if_ppp.c	19 Aug 2002 19:22:41 -0000	1.83
+++ net/if_ppp.c	18 Sep 2002 14:19:07 -0000
@@ -758,7 +758,6 @@
 	struct ifqueue *ifq;
 	enum NPmode mode;
 	int len;
-	struct mbuf *m;
 
 #ifdef MAC
 	error = mac_check_ifnet_transmit(ifp, m0);
@@ -851,9 +850,7 @@
 	*cp++ = protocol & 0xff;
 	m0->m_len += PPP_HDRLEN;
 
-	len = 0;
-	for (m = m0; m != 0; m = m->m_next)
-		len += m->m_len;
+	len = m_length(m0, NULL);
 
 	if (sc->sc_flags & SC_LOG_OUTPKT) {
 		printf("ppp%d output: ", ifp->if_unit);
@@ -1087,9 +1084,7 @@
 	struct mbuf *mcomp = NULL;
 	int slen, clen;
 
-	slen = 0;
-	for (mp = m; mp != NULL; mp = mp->m_next)
-		slen += mp->m_len;
+	slen = m_length(m, NULL);
 	clen = (*sc->sc_xcomp->compress)
 	    (sc->sc_xc_state, &mcomp, m, slen, sc->sc_if.if_mtu + PPP_HDRLEN);
 	if (mcomp != NULL) {
@@ -1324,9 +1319,7 @@
 	sc->sc_stats.ppp_ipackets++;
 
 	if (sc->sc_flags & SC_LOG_INPKT) {
-		ilen = 0;
-		for (mp = m; mp != NULL; mp = mp->m_next)
-			ilen += mp->m_len;
+		ilen = m_length(m, NULL);
 		printf("ppp%d: got %d bytes\n", ifp->if_unit, ilen);
 		pppdumpm(m);
 	}
@@ -1389,9 +1382,7 @@
 	}
 #endif
 
-	ilen = 0;
-	for (mp = m; mp != NULL; mp = mp->m_next)
-		ilen += mp->m_len;
+	ilen = m_length(m, NULL);
 
 #ifdef VJC
 	if (sc->sc_flags & SC_VJ_RESET) {
Index: netinet/ip_input.c
===================================================================
RCS file: /home/ncvs/src/sys/netinet/ip_input.c,v
retrieving revision 1.208
diff -u -r1.208 ip_input.c
--- netinet/ip_input.c	17 Sep 2002 11:20:02 -0000	1.208
+++ netinet/ip_input.c	18 Sep 2002 13:47:10 -0000
@@ -1071,12 +1071,8 @@
 	m->m_len += (IP_VHL_HL(ip->ip_vhl) << 2);
 	m->m_data -= (IP_VHL_HL(ip->ip_vhl) << 2);
 	/* some debugging cruft by sklower, below, will go away soon */
-	if (m->m_flags & M_PKTHDR) { /* XXX this should be done elsewhere */
-		register int plen = 0;
-		for (t = m; t; t = t->m_next)
-			plen += t->m_len;
-		m->m_pkthdr.len = plen;
-	}
+	if (m->m_flags & M_PKTHDR)	/* XXX this should be done elsewhere */
+		m_fixhdr(m);
 	return (m);
 
 dropfrag:
Index: netns/idp_usrreq.c
===================================================================
RCS file: /home/ncvs/src/sys/netns/idp_usrreq.c,v
retrieving revision 1.12
diff -u -r1.12 idp_usrreq.c
--- netns/idp_usrreq.c	31 May 2002 11:52:34 -0000	1.12
+++ netns/idp_usrreq.c	18 Sep 2002 14:14:06 -0000
@@ -144,18 +144,12 @@
 	register struct mbuf *m;
 	register struct idp *idp;
 	register struct socket *so;
-	register int len = 0;
+	register int len;
 	register struct route *ro;
 	struct mbuf *mprev;
 	extern int idpcksum;
 
-	/*
-	 * Calculate data length.
-	 */
-	for (m = m0; m; m = m->m_next) {
-		mprev = m;
-		len += m->m_len;
-	}
+	len = m_length(m0, &mprev);
 	/*
 	 * Make sure packet is actually of even length.
 	 */
Index: netns/spp_usrreq.c
===================================================================
RCS file: /home/ncvs/src/sys/netns/spp_usrreq.c,v
retrieving revision 1.16
diff -u -r1.16 spp_usrreq.c
--- netns/spp_usrreq.c	25 Aug 2002 13:17:35 -0000	1.16
+++ netns/spp_usrreq.c	18 Sep 2002 14:13:27 -0000
@@ -687,8 +687,7 @@
 		firstbad = m;
 		/*for (;;) {*/
 			/* calculate length */
-			for (m0 = m, len = 0; m ; m = m->m_next)
-				len += m->m_len;
+			len = m_length(m, NULL);
 			if (len > cb->s_mtu) {
 			}
 			/* FINISH THIS
Index: netsmb/smb_rq.c
===================================================================
RCS file: /home/ncvs/src/sys/netsmb/smb_rq.c,v
retrieving revision 1.7
diff -u -r1.7 smb_rq.c
--- netsmb/smb_rq.c	16 Sep 2002 09:51:58 -0000	1.7
+++ netsmb/smb_rq.c	18 Sep 2002 14:12:50 -0000
@@ -421,9 +421,7 @@
 	m0 = m_split(mtop, offset, M_TRYWAIT);
 	if (m0 == NULL)
 		return EBADRPC;
-	for(len = 0, m = m0; m->m_next; m = m->m_next)
-		len += m->m_len;
-	len += m->m_len;
+	len = m_length(m0, &m);
 	m->m_len -= len - count;
 	if (mdp->md_top == NULL) {
 		md_initm(mdp, m0);
Index: nfsclient/nfs_socket.c
===================================================================
RCS file: /home/ncvs/src/sys/nfsclient/nfs_socket.c,v
retrieving revision 1.86
diff -u -r1.86 nfs_socket.c
--- nfsclient/nfs_socket.c	8 Sep 2002 15:11:18 -0000	1.86
+++ nfsclient/nfs_socket.c	18 Sep 2002 14:19:42 -0000
@@ -869,13 +869,7 @@
 	rep->r_vp = vp;
 	rep->r_td = td;
 	rep->r_procnum = procnum;
-	i = 0;
-	m = mrest;
-	while (m) {
-		i += m->m_len;
-		m = m->m_next;
-	}
-	mrest_len = i;
+	mrest_len = i = m_length(mrest, NULL);
 
 	/*
 	 * Get the RPC header with authorization.
Index: nfsserver/nfs_syscalls.c
===================================================================
RCS file: /home/ncvs/src/sys/nfsserver/nfs_syscalls.c,v
retrieving revision 1.80
diff -u -r1.80 nfs_syscalls.c
--- nfsserver/nfs_syscalls.c	24 Jul 2002 23:10:34 -0000	1.80
+++ nfsserver/nfs_syscalls.c	18 Sep 2002 14:11:36 -0000
@@ -451,12 +451,7 @@
 		nfsrv_updatecache(nd, TRUE, mreq);
 		nd->nd_mrep = NULL;
 	    case RC_REPLY:
-		m = mreq;
-		siz = 0;
-		while (m) {
-			siz += m->m_len;
-			m = m->m_next;
-		}
+		siz = m_length(mreq, NULL);
 		if (siz <= 0 || siz > NFS_MAXPACKET) {
 			printf("mbuf siz=%d\n",siz);
 			panic("Bad nfs svc reply");
-- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Sep 18 8:27: 5 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 355BC37B401; Wed, 18 Sep 2002 08:27:04 -0700 (PDT) Received: from tesla.distributel.net (nat.MTL.distributel.NET [66.38.181.24]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8BA1843E65; Wed, 18 Sep 2002 08:27:03 -0700 (PDT) (envelope-from bmilekic@unixdaemons.com) Received: (from bmilekic@localhost) by tesla.distributel.net (8.11.6/8.11.6) id g8IFSEg33907; Wed, 18 Sep 2002 11:28:14 -0400 (EDT) (envelope-from bmilekic@unixdaemons.com) Date: Wed, 18 Sep 2002 11:28:14 -0400 From: Bosko Milekic To: Ian Dowse Cc: Poul-Henning Kamp , arch@freebsd.org Subject: Re: What is "LIBMCHAIN" and why is it in the tree ? 
Message-ID: <20020918112814.B33836@unixdaemons.com> References: <1969.1032356161@critter.freebsd.dk> <200209181533.aa71116@salmon.maths.tcd.ie> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <200209181533.aa71116@salmon.maths.tcd.ie>; from iedowse@maths.tcd.ie on Wed, Sep 18, 2002 at 03:33:32PM +0100 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Wed, Sep 18, 2002 at 03:33:32PM +0100, Ian Dowse wrote: > In message <1969.1032356161@critter.freebsd.dk>, Poul-Henning Kamp writes: > > What is "LIBMCHAIN" and why is it in the tree ? > > As the cvs history will tell you, it is a set of routines for > building and parsing mbuf chains. It is useful for processing > requests and replies in vaguely RPC-like protocols, and it's in the > tree because nwfs and smbfs use it (I think it was made an optional > component to avoid the small extra bloat in kernels that don't use > these). The NFS code could probably also benefit from using it. > > Ian libmchain was developed by Boris mainly for his smb et al. code. It is pretty useful but could use some eventual optimisation and, more importantly, a larger audience. Please don't axe this, thanks. 
-- Bosko Milekic * bmilekic@unixdaemons.com * bmilekic@FreeBSD.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Sep 18 8:30: 3 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id CBD0E37B401 for ; Wed, 18 Sep 2002 08:30:01 -0700 (PDT) Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id E880643E3B for ; Wed, 18 Sep 2002 08:30:00 -0700 (PDT) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.6/8.12.6) with ESMTP id g8IFTwtF003459; Wed, 18 Sep 2002 17:29:59 +0200 (CEST) (envelope-from phk@critter.freebsd.dk) To: Bosko Milekic Cc: Ian Dowse , arch@freebsd.org Subject: Re: What is "LIBMCHAIN" and why is it in the tree ? In-Reply-To: Your message of "Wed, 18 Sep 2002 11:28:14 EDT." <20020918112814.B33836@unixdaemons.com> Date: Wed, 18 Sep 2002 17:29:58 +0200 Message-ID: <3458.1032362998@critter.freebsd.dk> From: Poul-Henning Kamp Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG In message <20020918112814.B33836@unixdaemons.com>, Bosko Milekic writes: > > >On Wed, Sep 18, 2002 at 03:33:32PM +0100, Ian Dowse wrote: >> In message <1969.1032356161@critter.freebsd.dk>, Poul-Henning Kamp writes: >> > What is "LIBMCHAIN" and why is it in the tree ? >> >> As the cvs history will tell you, it is a set of routines for >> building and parsing mbuf chains. It is useful for processing >> requests and replies in vaguely RPC-like protocols, and it's in the >> tree because nwfs and smbfs use it (I think it was made an optional >> component to avoid the small extra bloat in kernels that don't use >> these). 
The NFS code could probably also benefit from using it. >> >> Ian > > libmchain was developed by Boris mainly for his smb et al. code. It > is pretty useful but could use some eventual optimisation and, more > importantly, a larger audience. > > Please don't axe this, thanks. Nobody's talking about axing, I was just surprised to see that the only reference to the LIBMCHAIN which enables the file was from NOTES. If it is used by nwfs and smbfs, shouldn't files contain these lines ? kern/subr_mchain.c optional smbfs kern/subr_mchain.c optional nwfs -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Sep 18 10:20:10 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C284D37B404; Wed, 18 Sep 2002 10:20:09 -0700 (PDT) Received: from sccrmhc02.attbi.com (sccrmhc02.attbi.com [204.127.202.62]) by mx1.FreeBSD.org (Postfix) with ESMTP id 23F8543E4A; Wed, 18 Sep 2002 10:20:09 -0700 (PDT) (envelope-from julian@elischer.org) Received: from InterJet.elischer.org ([12.232.206.8]) by sccrmhc02.attbi.com (InterMail vM.4.01.03.27 201-229-121-127-20010626) with ESMTP id <20020918172008.CHCN7124.sccrmhc02.attbi.com@InterJet.elischer.org>; Wed, 18 Sep 2002 17:20:08 +0000 Received: from localhost (localhost.elischer.org [127.0.0.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id KAA07608; Wed, 18 Sep 2002 10:18:29 -0700 (PDT) Date: Wed, 18 Sep 2002 10:18:28 -0700 (PDT) From: Julian Elischer To: Poul-Henning Kamp Cc: arch@freebsd.org Subject: Re: Trivial mbuf patch for review. 
In-Reply-To: <3185.1032361630@critter.freebsd.dk>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

On Wed, 18 Sep 2002, Poul-Henning Kamp wrote:

a fair idea..
I think m_length could be a macro or an inline....
It's hardly worth a function call..

> -       for (n = control; n; n = n->m_next) {
> -               space += n->m_len;
> -               if (n->m_next == 0)     /* keep pointer to last control buf */
> -                       break;
> -       }
> -       len = 0;
> -       for (m = m0; m != 0; m = m->m_next)
> -               len += m->m_len;
> +       len = m_length(m0, NULL);

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message

From owner-freebsd-arch Wed Sep 18 10:21:52 2002
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125])
        by hub.freebsd.org (Postfix) with ESMTP
        id 9931737B401; Wed, 18 Sep 2002 10:21:45 -0700 (PDT)
Received: from tesla.distributel.net (nat.MTL.distributel.NET [66.38.181.24])
        by mx1.FreeBSD.org (Postfix) with ESMTP
        id D500A43E6E; Wed, 18 Sep 2002 10:21:44 -0700 (PDT)
        (envelope-from bmilekic@unixdaemons.com)
Received: (from bmilekic@localhost)
        by tesla.distributel.net (8.11.6/8.11.6) id g8IHN0734107;
        Wed, 18 Sep 2002 13:23:00 -0400 (EDT)
        (envelope-from bmilekic@unixdaemons.com)
Date: Wed, 18 Sep 2002 13:23:00 -0400
From: Bosko Milekic
To: Poul-Henning Kamp
Cc: arch@freebsd.org
Subject: Re: Trivial mbuf patch for review.
Message-ID: <20020918132300.A34069@unixdaemons.com> References: <3185.1032361630@critter.freebsd.dk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <3185.1032361630@critter.freebsd.dk>; from phk@freebsd.org on Wed, Sep 18, 2002 at 05:07:10PM +0200 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Wed, Sep 18, 2002 at 05:07:10PM +0200, Poul-Henning Kamp wrote: > > This patch is a no-op which replaces local mbuf-chain counting > loops with calls to m_length() and in one case m_fixhdr(). > > Index: kern/uipc_socket2.c > =================================================================== > RCS file: /home/ncvs/src/sys/kern/uipc_socket2.c,v > retrieving revision 1.103 > diff -u -r1.103 uipc_socket2.c > --- kern/uipc_socket2.c 16 Aug 2002 18:41:48 -0000 1.103 > +++ kern/uipc_socket2.c 18 Sep 2002 14:08:34 -0000 > @@ -498,11 +498,11 @@ > #ifdef SOCKBUF_DEBUG > void > sbcheck(sb) > - register struct sockbuf *sb; > + struct sockbuf *sb; > { > - register struct mbuf *m; > - register struct mbuf *n = 0; > - register u_long len = 0, mbcnt = 0; > + struct mbuf *m; > + struct mbuf *n = 0; > + u_long len = 0, mbcnt = 0; > > for (m = sb->sb_mb; m; m = n) { > n = m->m_nextpkt; > @@ -610,22 +610,18 @@ > */ > int > sbappendaddr(sb, asa, m0, control) > - register struct sockbuf *sb; > + struct sockbuf *sb; > struct sockaddr *asa; > struct mbuf *m0, *control; > { > - register struct mbuf *m, *n; > + struct mbuf *m, *n; > int space = asa->sa_len; > > if (m0 && (m0->m_flags & M_PKTHDR) == 0) > panic("sbappendaddr"); > if (m0) > space += m0->m_pkthdr.len; > - for (n = control; n; n = n->m_next) { > - space += n->m_len; > - if (n->m_next == 0) /* keep pointer to last control buf */ > - break; > - } > + space += m_length(control, &n); > if (space > sbspace(sb)) > return (0); > if 
(asa->sa_len > MLEN) > @@ -657,19 +653,12 @@ > struct sockbuf *sb; > struct mbuf *control, *m0; > { > - register struct mbuf *m, *n; > - int space = 0; > + struct mbuf *m, *n; > + int space; > > if (control == 0) > panic("sbappendcontrol"); > - for (m = control; ; m = m->m_next) { > - space += m->m_len; > - if (m->m_next == 0) > - break; > - } > - n = m; /* save pointer to last control buffer */ > - for (m = m0; m; m = m->m_next) > - space += m->m_len; > + space = m_length(control, &n) + m_length(m0, NULL); > if (space > sbspace(sb)) > return (0); > n->m_next = m0; /* concatenate data to control */ Looks like there is a problem here, as you removed the 'n = m' initialization. > Index: net/bpf.c > =================================================================== > RCS file: /home/ncvs/src/sys/net/bpf.c,v > retrieving revision 1.94 > diff -u -r1.94 bpf.c > --- net/bpf.c 31 Jul 2002 16:11:32 -0000 1.94 > +++ net/bpf.c 18 Sep 2002 14:18:31 -0000 > @@ -1123,11 +1123,8 @@ > struct bpf_if *bp = ifp->if_bpf; > struct bpf_d *d; > u_int pktlen, slen; > - struct mbuf *m0; > > - pktlen = 0; > - for (m0 = m; m0 != 0; m0 = m0->m_next) > - pktlen += m0->m_len; > + pktlen = m_length(m, NULL); > > BPFIF_LOCK(bp); > for (d = bp->bif_dlist; d != 0; d = d->bd_next) { > Index: net/if_ppp.c > =================================================================== > RCS file: /home/ncvs/src/sys/net/if_ppp.c,v > retrieving revision 1.83 > diff -u -r1.83 if_ppp.c > --- net/if_ppp.c 19 Aug 2002 19:22:41 -0000 1.83 > +++ net/if_ppp.c 18 Sep 2002 14:19:07 -0000 > @@ -758,7 +758,6 @@ > struct ifqueue *ifq; > enum NPmode mode; > int len; > - struct mbuf *m; > > #ifdef MAC > error = mac_check_ifnet_transmit(ifp, m0); > @@ -851,9 +850,7 @@ > *cp++ = protocol & 0xff; > m0->m_len += PPP_HDRLEN; > > - len = 0; > - for (m = m0; m != 0; m = m->m_next) > - len += m->m_len; > + len = m_length(m0, NULL); > > if (sc->sc_flags & SC_LOG_OUTPKT) { > printf("ppp%d output: ", ifp->if_unit); > @@ -1087,9 +1084,7 
@@ > struct mbuf *mcomp = NULL; > int slen, clen; > > - slen = 0; > - for (mp = m; mp != NULL; mp = mp->m_next) > - slen += mp->m_len; > + slen = m_length(m, NULL); > clen = (*sc->sc_xcomp->compress) > (sc->sc_xc_state, &mcomp, m, slen, sc->sc_if.if_mtu + PPP_HDRLEN); > if (mcomp != NULL) { > @@ -1324,9 +1319,7 @@ > sc->sc_stats.ppp_ipackets++; > > if (sc->sc_flags & SC_LOG_INPKT) { > - ilen = 0; > - for (mp = m; mp != NULL; mp = mp->m_next) > - ilen += mp->m_len; > + ilen = m_length(m, NULL); > printf("ppp%d: got %d bytes\n", ifp->if_unit, ilen); > pppdumpm(m); > } > @@ -1389,9 +1382,7 @@ > } > #endif > > - ilen = 0; > - for (mp = m; mp != NULL; mp = mp->m_next) > - ilen += mp->m_len; > + ilen = m_length(m, NULL); > > #ifdef VJC > if (sc->sc_flags & SC_VJ_RESET) { > Index: netinet/ip_input.c > =================================================================== > RCS file: /home/ncvs/src/sys/netinet/ip_input.c,v > retrieving revision 1.208 > diff -u -r1.208 ip_input.c > --- netinet/ip_input.c 17 Sep 2002 11:20:02 -0000 1.208 > +++ netinet/ip_input.c 18 Sep 2002 13:47:10 -0000 > @@ -1071,12 +1071,8 @@ > m->m_len += (IP_VHL_HL(ip->ip_vhl) << 2); > m->m_data -= (IP_VHL_HL(ip->ip_vhl) << 2); > /* some debugging cruft by sklower, below, will go away soon */ > - if (m->m_flags & M_PKTHDR) { /* XXX this should be done elsewhere */ > - register int plen = 0; > - for (t = m; t; t = t->m_next) > - plen += t->m_len; > - m->m_pkthdr.len = plen; > - } > + if (m->m_flags & M_PKTHDR) /* XXX this should be done elsewhere */ > + m_fixhdr(m); > return (m); > > dropfrag: > Index: netns/idp_usrreq.c > =================================================================== > RCS file: /home/ncvs/src/sys/netns/idp_usrreq.c,v > retrieving revision 1.12 > diff -u -r1.12 idp_usrreq.c > --- netns/idp_usrreq.c 31 May 2002 11:52:34 -0000 1.12 > +++ netns/idp_usrreq.c 18 Sep 2002 14:14:06 -0000 > @@ -144,18 +144,12 @@ > register struct mbuf *m; > register struct idp *idp; > register struct socket 
*so; > - register int len = 0; > + register int len; > register struct route *ro; > struct mbuf *mprev; > extern int idpcksum; > > - /* > - * Calculate data length. > - */ > - for (m = m0; m; m = m->m_next) { > - mprev = m; > - len += m->m_len; > - } > + len = m_length(m0, &mprev); > /* > * Make sure packet is actually of even length. > */ > Index: netns/spp_usrreq.c > =================================================================== > RCS file: /home/ncvs/src/sys/netns/spp_usrreq.c,v > retrieving revision 1.16 > diff -u -r1.16 spp_usrreq.c > --- netns/spp_usrreq.c 25 Aug 2002 13:17:35 -0000 1.16 > +++ netns/spp_usrreq.c 18 Sep 2002 14:13:27 -0000 > @@ -687,8 +687,7 @@ > firstbad = m; > /*for (;;) {*/ > /* calculate length */ > - for (m0 = m, len = 0; m ; m = m->m_next) > - len += m->m_len; > + len = m_length(m); > if (len > cb->s_mtu) { > } > /* FINISH THIS > Index: netsmb/smb_rq.c > =================================================================== > RCS file: /home/ncvs/src/sys/netsmb/smb_rq.c,v > retrieving revision 1.7 > diff -u -r1.7 smb_rq.c > --- netsmb/smb_rq.c 16 Sep 2002 09:51:58 -0000 1.7 > +++ netsmb/smb_rq.c 18 Sep 2002 14:12:50 -0000 > @@ -421,9 +421,7 @@ > m0 = m_split(mtop, offset, M_TRYWAIT); > if (m0 == NULL) > return EBADRPC; > - for(len = 0, m = m0; m->m_next; m = m->m_next) > - len += m->m_len; > - len += m->m_len; > + len = m_length(m0, &m); > m->m_len -= len - count; > if (mdp->md_top == NULL) { > md_initm(mdp, m0); > Index: nfsclient/nfs_socket.c > =================================================================== > RCS file: /home/ncvs/src/sys/nfsclient/nfs_socket.c,v > retrieving revision 1.86 > diff -u -r1.86 nfs_socket.c > --- nfsclient/nfs_socket.c 8 Sep 2002 15:11:18 -0000 1.86 > +++ nfsclient/nfs_socket.c 18 Sep 2002 14:19:42 -0000 > @@ -869,13 +869,7 @@ > rep->r_vp = vp; > rep->r_td = td; > rep->r_procnum = procnum; > - i = 0; > - m = mrest; > - while (m) { > - i += m->m_len; > - m = m->m_next; > - } > - mrest_len = i; > + 
mrest_len = i = m_length(mrest, NULL); > > /* > * Get the RPC header with authorization. > Index: nfsserver/nfs_syscalls.c > =================================================================== > RCS file: /home/ncvs/src/sys/nfsserver/nfs_syscalls.c,v > retrieving revision 1.80 > diff -u -r1.80 nfs_syscalls.c > --- nfsserver/nfs_syscalls.c 24 Jul 2002 23:10:34 -0000 1.80 > +++ nfsserver/nfs_syscalls.c 18 Sep 2002 14:11:36 -0000 > @@ -451,12 +451,7 @@ > nfsrv_updatecache(nd, TRUE, mreq); > nd->nd_mrep = NULL; > case RC_REPLY: > - m = mreq; > - siz = 0; > - while (m) { > - siz += m->m_len; > - m = m->m_next; > - } > + siz = m_length(mreq, NULL); > if (siz <= 0 || siz > NFS_MAXPACKET) { > printf("mbuf siz=%d\n",siz); > panic("Bad nfs svc reply"); > -- > Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 > phk@FreeBSD.ORG | TCP/IP since RFC 956 > FreeBSD committer | BSD since 4.3-tahoe > Never attribute to malice what can adequately be explained by incompetence. The rest looks OK, from a very quick glance. By the way, it's good to see that you're doing some cleanups in this code (re: m_length() implementation and m_fixhdr() movements). Thank you. 
Regards, -- Bosko Milekic * bmilekic@unixdaemons.com * bmilekic@FreeBSD.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Sep 18 10:30:46 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4CFE337B401 for ; Wed, 18 Sep 2002 10:30:45 -0700 (PDT) Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id 48E8143E6A for ; Wed, 18 Sep 2002 10:30:44 -0700 (PDT) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.6/8.12.6) with ESMTP id g8IHUgtF004749; Wed, 18 Sep 2002 19:30:42 +0200 (CEST) (envelope-from phk@critter.freebsd.dk) To: Bosko Milekic Cc: arch@freebsd.org Subject: Re: Trivial mbuf patch for review. In-Reply-To: Your message of "Wed, 18 Sep 2002 13:23:00 EDT." <20020918132300.A34069@unixdaemons.com> Date: Wed, 18 Sep 2002 19:30:42 +0200 Message-ID: <4748.1032370242@critter.freebsd.dk> From: Poul-Henning Kamp Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG In message <20020918132300.A34069@unixdaemons.com>, Bosko Milekic writes: >> panic("sbappendcontrol"); >> - for (m = control; ; m = m->m_next) { >> - space += m->m_len; >> - if (m->m_next == 0) >> - break; >> - } >> - n = m; /* save pointer to last control buffer */ >> - for (m = m0; m; m = m->m_next) >> - space += m->m_len; >> + space = m_length(control, &n) + m_length(m0, NULL); >> if (space > sbspace(sb)) >> return (0); >> n->m_next = m0; /* concatenate data to control */ > > Looks like there is a problem here, as you removed the 'n = m' > initialization. 
m_length() returns the pointer to the last mbuf in the chain through its
second argument, so n should have the same value as before.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message

From owner-freebsd-arch Wed Sep 18 11:26:57 2002
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125])
        by hub.freebsd.org (Postfix) with ESMTP id A6EFA37B406
        for ; Wed, 18 Sep 2002 11:26:55 -0700 (PDT)
Received: from rootlabs.com (root.org [67.118.192.226])
        by mx1.FreeBSD.org (Postfix) with SMTP id BBA0843E6E
        for ; Wed, 18 Sep 2002 11:26:50 -0700 (PDT)
        (envelope-from nate@rootlabs.com)
Received: (qmail 44643 invoked by uid 1000); 18 Sep 2002 18:26:51 -0000
Date: Wed, 18 Sep 2002 11:26:51 -0700 (PDT)
From: Nate Lawson
To: Poul-Henning Kamp
Cc: arch@freebsd.org
Subject: Re: Trivial mbuf patch for review.
In-Reply-To: <3185.1032361630@critter.freebsd.dk>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

On Wed, 18 Sep 2002, Poul-Henning Kamp wrote:
> This patch is a no-op which replaces local mbuf-chain counting
> loops with calls to m_length() and in one case m_fixhdr().
> > Index: kern/uipc_socket2.c > =================================================================== > RCS file: /home/ncvs/src/sys/kern/uipc_socket2.c,v > retrieving revision 1.103 > diff -u -r1.103 uipc_socket2.c > --- kern/uipc_socket2.c 16 Aug 2002 18:41:48 -0000 1.103 > +++ kern/uipc_socket2.c 18 Sep 2002 14:08:34 -0000 > @@ -498,11 +498,11 @@ > #ifdef SOCKBUF_DEBUG > void > sbcheck(sb) > - register struct sockbuf *sb; > + struct sockbuf *sb; > { > - register struct mbuf *m; > - register struct mbuf *n = 0; > - register u_long len = 0, mbcnt = 0; > + struct mbuf *m; > + struct mbuf *n = 0; > + u_long len = 0, mbcnt = 0; > > for (m = sb->sb_mb; m; m = n) { > n = m->m_nextpkt; Have we agreed to remove "register" from all our code or did you have a specific reason for doing this here? Other places in the same patch you leave register in after changing the line. > Index: nfsclient/nfs_socket.c > =================================================================== > RCS file: /home/ncvs/src/sys/nfsclient/nfs_socket.c,v > retrieving revision 1.86 > diff -u -r1.86 nfs_socket.c > --- nfsclient/nfs_socket.c 8 Sep 2002 15:11:18 -0000 1.86 > +++ nfsclient/nfs_socket.c 18 Sep 2002 14:19:42 -0000 > @@ -869,13 +869,7 @@ > rep->r_vp = vp; > rep->r_td = td; > rep->r_procnum = procnum; > - i = 0; > - m = mrest; > - while (m) { > - i += m->m_len; > - m = m->m_next; > - } > - mrest_len = i; > + mrest_len = i = m_length(mrest, NULL); Is this initialization accepted style? Overall looks good and it's great to remove the redundancy. 
-Nate To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Sep 18 11:31:25 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 36D9C37B401; Wed, 18 Sep 2002 11:31:18 -0700 (PDT) Received: from fledge.watson.org (fledge.watson.org [204.156.12.50]) by mx1.FreeBSD.org (Postfix) with ESMTP id DDCBD43E81; Wed, 18 Sep 2002 11:31:15 -0700 (PDT) (envelope-from arr@watson.org) Received: from fledge.watson.org (localhost [127.0.0.1]) by fledge.watson.org (8.12.4/8.12.4) with ESMTP id g8IIUkOn053073; Wed, 18 Sep 2002 14:30:46 -0400 (EDT) (envelope-from arr@watson.org) Received: from localhost (arr@localhost) by fledge.watson.org (8.12.5/8.12.5/Submit) with SMTP id g8IIUjJQ053070; Wed, 18 Sep 2002 14:30:46 -0400 (EDT) X-Authentication-Warning: fledge.watson.org: arr owned process doing -bs Date: Wed, 18 Sep 2002 14:30:44 -0400 (EDT) From: "Andrew R. Reiter" To: Poul-Henning Kamp Cc: arch@FreeBSD.ORG Subject: Re: Trivial mbuf patch for review. In-Reply-To: <3185.1032361630@critter.freebsd.dk> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG The only comment I have is the same one I mentioned on irc (I guess it was more in the form of a question) -- but for the record... Should m_length() return unsigned? Also, if not, should we fix below in bpf.c where pktlen is unsigned? Cheers, Andrew On Wed, 18 Sep 2002, Poul-Henning Kamp wrote: : :This patch is a no-op which replaces local mbuf-chain counting :loops with calls to m_length() and in one case m_fixhdr(). 
: :Index: kern/uipc_socket2.c :=================================================================== :RCS file: /home/ncvs/src/sys/kern/uipc_socket2.c,v :retrieving revision 1.103 :diff -u -r1.103 uipc_socket2.c :--- kern/uipc_socket2.c 16 Aug 2002 18:41:48 -0000 1.103 :+++ kern/uipc_socket2.c 18 Sep 2002 14:08:34 -0000 :@@ -498,11 +498,11 @@ : #ifdef SOCKBUF_DEBUG : void : sbcheck(sb) :- register struct sockbuf *sb; :+ struct sockbuf *sb; : { :- register struct mbuf *m; :- register struct mbuf *n = 0; :- register u_long len = 0, mbcnt = 0; :+ struct mbuf *m; :+ struct mbuf *n = 0; :+ u_long len = 0, mbcnt = 0; : : for (m = sb->sb_mb; m; m = n) { : n = m->m_nextpkt; :@@ -610,22 +610,18 @@ : */ : int : sbappendaddr(sb, asa, m0, control) :- register struct sockbuf *sb; :+ struct sockbuf *sb; : struct sockaddr *asa; : struct mbuf *m0, *control; : { :- register struct mbuf *m, *n; :+ struct mbuf *m, *n; : int space = asa->sa_len; : : if (m0 && (m0->m_flags & M_PKTHDR) == 0) : panic("sbappendaddr"); : if (m0) : space += m0->m_pkthdr.len; :- for (n = control; n; n = n->m_next) { :- space += n->m_len; :- if (n->m_next == 0) /* keep pointer to last control buf */ :- break; :- } :+ space += m_length(control, &n); : if (space > sbspace(sb)) : return (0); : if (asa->sa_len > MLEN) :@@ -657,19 +653,12 @@ : struct sockbuf *sb; : struct mbuf *control, *m0; : { :- register struct mbuf *m, *n; :- int space = 0; :+ struct mbuf *m, *n; :+ int space; : : if (control == 0) : panic("sbappendcontrol"); :- for (m = control; ; m = m->m_next) { :- space += m->m_len; :- if (m->m_next == 0) :- break; :- } :- n = m; /* save pointer to last control buffer */ :- for (m = m0; m; m = m->m_next) :- space += m->m_len; :+ space = m_length(control, &n) + m_length(m0, NULL); : if (space > sbspace(sb)) : return (0); : n->m_next = m0; /* concatenate data to control */ :Index: net/bpf.c :=================================================================== :RCS file: /home/ncvs/src/sys/net/bpf.c,v 
:retrieving revision 1.94 :diff -u -r1.94 bpf.c :--- net/bpf.c 31 Jul 2002 16:11:32 -0000 1.94 :+++ net/bpf.c 18 Sep 2002 14:18:31 -0000 :@@ -1123,11 +1123,8 @@ : struct bpf_if *bp = ifp->if_bpf; : struct bpf_d *d; : u_int pktlen, slen; :- struct mbuf *m0; : :- pktlen = 0; :- for (m0 = m; m0 != 0; m0 = m0->m_next) :- pktlen += m0->m_len; :+ pktlen = m_length(m, NULL); : : BPFIF_LOCK(bp); : for (d = bp->bif_dlist; d != 0; d = d->bd_next) { :Index: net/if_ppp.c :=================================================================== :RCS file: /home/ncvs/src/sys/net/if_ppp.c,v :retrieving revision 1.83 :diff -u -r1.83 if_ppp.c :--- net/if_ppp.c 19 Aug 2002 19:22:41 -0000 1.83 :+++ net/if_ppp.c 18 Sep 2002 14:19:07 -0000 :@@ -758,7 +758,6 @@ : struct ifqueue *ifq; : enum NPmode mode; : int len; :- struct mbuf *m; : : #ifdef MAC : error = mac_check_ifnet_transmit(ifp, m0); :@@ -851,9 +850,7 @@ : *cp++ = protocol & 0xff; : m0->m_len += PPP_HDRLEN; : :- len = 0; :- for (m = m0; m != 0; m = m->m_next) :- len += m->m_len; :+ len = m_length(m0, NULL); : : if (sc->sc_flags & SC_LOG_OUTPKT) { : printf("ppp%d output: ", ifp->if_unit); :@@ -1087,9 +1084,7 @@ : struct mbuf *mcomp = NULL; : int slen, clen; : :- slen = 0; :- for (mp = m; mp != NULL; mp = mp->m_next) :- slen += mp->m_len; :+ slen = m_length(m, NULL); : clen = (*sc->sc_xcomp->compress) : (sc->sc_xc_state, &mcomp, m, slen, sc->sc_if.if_mtu + PPP_HDRLEN); : if (mcomp != NULL) { :@@ -1324,9 +1319,7 @@ : sc->sc_stats.ppp_ipackets++; : : if (sc->sc_flags & SC_LOG_INPKT) { :- ilen = 0; :- for (mp = m; mp != NULL; mp = mp->m_next) :- ilen += mp->m_len; :+ ilen = m_length(m, NULL); : printf("ppp%d: got %d bytes\n", ifp->if_unit, ilen); : pppdumpm(m); : } :@@ -1389,9 +1382,7 @@ : } : #endif : :- ilen = 0; :- for (mp = m; mp != NULL; mp = mp->m_next) :- ilen += mp->m_len; :+ ilen = m_length(m, NULL); : : #ifdef VJC : if (sc->sc_flags & SC_VJ_RESET) { :Index: netinet/ip_input.c 
:=================================================================== :RCS file: /home/ncvs/src/sys/netinet/ip_input.c,v :retrieving revision 1.208 :diff -u -r1.208 ip_input.c :--- netinet/ip_input.c 17 Sep 2002 11:20:02 -0000 1.208 :+++ netinet/ip_input.c 18 Sep 2002 13:47:10 -0000 :@@ -1071,12 +1071,8 @@ : m->m_len += (IP_VHL_HL(ip->ip_vhl) << 2); : m->m_data -= (IP_VHL_HL(ip->ip_vhl) << 2); : /* some debugging cruft by sklower, below, will go away soon */ :- if (m->m_flags & M_PKTHDR) { /* XXX this should be done elsewhere */ :- register int plen = 0; :- for (t = m; t; t = t->m_next) :- plen += t->m_len; :- m->m_pkthdr.len = plen; :- } :+ if (m->m_flags & M_PKTHDR) /* XXX this should be done elsewhere */ :+ m_fixhdr(m); : return (m); : : dropfrag: :Index: netns/idp_usrreq.c :=================================================================== :RCS file: /home/ncvs/src/sys/netns/idp_usrreq.c,v :retrieving revision 1.12 :diff -u -r1.12 idp_usrreq.c :--- netns/idp_usrreq.c 31 May 2002 11:52:34 -0000 1.12 :+++ netns/idp_usrreq.c 18 Sep 2002 14:14:06 -0000 :@@ -144,18 +144,12 @@ : register struct mbuf *m; : register struct idp *idp; : register struct socket *so; :- register int len = 0; :+ register int len; : register struct route *ro; : struct mbuf *mprev; : extern int idpcksum; : :- /* :- * Calculate data length. :- */ :- for (m = m0; m; m = m->m_next) { :- mprev = m; :- len += m->m_len; :- } :+ len = m_length(m0, &mprev); : /* : * Make sure packet is actually of even length. 
: */ :Index: netns/spp_usrreq.c :=================================================================== :RCS file: /home/ncvs/src/sys/netns/spp_usrreq.c,v :retrieving revision 1.16 :diff -u -r1.16 spp_usrreq.c :--- netns/spp_usrreq.c 25 Aug 2002 13:17:35 -0000 1.16 :+++ netns/spp_usrreq.c 18 Sep 2002 14:13:27 -0000 :@@ -687,8 +687,7 @@ : firstbad = m; : /*for (;;) {*/ : /* calculate length */ :- for (m0 = m, len = 0; m ; m = m->m_next) :- len += m->m_len; :+ len = m_length(m); : if (len > cb->s_mtu) { : } : /* FINISH THIS :Index: netsmb/smb_rq.c :=================================================================== :RCS file: /home/ncvs/src/sys/netsmb/smb_rq.c,v :retrieving revision 1.7 :diff -u -r1.7 smb_rq.c :--- netsmb/smb_rq.c 16 Sep 2002 09:51:58 -0000 1.7 :+++ netsmb/smb_rq.c 18 Sep 2002 14:12:50 -0000 :@@ -421,9 +421,7 @@ : m0 = m_split(mtop, offset, M_TRYWAIT); : if (m0 == NULL) : return EBADRPC; :- for(len = 0, m = m0; m->m_next; m = m->m_next) :- len += m->m_len; :- len += m->m_len; :+ len = m_length(m0, &m); : m->m_len -= len - count; : if (mdp->md_top == NULL) { : md_initm(mdp, m0); :Index: nfsclient/nfs_socket.c :=================================================================== :RCS file: /home/ncvs/src/sys/nfsclient/nfs_socket.c,v :retrieving revision 1.86 :diff -u -r1.86 nfs_socket.c :--- nfsclient/nfs_socket.c 8 Sep 2002 15:11:18 -0000 1.86 :+++ nfsclient/nfs_socket.c 18 Sep 2002 14:19:42 -0000 :@@ -869,13 +869,7 @@ : rep->r_vp = vp; : rep->r_td = td; : rep->r_procnum = procnum; :- i = 0; :- m = mrest; :- while (m) { :- i += m->m_len; :- m = m->m_next; :- } :- mrest_len = i; :+ mrest_len = i = m_length(mrest, NULL); : : /* : * Get the RPC header with authorization. 
:Index: nfsserver/nfs_syscalls.c :=================================================================== :RCS file: /home/ncvs/src/sys/nfsserver/nfs_syscalls.c,v :retrieving revision 1.80 :diff -u -r1.80 nfs_syscalls.c :--- nfsserver/nfs_syscalls.c 24 Jul 2002 23:10:34 -0000 1.80 :+++ nfsserver/nfs_syscalls.c 18 Sep 2002 14:11:36 -0000 :@@ -451,12 +451,7 @@ : nfsrv_updatecache(nd, TRUE, mreq); : nd->nd_mrep = NULL; : case RC_REPLY: :- m = mreq; :- siz = 0; :- while (m) { :- siz += m->m_len; :- m = m->m_next; :- } :+ siz = m_length(mreq, NULL); : if (siz <= 0 || siz > NFS_MAXPACKET) { : printf("mbuf siz=%d\n",siz); : panic("Bad nfs svc reply"); :-- :Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 :phk@FreeBSD.ORG | TCP/IP since RFC 956 :FreeBSD committer | BSD since 4.3-tahoe :Never attribute to malice what can adequately be explained by incompetence. : :To Unsubscribe: send mail to majordomo@FreeBSD.org :with "unsubscribe freebsd-arch" in the body of the message : -- Andrew R. Reiter arr@watson.org arr@FreeBSD.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Sep 18 11:41:48 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 03CB237B401 for ; Wed, 18 Sep 2002 11:41:47 -0700 (PDT) Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id 225B543E42 for ; Wed, 18 Sep 2002 11:41:46 -0700 (PDT) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.6/8.12.6) with ESMTP id g8IIfZtF005807; Wed, 18 Sep 2002 20:41:44 +0200 (CEST) (envelope-from phk@critter.freebsd.dk) To: Nate Lawson Cc: arch@FreeBSD.ORG Subject: Re: Trivial mbuf patch for review. In-Reply-To: Your message of "Wed, 18 Sep 2002 11:26:51 PDT." 
Date: Wed, 18 Sep 2002 20:41:35 +0200
Message-ID: <5806.1032374495@critter.freebsd.dk>
From: Poul-Henning Kamp
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.ORG

In message , Nate Lawson writes:
>On Wed, 18 Sep 2002, Poul-Henning Kamp wrote:
>> This patch is a no-op which replaces local mbuf-chain counting
>> loops with calls to m_length() and in one case m_fixhdr().
>>
>> Index: kern/uipc_socket2.c
>> ===================================================================
>> RCS file: /home/ncvs/src/sys/kern/uipc_socket2.c,v
>> retrieving revision 1.103
>> diff -u -r1.103 uipc_socket2.c
>> --- kern/uipc_socket2.c 16 Aug 2002 18:41:48 -0000 1.103
>> +++ kern/uipc_socket2.c 18 Sep 2002 14:08:34 -0000
>> @@ -498,11 +498,11 @@
>>  #ifdef SOCKBUF_DEBUG
>>  void
>>  sbcheck(sb)
>> -       register struct sockbuf *sb;
>> +       struct sockbuf *sb;
>>  {
>> -       register struct mbuf *m;
>> -       register struct mbuf *n = 0;
>> -       register u_long len = 0, mbcnt = 0;
>> +       struct mbuf *m;
>> +       struct mbuf *n = 0;
>> +       u_long len = 0, mbcnt = 0;
>>
>>         for (m = sb->sb_mb; m; m = n) {
>>                 n = m->m_nextpkt;
>
>Have we agreed to remove "register" from all our code or did you have a
>specific reason for doing this here? Other places in the same patch you
>leave register in after changing the line.

I had to remove it from "n" in order to be able to &n, so I decided
to remove it entirely from the function.

>> +       mrest_len = i = m_length(mrest, NULL);
>
>Is this initialization accepted style?

Well, technically it's an assignment...

Actually, "i" doesn't need the value, I'll fix that.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
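
The reason given above for dropping "register" is a language rule, not only a style preference: C does not permit the unary & operator on an object declared with the register storage class, so a variable that is passed as an out-parameter cannot keep the qualifier. A minimal sketch, assuming two hypothetical helpers (store_max() and max_of()) that are not part of the patch:

```c
/* Hypothetical out-parameter helper: stores the larger of a and b in *out. */
static void
store_max(int a, int b, int *out)
{
	*out = (a > b) ? a : b;
}

/* Caller that needs the address of a local to receive the result. */
static int
max_of(int a, int b)
{
	/*
	 * Declaring this as "register int n;" would make "&n" below a
	 * constraint violation, so the qualifier has to be removed.
	 */
	int n;

	store_max(a, b, &n);
	return (n);
}
```

Modern compilers make their own register allocation decisions, so removing the keyword is not expected to change the generated code in a case like this.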
From owner-freebsd-arch Wed Sep 18 11:43:35 2002
To: Julian Elischer
Cc: arch@freebsd.org
Subject: Re: Trivial mbuf patch for review.
In-Reply-To: Your message of "Wed, 18 Sep 2002 10:18:28 PDT."
Date: Wed, 18 Sep 2002 20:43:27 +0200
Message-ID: <5835.1032374607@critter.freebsd.dk>
From: Poul-Henning Kamp

In message , Julian Elischer writes:
>
>
>On Wed, 18 Sep 2002, Poul-Henning Kamp wrote:
>
>
>a fair idea..
>I think m_length could be a macro or an inline....
>It's hardly worth a function call..

On the other hand, as Bruce would probably put it: Only broken code
which fails to properly keep track of lengths needs to call m_length()
or m_fixhdr() in the first place.

As I said in other email: I don't think there is a performance case to
be made for inline, and macros are just plain ugly.

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
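The point about code that keeps track of lengths can be illustrated with a small user-space sketch. Everything here is hypothetical (`struct pkt_sketch`, `pkt_append()`, `pkt_fix_len()`); it assumes, without confirmation from the thread, that a fix-up routine in the spirit of m_fixhdr() recomputes a recorded packet length from the chain.

```c
#include <stddef.h>

/* Hypothetical segment and packet types, not the kernel's structures. */
struct seg_sketch {
	struct seg_sketch *next;	/* next segment in the packet */
	unsigned int len;		/* bytes held in this segment */
};

struct pkt_sketch {
	struct seg_sketch *head;	/* first segment, NULL when empty */
	struct seg_sketch *tail;	/* last segment, NULL when empty */
	unsigned int total_len;		/* cached sum of all segment lengths */
};

/* Append a segment and keep the cached total in step: no walk needed. */
void
pkt_append(struct pkt_sketch *p, struct seg_sketch *s)
{
	s->next = NULL;
	if (p->tail != NULL)
		p->tail->next = s;
	else
		p->head = s;
	p->tail = s;
	p->total_len += s->len;
}

/*
 * Recompute the cached total by walking the chain.  Only needed when
 * something changed a segment length behind the cache's back.
 */
unsigned int
pkt_fix_len(struct pkt_sketch *p)
{
	struct seg_sketch *s;
	unsigned int len = 0;

	for (s = p->head; s != NULL; s = s->next)
		len += s->len;
	p->total_len = len;
	return (len);
}
```

With `pkt_append()` as the only way to grow a packet, `total_len` is always right and the O(n) walk in `pkt_fix_len()` is never needed, which is one reading of why the cost of the function call was not considered worth optimising.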
From owner-freebsd-arch Wed Sep 18 11:48:25 2002
To: "Andrew R. Reiter"
Cc: arch@FreeBSD.ORG
Subject: Re: Trivial mbuf patch for review.
In-Reply-To: Your message of "Wed, 18 Sep 2002 14:30:44 EDT."
Date: Wed, 18 Sep 2002 20:48:21 +0200
Message-ID: <5974.1032374901@critter.freebsd.dk>
From: Poul-Henning Kamp

In message , "Andrew R. Reiter" writes:
>
>The only comment I have is the same one I mentioned on irc (I guess it was
>more in the form of a question) -- but for the record...
>
>Should m_length() return unsigned? Also, if not, should we fix below in
>bpf.c where pktlen is unsigned?

I've changed it to unsigned with no ill effects.

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
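One practical consequence of an unsigned length shows up in range checks shaped like the `siz <= 0 || siz > NFS_MAXPACKET` test in the quoted nfs_syscalls.c hunk. The sketch below is illustrative only: the function names are hypothetical and `MAX_PACKET_SKETCH` is an assumed placeholder, not the real value of NFS_MAXPACKET.

```c
/* Hypothetical upper bound, standing in for a limit such as NFS_MAXPACKET. */
#define MAX_PACKET_SKETCH 65536

/* Range check for a signed size: both "<= 0" and the upper bound matter. */
int
reply_size_ok_signed(int siz)
{
	return (!(siz <= 0 || siz > MAX_PACKET_SKETCH));
}

/*
 * Range check for an unsigned size: a value below zero cannot exist,
 * so the lower half of the test reduces to "not zero".
 */
int
reply_size_ok_unsigned(unsigned int siz)
{
	return (siz != 0 && siz <= MAX_PACKET_SKETCH);
}
```

If a negative `int` is ever converted to `unsigned int`, it wraps to a very large value, so the unsigned variant still rejects it through the upper-bound test rather than the zero test.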
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Sep 19 9: 0:31 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C79CE37B401 for ; Thu, 19 Sep 2002 09:00:27 -0700 (PDT) Received: from storm.FreeBSD.org.uk (storm.FreeBSD.org.uk [194.242.157.42]) by mx1.FreeBSD.org (Postfix) with ESMTP id EAB7943E42 for ; Thu, 19 Sep 2002 09:00:26 -0700 (PDT) (envelope-from mark@grimreaper.grondar.org) Received: from storm.FreeBSD.org.uk (uucp@localhost [127.0.0.1]) by storm.FreeBSD.org.uk (8.12.5/8.12.5) with ESMTP id g8JG0QPb048566 for ; Thu, 19 Sep 2002 17:00:26 +0100 (BST) (envelope-from mark@grimreaper.grondar.org) Received: (from uucp@localhost) by storm.FreeBSD.org.uk (8.12.5/8.12.5/Submit) with UUCP id g8JG0PeY048565 for arch@freebsd.org; Thu, 19 Sep 2002 17:00:25 +0100 (BST) Received: from grimreaper.grondar.org (localhost [127.0.0.1]) by grimreaper.grondar.org (8.12.6/8.12.5) with ESMTP id g8JFx27s001093 for ; Thu, 19 Sep 2002 16:59:02 +0100 (BST) (envelope-from mark@grimreaper.grondar.org) Message-Id: <200209191559.g8JFx27s001093@grimreaper.grondar.org> To: arch@freebsd.org Subject: More lint work for share/mk/ (review, please) MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="----- =_aaaaaaaaaa0" Content-ID: <1090.1032451109.0@grimreaper.grondar.org> Date: Thu, 19 Sep 2002 16:59:02 +0100 From: Mark Murray Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG ------- =_aaaaaaaaaa0 Content-Type: text/plain; charset="us-ascii" Content-ID: <1090.1032451109.1@grimreaper.grondar.org> Hi all I've been running the attached patches for a while now, and they allow me to complete make worlds with a commercial lint, a ports lint and freebsd "base" lint. 
WANT_LINT= yes To use a commercial lint, you might put this: LINT= /usr/local/bin/flexelint LINTFLAGS= -b -zero /etc/lint/co-freebsd.lnt LINTKERNFLAGS= -b -zero /etc/lint/co-freebsd-kern.lnt LINTOBJFLAGS= -u -library -b -zero /etc/lint/co-freebsd.lnt -oo\(${.TARGET}\) LINTLIBFLAGS= -u -library -b -zero /etc/lint/co-freebsd.lnt -ol\(${.TARGET}\) ... with suitable values in your /etc/make.conf. Some of this is already present; the patch fixes-and-extends. They also allow parts of the tree to be "notlinted". The mechanism for this may look clunky, and I'll listen to folks who feel that it needs to be done differently. There are two knobs; WANT_LINT, which is a make.conf thing, and turns on overall linting. The other is NOLINT, and this is supposed to be put into individual Makefiles to kill copious lint output from code out of our control (like GNU stuff, other contrib/ code and so on). Comments (remembering this is a no-bikeshed zone!) ?? :-) M -- o Mark Murray \_ O.\_ Warning: this .sig is umop ap!sdn ------- =_aaaaaaaaaa0 Content-Type: text/plain; charset="us-ascii" Content-ID: <1090.1032451109.2@grimreaper.grondar.org> Content-Description: diff Index: bsd.lib.mk =================================================================== RCS file: /home/ncvs/src/share/mk/bsd.lib.mk,v retrieving revision 1.137 diff -u -d -r1.137 bsd.lib.mk --- bsd.lib.mk 17 Sep 2002 01:48:53 -0000 1.137 +++ bsd.lib.mk 17 Sep 2002 08:33:51 -0000 @@ -204,7 +204,7 @@ ${RANLIB} ${.TARGET} .endif -.if defined(WANT_LINT) && defined(LIB) && !empty(LIB) +.if defined(WANT_LINT) && !defined(NOLINT) && defined(LIB) && !empty(LIB) LINTLIB= llib-l${LIB}.ln _LIBS+= ${LINTLIB} LINTOBJS+= ${SRCS:M*.c:.c=.ln} @@ -273,7 +273,7 @@ ${INSTALL} -o ${LIBOWN} -g ${LIBGRP} -m ${LIBMODE} \ ${_INSTALLFLAGS} lib${LIB}_pic.a ${DESTDIR}${LIBDIR} .endif -.if defined(WANT_LINT) && defined(LIB) && !empty(LIB) +.if defined(WANT_LINT) && !defined(NOLINT) && defined(LIB) && !empty(LIB) ${INSTALL} -o ${LIBOWN} -g ${LIBGRP} -m 
${LIBMODE} \ ${_INSTALLFLAGS} ${LINTLIB} ${DESTDIR}${LINTLIBDIR} .endif @@ -292,7 +292,7 @@ .if !target(lint) lint: ${SRCS:M*.c} - ${LINT} ${LINTOBJFLAGS} ${CFLAGS:M-[DIU]*} ${.ALLSRC} + ${LINT} ${LINTFLAGS} ${CFLAGS:M-[DIU]*} ${.ALLSRC} .endif .if !defined(NOMAN) Index: bsd.prog.mk =================================================================== RCS file: /home/ncvs/src/share/mk/bsd.prog.mk,v retrieving revision 1.127 diff -u -d -r1.127 bsd.prog.mk --- bsd.prog.mk 17 Sep 2002 01:48:53 -0000 1.127 +++ bsd.prog.mk 17 Sep 2002 08:33:51 -0000 @@ -3,7 +3,7 @@ .include -.SUFFIXES: .out .o .c .cc .cpp .cxx .C .m .y .l .s .S .asm +.SUFFIXES: .out .o .c .cc .cpp .cxx .C .m .y .l .ln .s .S .asm CFLAGS+=${COPTS} ${DEBUG_FLAGS} @@ -157,9 +157,9 @@ .endif .if !target(lint) -lint: ${SRCS} +lint: ${SRCS:M*.c} .if defined(PROG) - ${LINT} ${LINTFLAGS} ${CFLAGS:M-[DIU]*} ${.ALLSRC} | more 2>&1 + ${LINT} ${LINTFLAGS} ${CFLAGS:M-[DIU]*} ${.ALLSRC} .endif .endif Index: bsd.sys.mk =================================================================== RCS file: /home/ncvs/src/share/mk/bsd.sys.mk,v retrieving revision 1.10 diff -u -d -r1.10 bsd.sys.mk --- bsd.sys.mk 7 Jul 2002 18:47:52 -0000 1.10 +++ bsd.sys.mk 12 Sep 2002 20:54:10 -0000 @@ -29,7 +29,7 @@ . endif # BDECFLAGS . if ${WARNS} > 5 -CFLAGS += -ansi -pedantic -Wbad-function-cast -Wchar-subscripts -Winline -Wnested-externs -Wredundant-decls +CFLAGS += -ansi -pedantic -Wbad-function-cast -Wchar-subscripts -Winline -Wnested-externs -Wredundant-decls -Wno-long-long . endif . 
if ${WARNS} > 1 && ${WARNS} < 5 # XXX Delete -Wuninitialized by default for now -- the compiler doesn't Index: sys.mk =================================================================== RCS file: /home/ncvs/src/share/mk/sys.mk,v retrieving revision 1.63 diff -u -d -r1.63 sys.mk --- sys.mk 17 Sep 2002 01:48:54 -0000 1.63 +++ sys.mk 17 Sep 2002 08:33:52 -0000 @@ -132,6 +132,14 @@ # DOUBLE SUFFIX RULES +.c.ln: + ${LINT} ${LINTOBJFLAGS} ${CFLAGS:M-[DIU]*} ${.IMPSRC} || \ + touch ${.TARGET} + +.cc.ln .C.ln .cpp.ln .cxx.ln: + ${LINT} ${LINTOBJFLAGS} ${CXXFLAGS:M-[DIU]*} ${.IMPSRC} || \ + touch ${.TARGET} + .c.o: ${CC} ${CFLAGS} -c ${.IMPSRC} @@ -175,6 +183,14 @@ .sh: cp -p ${.IMPSRC} ${.TARGET} chmod a+x ${.TARGET} + +.c.ln: + ${LINT} ${LINTOBJFLAGS} ${CFLAGS:M-[DIU]*} ${.IMPSRC} || \ + touch ${.TARGET} + +.cc.ln .C.ln .cpp.ln .cxx.ln: + ${LINT} ${LINTOBJFLAGS} ${CXXFLAGS:M-[DIU]*} ${.IMPSRC} || \ + touch ${.TARGET} .c: ${CC} ${CFLAGS} ${LDFLAGS} ${.IMPSRC} ${LDLIBS} -o ${.TARGET} ------- =_aaaaaaaaaa0-- To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Sep 19 9:30:36 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1C87F37B401 for ; Thu, 19 Sep 2002 09:30:35 -0700 (PDT) Received: from snipe.mail.pas.earthlink.net (snipe.mail.pas.earthlink.net [207.217.120.62]) by mx1.FreeBSD.org (Postfix) with ESMTP id B3A4743E42 for ; Thu, 19 Sep 2002 09:30:34 -0700 (PDT) (envelope-from tlambert2@mindspring.com) Received: from pool0312.cvx22-bradley.dialup.earthlink.net ([209.179.199.57] helo=mindspring.com) by snipe.mail.pas.earthlink.net with esmtp (Exim 3.33 #1) id 17s43a-0003Q6-00; Thu, 19 Sep 2002 09:21:35 -0700 Message-ID: <3D89F94E.88C3EABB@mindspring.com> Date: Thu, 19 Sep 2002 09:20:30 -0700 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 
1.0
To: Mark Murray
Cc: arch@freebsd.org
Subject: Re: More lint work for share/mk/ (review, please)
References: <200209191559.g8JFx27s001093@grimreaper.grondar.org>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Mark Murray wrote:
> I've been running the attached patches for a while now, and they
> allow me to complete make worlds with a commercial lint, a ports
> lint and freebsd "base" lint.

[ ... ]

> Comments (remembering this is a no-bikeshed zone!) ?? :-)

[ Comments on flexelint's poor understanding of compiler semantics,
  including variadic macros, and the value of LINT vs. logic errors,
  and the silly lengths to which one may go to make LINT happy and
  mask problems instead of fixing them, and how code which compiles
  cleanly is not the same thing as code which works ... elided to
  avoid bikeshedding ]

-- Terry

From owner-freebsd-arch Thu Sep 19 13:20:44 2002
Date: Fri, 20 Sep 2002 06:29:10 +1000 (EST)
From: Bruce Evans
X-X-Sender: bde@gamplex.bde.org
To: Mark Murray
Cc: arch@FreeBSD.ORG
Subject: Re: More lint work for share/mk/ (review, please)
In-Reply-To:
<200209191559.g8JFx27s001093@grimreaper.grondar.org> Message-ID: <20020920054221.E2677-100000@gamplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Thu, 19 Sep 2002, Mark Murray wrote: > I've been running the attached patches for a while now, and they > allow me to complete make worlds with a commercial lint, a ports > lint and freebsd "base" lint. > ... > They also allow parts of the tree to be "notlinted". The mechanism > for this may look clunky, and I'll listen to folks who feel that it > needs to be done differently. There are two knobs; WANT_LINT, which > is a make.conf thing, and turns on overall linting. The other is > NOLINT, and this is supposed to be put into individual Makefiles > to kill copious lint output from code out of our control (like > GNU stuff, other contrib/ code and so on). Wouldn't `.undef WANT_LINT' in individual Makefiles be better than another knob? > ... % Index: bsd.prog.mk % =================================================================== % RCS file: /home/ncvs/src/share/mk/bsd.prog.mk,v % retrieving revision 1.127 % diff -u -d -r1.127 bsd.prog.mk % --- bsd.prog.mk 17 Sep 2002 01:48:53 -0000 1.127 % +++ bsd.prog.mk 17 Sep 2002 08:33:51 -0000 % .... % @@ -157,9 +157,9 @@ % .endif % % .if !target(lint) % -lint: ${SRCS} % +lint: ${SRCS:M*.c} % .if defined(PROG) % - ${LINT} ${LINTFLAGS} ${CFLAGS:M-[DIU]*} ${.ALLSRC} | more 2>&1 % + ${LINT} ${LINTFLAGS} ${CFLAGS:M-[DIU]*} ${.ALLSRC} I wonder what this pipeline is for. I once thought that it was to combine stdout with stderr to work around the bug that lint prints some messages on stderr, but it is wrong for that -- the correct way to combine stdin with stderr is "2>&1" before any pipe. 
% .endif % .endif % % Index: bsd.sys.mk % =================================================================== % RCS file: /home/ncvs/src/share/mk/bsd.sys.mk,v % retrieving revision 1.10 % diff -u -d -r1.10 bsd.sys.mk % --- bsd.sys.mk 7 Jul 2002 18:47:52 -0000 1.10 % +++ bsd.sys.mk 12 Sep 2002 20:54:10 -0000 % @@ -29,7 +29,7 @@ % . endif % # BDECFLAGS % . if ${WARNS} > 5 % -CFLAGS += -ansi -pedantic -Wbad-function-cast -Wchar-subscripts -Winline -Wnested-externs -Wredundant-decls % +CFLAGS += -ansi -pedantic -Wbad-function-cast -Wchar-subscripts -Winline -Wnested-externs -Wredundant-decls -Wno-long-long % . endif % . if ${WARNS} > 1 && ${WARNS} < 5 % # XXX Delete -Wuninitialized by default for now -- the compiler doesn't No thanks. This breaks gcc's warning. -ansi means C90 (-std=c89), and this standard doesn't have long long. Of course, programs not written in C90 can't be compiled with this warning level. % Index: sys.mk % =================================================================== % RCS file: /home/ncvs/src/share/mk/sys.mk,v % retrieving revision 1.63 % diff -u -d -r1.63 sys.mk % --- sys.mk 17 Sep 2002 01:48:54 -0000 1.63 % +++ sys.mk 17 Sep 2002 08:33:52 -0000 % @@ -132,6 +132,14 @@ % % # DOUBLE SUFFIX RULES % % +.c.ln: % + ${LINT} ${LINTOBJFLAGS} ${CFLAGS:M-[DIU]*} ${.IMPSRC} || \ % + touch ${.TARGET} % + % +.cc.ln .C.ln .cpp.ln .cxx.ln: % + ${LINT} ${LINTOBJFLAGS} ${CXXFLAGS:M-[DIU]*} ${.IMPSRC} || \ % + touch ${.TARGET} % + % .c.o: % ${CC} ${CFLAGS} -c ${.IMPSRC} % Non-POSIX rules shouldn't be duplicated in the POSIX section. 
Bruce To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Sep 19 14:20:24 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 14FB037B401 for ; Thu, 19 Sep 2002 14:20:23 -0700 (PDT) Received: from storm.FreeBSD.org.uk (storm.FreeBSD.org.uk [194.242.157.42]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2AC2F43E6A for ; Thu, 19 Sep 2002 14:20:22 -0700 (PDT) (envelope-from mark@grimreaper.grondar.org) Received: from storm.FreeBSD.org.uk (uucp@localhost [127.0.0.1]) by storm.FreeBSD.org.uk (8.12.5/8.12.5) with ESMTP id g8JLKKPb051070; Thu, 19 Sep 2002 22:20:20 +0100 (BST) (envelope-from mark@grimreaper.grondar.org) Received: (from uucp@localhost) by storm.FreeBSD.org.uk (8.12.5/8.12.5/Submit) with UUCP id g8JLKKu0051069; Thu, 19 Sep 2002 22:20:20 +0100 (BST) Received: from grimreaper.grondar.org (localhost [127.0.0.1]) by grimreaper.grondar.org (8.12.6/8.12.5) with ESMTP id g8JLF57s003495; Thu, 19 Sep 2002 22:15:05 +0100 (BST) (envelope-from mark@grimreaper.grondar.org) Message-Id: <200209192115.g8JLF57s003495@grimreaper.grondar.org> To: Bruce Evans Cc: arch@FreeBSD.ORG Subject: Re: More lint work for share/mk/ (review, please) References: <20020920054221.E2677-100000@gamplex.bde.org> In-Reply-To: <20020920054221.E2677-100000@gamplex.bde.org> ; from Bruce Evans "Fri, 20 Sep 2002 06:29:10 +1000." Date: Thu, 19 Sep 2002 22:15:05 +0100 From: Mark Murray Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG > > NOLINT, and this is supposed to be put into individual Makefiles > > to kill copious lint output from code out of our control (like > > GNU stuff, other contrib/ code and so on). 
> > Wouldn't `.undef WANT_LINT' in individual Makefiles be better than > another knob? Maybe. I dunno. There is no precedent for the .undef form, whereas we have lots for the WANT_* or NO_* type of knobs. Seems yukky to do something that requires another syntax. > % - ${LINT} ${LINTFLAGS} ${CFLAGS:M-[DIU]*} ${.ALLSRC} | more 2>&1 > % + ${LINT} ${LINTFLAGS} ${CFLAGS:M-[DIU]*} ${.ALLSRC} > > I wonder what this pipeline is for. I once thought that it was to combine > stdout with stderr to work around the bug that lint prints some messages > on stderr, but it is wrong for that -- the correct way to combine stdin > with stderr is "2>&1" before any pipe. This got discussed to death last time. It is minor and mostly irrelevant; its here for consistency with another similar removal. Locally I just put it back in the .mk file as lint: ${LINT} .... ${.ALLSRC} 2>&1 | cat so I can # make lint | less I don't feel stronly enough to discuss this any further. Take it out, leave it in. Either way suits me. :-) > % -CFLAGS += -ansi -pedantic -Wbad-function-cast -Wchar-subscripts -Winline -Wnested-externs -Wredundant-decls > % +CFLAGS += -ansi -pedantic -Wbad-function-cast -Wchar-subscripts -Winline -Wnested-externs -Wredundant-decls -Wno-long-long > % . endif > % . if ${WARNS} > 1 && ${WARNS} < 5 > % # XXX Delete -Wuninitialized by default for now -- the compiler doesn't > > No thanks. This breaks gcc's warning. -ansi means C90 (-std=c89), and > this standard doesn't have long long. Of course, programs not written in > C90 can't be compiled with this warning level. Oops! Mistake. This was not intended for the LINT commit! This is strictly local. > Non-POSIX rules shouldn't be duplicated in the POSIX section. OK. Thanks! 
M -- o Mark Murray \_ O.\_ Warning: this .sig is umop ap!sdn To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Sep 19 14:28:42 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E276337B401 for ; Thu, 19 Sep 2002 14:28:40 -0700 (PDT) Received: from obsecurity.dyndns.org (adsl-64-165-226-88.dsl.lsan03.pacbell.net [64.165.226.88]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7DD9343E6E for ; Thu, 19 Sep 2002 14:28:40 -0700 (PDT) (envelope-from kris@obsecurity.org) Received: by obsecurity.dyndns.org (Postfix, from userid 1000) id EF2CB66C04; Thu, 19 Sep 2002 14:28:39 -0700 (PDT) Date: Thu, 19 Sep 2002 14:28:38 -0700 From: Kris Kennaway To: Mark Murray Cc: arch@freebsd.org Subject: Re: More lint work for share/mk/ (review, please) Message-ID: <20020919212837.GA77278@xor.obsecurity.org> References: <200209191559.g8JFx27s001093@grimreaper.grondar.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="nFreZHaLTZJo0R7j" Content-Disposition: inline In-Reply-To: <200209191559.g8JFx27s001093@grimreaper.grondar.org> User-Agent: Mutt/1.4i Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG --nFreZHaLTZJo0R7j Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Sep 19, 2002 at 04:59:02PM +0100, Mark Murray wrote: > Hi all >=20 > I've been running the attached patches for a while now, and they > allow me to complete make worlds with a commercial lint, a ports > lint and freebsd "base" lint. 
>=20 > > WANT_LINT=3D yes WANT_* variables are used with different meaning in the ports collection (they are supposed to be internal for use in a port makefile only, and are not user-control knobs). WITH_* and WITHOUT_* are the user namespace for enabling/disabling port features (also NO_* as in /usr/src). Kris --nFreZHaLTZJo0R7j Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.7 (FreeBSD) iD8DBQE9ikGFWry0BWjoQKURAqrfAKClVhBeayK8J3zdWlDjz68WIKIK7wCg3dkt RX0iNhlzqn0dj5RmEElhejI= =310d -----END PGP SIGNATURE----- --nFreZHaLTZJo0R7j-- To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Sep 19 14:30:32 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3D41B37B401 for ; Thu, 19 Sep 2002 14:30:31 -0700 (PDT) Received: from InterJet.dellroad.org (adsl-63-194-81-26.dsl.snfc21.pacbell.net [63.194.81.26]) by mx1.FreeBSD.org (Postfix) with ESMTP id 73FA943E6A for ; Thu, 19 Sep 2002 14:30:30 -0700 (PDT) (envelope-from archie@dellroad.org) Received: from arch20m.dellroad.org (arch20m.dellroad.org [10.1.1.20]) by InterJet.dellroad.org (8.9.1a/8.9.1) with ESMTP id OAA37006; Thu, 19 Sep 2002 14:17:57 -0700 (PDT) Received: from arch20m.dellroad.org (localhost [127.0.0.1]) by arch20m.dellroad.org (8.12.6/8.12.6) with ESMTP id g8JLGrrf062183; Thu, 19 Sep 2002 14:16:53 -0700 (PDT) (envelope-from archie@arch20m.dellroad.org) Received: (from archie@localhost) by arch20m.dellroad.org (8.12.6/8.12.6/Submit) id g8JLGq42062182; Thu, 19 Sep 2002 14:16:52 -0700 (PDT) From: Archie Cobbs Message-Id: <200209192116.g8JLGq42062182@arch20m.dellroad.org> Subject: Re: What is "LIBMCHAIN" and why is it in the tree ? 
In-Reply-To: <3458.1032362998@critter.freebsd.dk> "from Poul-Henning Kamp at Sep 18, 2002 05:29:58 pm" To: Poul-Henning Kamp Date: Thu, 19 Sep 2002 14:16:52 -0700 (PDT) Cc: Bosko Milekic , Ian Dowse , arch@FreeBSD.ORG X-Mailer: ELM [version 2.4ME+ PL88 (25)] MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Poul-Henning Kamp writes: > > libmchain was developed by Boris mainly for his smb et al. code. It > > is pretty useful but could use some eventual optimisation and, more > > importantly, a larger audience. And even more importantly, a man page... -Archie __________________________________________________________________________ Archie Cobbs * Packet Design * http://www.packetdesign.com To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Sep 19 14:55:19 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 7465037B401 for ; Thu, 19 Sep 2002 14:55:18 -0700 (PDT) Received: from albatross.prod.itd.earthlink.net (albatross.mail.pas.earthlink.net [207.217.120.120]) by mx1.FreeBSD.org (Postfix) with ESMTP id 14DBB43E4A for ; Thu, 19 Sep 2002 14:55:18 -0700 (PDT) (envelope-from tlambert2@mindspring.com) Received: from pool0128.cvx21-bradley.dialup.earthlink.net ([209.179.192.128] helo=mindspring.com) by albatross.prod.itd.earthlink.net with esmtp (Exim 3.33 #1) id 17s9GM-00077a-00; Thu, 19 Sep 2002 14:55:06 -0700 Message-ID: <3D8A4762.9E6253CE@mindspring.com> Date: Thu, 19 Sep 2002 14:53:38 -0700 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Mark Murray Cc: Bruce Evans , arch@FreeBSD.ORG Subject: Re: More lint work 
for share/mk/ (review, please)
References: <20020920054221.E2677-100000@gamplex.bde.org> <200209192115.g8JLF57s003495@grimreaper.grondar.org>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Mark Murray wrote:
> > > NOLINT, and this is supposed to be put into individual Makefiles
> > > to kill copious lint output from code out of our control (like
> > > GNU stuff, other contrib/ code and so on).
> >
> > Wouldn't `.undef WANT_LINT' in individual Makefiles be better than
> > another knob?
>
> Maybe. I dunno. There is no precedent for the .undef form, whereas
> we have lots for the WANT_* or NO_* type of knobs. Seems yukky to
> do something that requires another syntax.

I'm a big fan of the general rule:

"Any patch which does not break existing functionality, nor does
it preclude future work, is OK with me."

In other words, as long as it doesn't screw anything up, it should
be committed.

You can argue "style changes" under "unreadability precludes future
work", if you want to, but as long as it doesn't screw with anyone's
ability to do future work in the same code, then quit discussing the
damn thing, and commit it already.

The "NOLINT" vs. "NO_LINT" vs. ".undef WANT_LINT" thing is something
someone else could do, later, if they felt really strongly about the
idea. If no one feels strongly, then fine, let nothing happen. If
someone does, then fine, then let it happen *later*. The person who
writes the code gets to decide what the code does.

I'm surprised that this was even discussed at all on this list, as
it doesn't impact functionality, and permits *more* uses than were
previously permitted.
-- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Sep 19 15:29:15 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2E63B37B401 for ; Thu, 19 Sep 2002 15:29:14 -0700 (PDT) Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3888743E75 for ; Thu, 19 Sep 2002 15:29:13 -0700 (PDT) (envelope-from bde@zeta.org.au) Received: from bde.zeta.org.au (bde.zeta.org.au [203.2.228.102]) by mailman.zeta.org.au (8.9.3/8.8.7) with ESMTP id IAA24603; Fri, 20 Sep 2002 08:29:02 +1000 Date: Fri, 20 Sep 2002 08:37:42 +1000 (EST) From: Bruce Evans X-X-Sender: bde@gamplex.bde.org To: Mark Murray Cc: arch@FreeBSD.ORG Subject: Re: More lint work for share/mk/ (review, please) In-Reply-To: <200209192115.g8JLF57s003495@grimreaper.grondar.org> Message-ID: <20020920083332.E3384-100000@gamplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Thu, 19 Sep 2002, Mark Murray wrote: > > % - ${LINT} ${LINTFLAGS} ${CFLAGS:M-[DIU]*} ${.ALLSRC} | more 2>&1 > > % + ${LINT} ${LINTFLAGS} ${CFLAGS:M-[DIU]*} ${.ALLSRC} > > > > I wonder what this pipeline is for. I once thought that it was to combine > > stdout with stderr to work around the bug that lint prints some messages > > on stderr, but it is wrong for that -- the correct way to combine stdin > > with stderr is "2>&1" before any pipe. > > This got discussed to death last time. It is minor and mostly irrelevant; > its here for consistency with another similar removal. 
Locally I just > put it back in the .mk file as I disagreed with that removal too, and put it back in some places (mainly as a reminder to fix it properly). > lint: > ${LINT} .... ${.ALLSRC} 2>&1 | cat > > so I can > > # make lint | less > > I don't feel stronly enough to discuss this any further. Take it > out, leave it in. Either way suits me. :-) The "| cat" part brings out my anti-useless process reflex :-). Bruce To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Sep 19 16:16:55 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5599A37B401 for ; Thu, 19 Sep 2002 16:16:54 -0700 (PDT) Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16]) by mx1.FreeBSD.org (Postfix) with ESMTP id 5DAED43E3B for ; Thu, 19 Sep 2002 16:16:53 -0700 (PDT) (envelope-from bde@zeta.org.au) Received: from bde.zeta.org.au (bde.zeta.org.au [203.2.228.102]) by mailman.zeta.org.au (8.9.3/8.8.7) with ESMTP id JAA30872; Fri, 20 Sep 2002 09:16:32 +1000 Date: Fri, 20 Sep 2002 09:25:12 +1000 (EST) From: Bruce Evans X-X-Sender: bde@gamplex.bde.org To: Archie Cobbs Cc: Poul-Henning Kamp , Bosko Milekic , Ian Dowse , Subject: Re: What is "LIBMCHAIN" and why is it in the tree ? In-Reply-To: <200209192116.g8JLGq42062182@arch20m.dellroad.org> Message-ID: <20020920092425.R3405-100000@gamplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Thu, 19 Sep 2002, Archie Cobbs wrote: > Poul-Henning Kamp writes: > > > libmchain was developed by Boris mainly for his smb et al. code. It > > > is pretty useful but could use some eventual optimisation and, more > > > importantly, a larger audience. 
> > And even more importantly, a man page... What's wrong with mbchain(9) and mdchain(9)? :-). Bruce To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Sep 19 17:45:21 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8637537B401 for ; Thu, 19 Sep 2002 17:45:19 -0700 (PDT) Received: from InterJet.dellroad.org (adsl-63-194-81-26.dsl.snfc21.pacbell.net [63.194.81.26]) by mx1.FreeBSD.org (Postfix) with ESMTP id E077843E65 for ; Thu, 19 Sep 2002 17:45:18 -0700 (PDT) (envelope-from archie@dellroad.org) Received: from arch20m.dellroad.org (arch20m.dellroad.org [10.1.1.20]) by InterJet.dellroad.org (8.9.1a/8.9.1) with ESMTP id RAA38230; Thu, 19 Sep 2002 17:29:24 -0700 (PDT) Received: from arch20m.dellroad.org (localhost [127.0.0.1]) by arch20m.dellroad.org (8.12.6/8.12.6) with ESMTP id g8K0SKrf062935; Thu, 19 Sep 2002 17:28:20 -0700 (PDT) (envelope-from archie@arch20m.dellroad.org) Received: (from archie@localhost) by arch20m.dellroad.org (8.12.6/8.12.6/Submit) id g8K0SIA1062934; Thu, 19 Sep 2002 17:28:18 -0700 (PDT) From: Archie Cobbs Message-Id: <200209200028.g8K0SIA1062934@arch20m.dellroad.org> Subject: Re: What is "LIBMCHAIN" and why is it in the tree ? 
In-Reply-To: <20020920092425.R3405-100000@gamplex.bde.org> "from Bruce Evans at Sep 20, 2002 09:25:12 am" To: Bruce Evans Date: Thu, 19 Sep 2002 17:28:18 -0700 (PDT) Cc: Archie Cobbs , Poul-Henning Kamp , Bosko Milekic , Ian Dowse , arch@FreeBSD.ORG X-Mailer: ELM [version 2.4ME+ PL88 (25)] MIME-Version: 1.0 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Bruce Evans writes: > > > > libmchain was developed by Boris mainly for his smb et al. code. It > > > > is pretty useful but could use some eventual optimisation and, more > > > > importantly, a larger audience. > > > > And even more importantly, a man page... > > What's wrong with mbchain(9) and mdchain(9)? :-). Too hard to find?? Just kidding, you win :-) -Archie __________________________________________________________________________ Archie Cobbs * Packet Design * http://www.packetdesign.com To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Sep 19 20:38: 3 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3D97337B401 for ; Thu, 19 Sep 2002 20:38:02 -0700 (PDT) Received: from gnuppy.monkey.org (wsip68-15-8-100.sd.sd.cox.net [68.15.8.100]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0566343E42 for ; Thu, 19 Sep 2002 20:38:02 -0700 (PDT) (envelope-from billh@gnuppy.monkey.org) Received: from billh by gnuppy.monkey.org with local (Exim 3.36 #1 (Debian)) id 17sEFL-0000u3-00; Thu, 19 Sep 2002 20:14:23 -0700 Date: Thu, 19 Sep 2002 20:14:23 -0700 To: freebsd-arch@freebsd.org Cc: "Bill Huey (Hui)" Subject: New Linux threading model Message-ID: <20020920031423.GA3380@gnuppy.monkey.org> Mime-Version: 1.0 Content-Type: text/plain; 
charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.4i From: Bill Huey (Hui) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Hello, I got this off of lkml: http://marc.theaimsgroup.com/?l=linux-kernel&m=103248252713576&w=2 paper: http://people.redhat.com/drepper/nptl-design.pdf They basically went to (kept) a 1:1 threading model, but added a bunch of things to the kernel so that stuff like signal handling, pid, thread suspension via signal notification, etc... are all very conformant to Posix threading now. In their paper, they talk briefly about how they came to the decision that 1:1 is better than M:N and why they chose that against variants of M:N including scheduler activations, a cross process fast-path synchronization primitive called "futexes", etc... bill To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Sep 19 21: 7:17 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id BBEE537B401 for ; Thu, 19 Sep 2002 21:07:16 -0700 (PDT) Received: from mail.pcnet.com (pcnet1.pcnet.com [204.213.232.3]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4A6D643E3B for ; Thu, 19 Sep 2002 21:07:16 -0700 (PDT) (envelope-from eischen@pcnet1.pcnet.com) Received: from localhost (eischen@localhost) by mail.pcnet.com (8.12.3/8.12.1) with ESMTP id g8K47FGi003325; Fri, 20 Sep 2002 00:07:15 -0400 (EDT) Date: Fri, 20 Sep 2002 00:07:15 -0400 (EDT) From: Daniel Eischen To: Bill Huey Cc: freebsd-arch@FreeBSD.ORG Subject: Re: New Linux threading model In-Reply-To: <20020920031423.GA3380@gnuppy.monkey.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) 
List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Thu, 19 Sep 2002, Bill Huey wrote: > > Hello, > > I got this off of lkml: > > http://marc.theaimsgroup.com/?l=linux-kernel&m=103248252713576&w=2 > > paper: > http://people.redhat.com/drepper/nptl-design.pdf > > They basically went to (kept) a 1:1 threading model, but added a bunch of > things to the kernel so that stuff like signal handling, pid, thread suspension > via signal notification, etc... are all very conformant to Posix threading > now. > > In their paper, they talk briefly about how they came to the decision that > 1:1 is better than M:N and why they chose that against variants of M:N > including scheduler activations, a cross process fast-path synchronization > primitive called "futexes", etc... I read some of this and some of it is exactly opposite of why scheduler activations was made in the first place. They are pushing all scheduling decisions and locking in to the kernel. One of the points of scheduler activations is that the library can make all scheduling decisions without need for having the kernel involved. 
-- Dan Eischen To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Sep 19 21:10:17 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2D17C37B401 for ; Thu, 19 Sep 2002 21:10:16 -0700 (PDT) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.FreeBSD.org (Postfix) with ESMTP id E1AAF43E4A for ; Thu, 19 Sep 2002 21:10:15 -0700 (PDT) (envelope-from bright@elvis.mu.org) Received: by elvis.mu.org (Postfix, from userid 1192) id A52CEAE2EF; Thu, 19 Sep 2002 21:10:15 -0700 (PDT) Date: Thu, 19 Sep 2002 21:10:15 -0700 From: Alfred Perlstein To: Daniel Eischen Cc: Bill Huey , freebsd-arch@FreeBSD.ORG Subject: Re: New Linux threading model Message-ID: <20020920041015.GW86737@elvis.mu.org> References: <20020920031423.GA3380@gnuppy.monkey.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4i Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG * Daniel Eischen [020919 21:07] wrote: > On Thu, 19 Sep 2002, Bill Huey wrote: > > > > > Hello, > > > > I got this off of lkml: > > > > http://marc.theaimsgroup.com/?l=linux-kernel&m=103248252713576&w=2 > > > > paper: > > http://people.redhat.com/drepper/nptl-design.pdf > > > > They basically went to (kept) a 1:1 threading model, but added a bunch of > > things to the kernel so that stuff like signal handling, pid, thread suspension > > via signal notification, etc... are all very conformant to Posix threading > > now. 
> > > > In their paper, they talk briefly about how they came to the decision that > > 1:1 is better than M:N and why they chose that against variants of M:N > > including scheduler activations, a cross process fast-path synchronization > > primitive called "futexes", etc... > > I read some of this and some of it is exactly opposite of why > scheduler activations was made in the first place. They are > pushing all scheduling decisions and locking in to the kernel. > One of the points of scheduler activations is that the library > can make all scheduling decisions without need for having > the kernel involved. Well it is a step forward for threading on Linnex. :) Still not out of the fire yet, but good work nonetheless. -- -Alfred Perlstein [alfred@freebsd.org] 'Instead of asking why a piece of software is using "1970s technology," start asking why software is ignoring 30 years of accumulated wisdom.' To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Sep 20 0:20: 9 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id BADF637B401 for ; Fri, 20 Sep 2002 00:20:07 -0700 (PDT) Received: from sccrmhc03.attbi.com (sccrmhc03.attbi.com [204.127.202.63]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3A04343E77 for ; Fri, 20 Sep 2002 00:20:07 -0700 (PDT) (envelope-from julian@elischer.org) Received: from InterJet.elischer.org ([12.232.206.8]) by sccrmhc03.attbi.com (InterMail vM.4.01.03.27 201-229-121-127-20010626) with ESMTP id <20020920072006.VFST28420.sccrmhc03.attbi.com@InterJet.elischer.org>; Fri, 20 Sep 2002 07:20:06 +0000 Received: from localhost (localhost.elischer.org [127.0.0.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id AAA17182; Fri, 20 Sep 2002 00:08:39 -0700 (PDT) Date: Fri, 20 Sep 2002 00:08:38 -0700 (PDT) From: Julian Elischer To: Bill Huey Cc: 
freebsd-arch@freebsd.org Subject: Re: New Linux threading model In-Reply-To: <20020920031423.GA3380@gnuppy.monkey.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG

Hi, and thanks for the pointers.

It's interesting that the features they talk about as being difficult and 'required' generally just "fall out" of the KSE implementation. A lot of the shortcomings of M:N that they quote don't apply to the KSE schemes either. For example:

They talk a lot about making signals per-process (uber-process, that is). Signals in KSE are automatically per process. They talk of the difficulty of making SIGSTOP work, and are very, very proud that they have it working over the entire uber-process, yet KSE has had this working since the very first test program. I still think they are dressing up a bull to make it fit in the chicken coop.

The sad thing is that we'll have to implement their kernel side stuff for binary compatibility, so we'll have chickens AND a bull dressed like a bunch of chickens. Market share has its advantages.

Of course we can emulate them, but they'll never be able to emulate us. You can make a bunch of chickens look like anything a bull dressed as a bunch of chickens can look like, but the converse is not true without killing the bull. :-)

Hell, if we've really screwed up, hey, there's always 4.7 as a base :-)

What they have decided to do is not a stupid move. But I disagree with some of their assertions. I first heard that method of doing threads expounded by Kirk in the BSD4.4 internals class in 1992 (or was that 91?). It sounded feasible then and it is still feasible. In fact I remember discussing it with Linus once a long time ago at a USENIX forum (before he became hard to find under the crowds).
I just happen to think that what we have will be very sweet when you see it working. On Thu, 19 Sep 2002, Bill Huey wrote: > > Hello, > > I got this off of lkml: > > http://marc.theaimsgroup.com/?l=linux-kernel&m=103248252713576&w=2 > > paper: > http://people.redhat.com/drepper/nptl-design.pdf > > They basically went to (kept) a 1:1 threading model, but added a bunch of > things to the kernel so that stuff like signal handling, pid, thread suspension > via signal notification, etc... are all very conformant to Posix threading > now. > > In their paper, they talk briefly about how they came to the decision that > 1:1 is better than M:N and why they chose that against variants of M:N > including scheduler activations, a cross process fast-path synchronization > primitive called "futexes", etc... > > bill > > > To Unsubscribe: send mail to majordomo@FreeBSD.org > with "unsubscribe freebsd-arch" in the body of the message > To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Sep 20 1:28:35 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id A2AB837B401 for ; Fri, 20 Sep 2002 01:28:34 -0700 (PDT) Received: from gnuppy.monkey.org (wsip68-15-8-100.sd.sd.cox.net [68.15.8.100]) by mx1.FreeBSD.org (Postfix) with ESMTP id 48EC343E42 for ; Fri, 20 Sep 2002 01:28:34 -0700 (PDT) (envelope-from billh@gnuppy.monkey.org) Received: from billh by gnuppy.monkey.org with local (Exim 3.36 #1 (Debian)) id 17sJ9I-00017p-00; Fri, 20 Sep 2002 01:28:28 -0700 Date: Fri, 20 Sep 2002 01:28:28 -0700 To: Daniel Eischen Cc: freebsd-arch@FreeBSD.ORG, "Bill Huey (Hui)" Subject: Re: New Linux threading model Message-ID: <20020920082828.GA4207@gnuppy.monkey.org> References: <20020920031423.GA3380@gnuppy.monkey.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: 
User-Agent: Mutt/1.4i From: Bill Huey (Hui) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Fri, Sep 20, 2002 at 12:07:15AM -0400, Daniel Eischen wrote: > I read some of this and some of it is exactly opposite of why > scheduler activations was made in the first place. They are > pushing all scheduling decisions and locking in to the kernel. > One of the points of scheduler activations is that the library > can make all scheduling decisions without need for having > the kernel involved. I wasn't quite sure how to break this to them without being completely impolite. They did some measurements, but I'm curious how something like thread performance (context switching, blocking) in libc_r measures against their 1:1 model. It should be simple to write a test program to check it out and see what kind of result they get. bill To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Sep 20 3: 4:45 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2239437B401 for ; Fri, 20 Sep 2002 03:04:43 -0700 (PDT) Received: from gnuppy.monkey.org (wsip68-15-8-100.sd.sd.cox.net [68.15.8.100]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9FB2543E65 for ; Fri, 20 Sep 2002 03:04:42 -0700 (PDT) (envelope-from billh@gnuppy.monkey.org) Received: from billh by gnuppy.monkey.org with local (Exim 3.36 #1 (Debian)) id 17sKeN-0001De-00; Fri, 20 Sep 2002 03:04:39 -0700 Date: Fri, 20 Sep 2002 03:04:39 -0700 To: Julian Elischer Cc: freebsd-arch@freebsd.org, "Bill Huey (Hui)" Subject: Re: New Linux threading model Message-ID: <20020920100439.GB4207@gnuppy.monkey.org> References: <20020920031423.GA3380@gnuppy.monkey.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii 
Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4i From: Bill Huey (Hui) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG

On Fri, Sep 20, 2002 at 12:08:38AM -0700, Julian Elischer wrote:
> HI and thanks for the pointers.
>
> it's interesting that the features that they talk about
> as being difficult and 'required' generally just "fall out" of the
> KSE implimentation. A lot of the shortcomings of M:N that they
> quote don't apply to the KSE schemes either..

Mingo's O(1) scheduler is pretty snazzy ( high brow technical term ;) ) in how it migrates/load balances tasks between the various CPUs and preserves cache affinity, and their threading model can get immediate benefits from that feature on SMP machines.

There are a lot of design tradeoffs that still need to be measured and balanced against one another. Those are kind of unknowns in KSEs/libc_r at this point in time, until folks get to that point.

> They talk a lot about making signals per-process (uber-process that is).
> Signals in KSE are automatically per process.
> They talk of the difficulty of making SIGSTOP work, and are very
> very proud that they have it working over the entire
> uber-process, yet KSE has had this working since the
> very first test program. I still think they are dressing up a Bull
> to make it fit in in the chicken coop.

How? The bull being heavyweight kernel threading versus the chickens being userspace threading?

Yeah, I'm curious myself how they implemented their signal handling and how their implementation differs from KSEs. It's badly needed and I'm glad they finally have this stuff.

> The sad thing is that we'll have to implement their kernel side stuff
> for binary compatibility, so we'll have chickens AND a bull dressed
> like a bunch of chickens. Market share has it's advantages.
> Of course We can emulate them but they'll never be able to emulate us.
> You can make a bunch of chickens look like anything a bull dressed as
> a bunch of chickens can look like, but the converse is not true without
> killing the bull.
> :-)

Dude, that was PROFOUND. I think all that bad East Bay area acid finally got to you from all that utter animal confusion. ;) Either that or all of that plush toy abuse is finally making you pay the price. ;)

The good thing about their stuff is that any future Linux emulator can be simplified, since they will finally conform to POSIX signal handling and we'll get more Linux compatibility in that case automatically. That's assuming it gets implemented, of course. The obnoxious "SIGCHLD to signify thread death" stuff will finally be gone and other good things will replace it.

Software with very heavy threading components like the JVM/HotSpot (ported and maintained by yours truly in FreeBSD ;) ) and Apache 2 can then be greatly simplified because of that...

> Hell, if we've really screwed up, hey there's always 4.7 as a base :-)

Na, I think the IO related upcalls are golden, but the preemption ones worry me because of the overhead. There are possible ways of getting around that which we talked about in the past (NetBSD dude at Usenix): a shared event queue of some sort between the kernel and the UTS (userspace threading system), to be polled whenever the thread-kern gets entered... Who knows.

I've got something brewing in my head about this, but I'm not comfortable enough to articulate it just yet... something with timers emulated in userspace driving the UTS at a frequency high enough to accurately sample the event status of the virtual processors. This would replace upcalls in that situation...

Don't know, I could be utterly whacked about this, just thinking to myself out loud.

> What they have decided to do is not a stupid move.
But I disagree with It's a good political move on their part because of the orientation of their kernel community. Their kernel context switching time is very fast, 2x faster than NetBSD from what I saw, so it's probably a workable solution for them with something like their "futex" performance being the only funny question left unanswered. Our libc_r is kind of unfair in that category. ;) And I need to bitch about that to them, well, just because I'm that way. :) > some of their assertions. I first heard that method of doing threads > expounded by Kirk in the BSD4.4 internals class in 1992 (or was that > 91?). It sounded feasible then and it is stiff feasible. In fact > I remember discussing it with Linus once a long time ago at a USENIX > forum. (before he became hard to find under the crowds). We'll see, good luck !!! > I just happen to think that what we have will be very sweet when you see > it working. bill To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Sep 20 7:17:29 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C42CB37B401 for ; Fri, 20 Sep 2002 07:17:27 -0700 (PDT) Received: from 2-225.ctame701-1.telepar.net.br (2-225.ctame701-1.telepar.net.br [200.193.160.225]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7E60143E42 for ; Fri, 20 Sep 2002 07:17:23 -0700 (PDT) (envelope-from riel@conectiva.com.br) Received: from localhost ([IPv6:::ffff:127.0.0.1]:38274 "EHLO localhost") by imladris.surriel.com with ESMTP id ; Fri, 20 Sep 2002 11:17:00 -0300 Date: Fri, 20 Sep 2002 11:16:57 -0300 (BRT) From: Rik van Riel X-X-Sender: riel@imladris.surriel.com To: Bill Huey Cc: Julian Elischer , Subject: Re: New Linux threading model In-Reply-To: <20020920100439.GB4207@gnuppy.monkey.org> Message-ID: X-spambait: aardvark@kernelnewbies.org X-spammeplease: aardvark@nl.linux.org 
MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Fri, 20 Sep 2002, Bill Huey wrote: > On Fri, Sep 20, 2002 at 12:08:38AM -0700, Julian Elischer wrote: > > HI and thanks for the pointers. > > > > it's interesting that the features that they talk about > > as being difficult and 'required' generally just "fall out" of the > > KSE implimentation. A lot of the shortcomings of M:N that they > > quote don't apply to the KSE schemes either.. > > Mingo's O(1) scheduler is pretty snazzy ( high brow technical term ;) ) > with how it migrates/load balances tasks between various CPUs, maintains > cache coherency, What's maybe more important about the O(1) scheduler is that it doesn't try to recalculate the priority of all processes once in a while, like the FreeBSD scheduler and the old Linux scheduler. There don't seem to be any O(n) loops left in or near this scheduler, meaning that 1:1 threading with lots of threads becomes possible. > > What they have decided to do is not a stupid move. But I disagree with > > It's a good political move on their part because of the orientation of > their kernel community. Their kernel context switching time is very > fast, 2x faster than NetBSD from what I saw, so it's probably a workable > solution for them with something like their "futex" performance being > the only funny question left unanswered. Futexes are very nice. In the uncontended case (should be the normal case, if your semaphores are always contended you've got worse problems) there is NO kernel overhead involved in grabbing the lock ... you just do the same atomic operations involved with grabbing a spinlock. Only in the contended case will a futex fall back to sleeping in kernel space. 
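A minimal sketch of that uncontended fast path, assuming C11 atomics. The `fast_lock` names are hypothetical and this is not the Linux futex interface: where a real futex would sleep in the kernel on contention and be woken on release, the stand-in below only yields and retries, and counts how often the fast path was left.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical lock word: 0 = free, 1 = held. */
struct fast_lock {
    atomic_int  state;
    atomic_long slow_entries;   /* how often the fast path was left */
};

static void fast_lock_init(struct fast_lock *l)
{
    atomic_init(&l->state, 0);
    atomic_init(&l->slow_entries, 0);
}

/* Fast path: a single atomic compare-and-swap, no system call. */
static int fast_lock_try(struct fast_lock *l)
{
    int expected = 0;
    return atomic_compare_exchange_strong(&l->state, &expected, 1);
}

/*
 * Stand-in for the contended path. A real futex would sleep in the
 * kernel here; this sketch only yields the CPU and retries.
 */
static void fast_lock_wait(struct fast_lock *l)
{
    atomic_fetch_add(&l->slow_entries, 1);
    while (!fast_lock_try(l))
        sched_yield();
}

static void fast_lock_acquire(struct fast_lock *l)
{
    if (!fast_lock_try(l))
        fast_lock_wait(l);
}

/* A real futex release would also wake one sleeping waiter, if any. */
static void fast_lock_release(struct fast_lock *l)
{
    int old = atomic_exchange(&l->state, 0);
    assert(old == 1);   /* releasing a lock that was not held is a bug */
    (void)old;
}
```

With a single thread taking and dropping the lock, `slow_entries` stays at zero: every acquire is satisfied by the compare-and-swap alone, which is the whole attraction.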
This kind of very low overhead locking might be useful for FreeBSD too, if it isn't yet integrated into the KSE model. As for which threading model to use ... I wouldn't worry about that too much, I suspect either the Linux 1:1 model or the M:N model used by KSE will work just fine for pretty much all applications. cheers, Rik -- Bravely reimplemented by the knights who say "NIH". http://www.surriel.com/ http://distro.conectiva.com/ Spamtraps of the month: september@surriel.com trac@trac.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Sep 20 7:43:25 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 88DDA37B4AD for ; Fri, 20 Sep 2002 07:43:17 -0700 (PDT) Received: from mail.speakeasy.net (mail16.speakeasy.net [216.254.0.216]) by mx1.FreeBSD.org (Postfix) with ESMTP id 13E1743E88 for ; Fri, 20 Sep 2002 07:43:16 -0700 (PDT) (envelope-from jhb@FreeBSD.org) Received: (qmail 29424 invoked from network); 20 Sep 2002 14:45:00 -0000 Received: from unknown (HELO server.baldwin.cx) ([216.27.160.63]) (envelope-sender ) by mail16.speakeasy.net (qmail-ldap-1.03) with DES-CBC3-SHA encrypted SMTP for ; 20 Sep 2002 14:45:00 -0000 Received: from laptop.baldwin.cx (gw1.twc.weather.com [216.133.140.1]) by server.baldwin.cx (8.12.5/8.12.5) with ESMTP id g8KEhEBv069356; Fri, 20 Sep 2002 10:43:14 -0400 (EDT) (envelope-from jhb@FreeBSD.org) Message-ID: X-Mailer: XFMail 1.5.2 on FreeBSD X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: Date: Fri, 20 Sep 2002 10:43:16 -0400 (EDT) From: John Baldwin To: Rik van Riel Subject: Re: New Linux threading model Cc: freebsd-arch@freebsd.org, Julian Elischer , Bill Huey Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List 
Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On 20-Sep-2002 Rik van Riel wrote: > On Fri, 20 Sep 2002, Bill Huey wrote: >> On Fri, Sep 20, 2002 at 12:08:38AM -0700, Julian Elischer wrote: >> > HI and thanks for the pointers. >> > >> > it's interesting that the features that they talk about >> > as being difficult and 'required' generally just "fall out" of the >> > KSE implimentation. A lot of the shortcomings of M:N that they >> > quote don't apply to the KSE schemes either.. >> >> Mingo's O(1) scheduler is pretty snazzy ( high brow technical term ;) ) >> with how it migrates/load balances tasks between various CPUs, maintains >> cache coherency, > > What's maybe more important about the O(1) scheduler is that it > doesn't try to recalculate the priority of all processes once > in a while, like the FreeBSD scheduler and the old Linux scheduler. Yes, schedcpu() needs to die die die and be replaced by a more event-driven model. :) -- John Baldwin <>< http://www.FreeBSD.org/~jhb/ "Power Users Use the Power to Serve!" 
- http://www.FreeBSD.org/

To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message

From owner-freebsd-arch Fri Sep 20 10:13:23 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4E74E37B401 for ; Fri, 20 Sep 2002 10:13:21 -0700 (PDT) Received: from web11207.mail.yahoo.com (web11207.mail.yahoo.com [216.136.131.189]) by mx1.FreeBSD.org (Postfix) with SMTP id 126A643E4A for ; Fri, 20 Sep 2002 10:13:21 -0700 (PDT) (envelope-from gathorpe79@yahoo.com) Message-ID: <20020920171317.77976.qmail@web11207.mail.yahoo.com> Received: from [142.204.207.93] by web11207.mail.yahoo.com via HTTP; Fri, 20 Sep 2002 13:13:17 EDT Date: Fri, 20 Sep 2002 13:13:17 -0400 (EDT) From: Gary Thorpe Subject: Re: New Linux threading model To: Rik van Riel Cc: freebsd-arch@freebsd.org In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG

--- Rik van Riel wrote:
[...]
> What's maybe more important about the O(1) scheduler
> is that it
> doesn't try to recalculate the priority of all
> processes once
> in a while, like the FreeBSD scheduler and the old
> Linux scheduler.

So what happens if a process has been sitting in the queue waiting for a very long time? Depending on the scheduling algorithm, it may need to have its priority increased the longer it waits, but this will not happen... until it is scheduled? Can this lead to starvation? I.e., will a waiting process never have its priority increased enough to be scheduled, because it needs to be scheduled in order to have its priority increased?

>
> There don't seem to be any O(n) loops left in or
> near this scheduler,
> meaning that 1:1 threading with lots of threads
> becomes possible.
The maximum parallelism on a given SMP system can never be more than the number of CPUs so wouldn't a 1 : 1 model lead to unnecessary overhead? Why have hundreds of kernel threads when the system can only run two or four in parallel? Even the largest SMP machines (not NUMA machines like SGI's Origin) don't have more than a hundred cpus. > > > > What they have decided to do is not a stupid > move. But I disagree with > > > > It's a good political move on their part because > of the orientation of > > their kernel community. Their kernel context > switching time is very > > fast, 2x faster than NetBSD from what I saw, so > it's probably a workable > > solution for them with something like their > "futex" performance being > > the only funny question left unanswered. > > Futexes are very nice. In the uncontended case > (should be the > normal case, if your semaphores are always contended > you've got > worse problems) there is NO kernel overhead involved > in grabbing > the lock ... you just do the same atomic operations > involved with > grabbing a spinlock. > > Only in the contended case will a futex fall back to > sleeping in > kernel space. > > This kind of very low overhead locking might be > useful for FreeBSD > too, if it isn't yet integrated into the KSE model. > > As for which threading model to use ... I wouldn't > worry about that > too much, I suspect either the Linux 1:1 model or > the M:N model used > by KSE will work just fine for pretty much all > applications. > > cheers, > > Rik > -- > Bravely reimplemented by the knights who say "NIH". > > http://www.surriel.com/ > http://distro.conectiva.com/ > > Spamtraps of the month: september@surriel.com > trac@trac.org ______________________________________________________________________ Post your free ad now! 
http://personals.yahoo.ca To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Sep 20 11: 4:49 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id EC0A137B401 for ; Fri, 20 Sep 2002 11:04:42 -0700 (PDT) Received: from avocet.mail.pas.earthlink.net (avocet.mail.pas.earthlink.net [207.217.120.50]) by mx1.FreeBSD.org (Postfix) with ESMTP id 591C043E75 for ; Fri, 20 Sep 2002 11:04:42 -0700 (PDT) (envelope-from tlambert2@mindspring.com) Received: from pool0108.cvx21-bradley.dialup.earthlink.net ([209.179.192.108] helo=mindspring.com) by avocet.mail.pas.earthlink.net with esmtp (Exim 3.33 #1) id 17sS8S-0002wo-00; Fri, 20 Sep 2002 11:04:13 -0700 Message-ID: <3D8B62DB.C27B7E07@mindspring.com> Date: Fri, 20 Sep 2002 11:03:07 -0700 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Daniel Eischen Cc: Bill Huey , freebsd-arch@FreeBSD.ORG Subject: Re: New Linux threading model References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Daniel Eischen wrote: > On Thu, 19 Sep 2002, Bill Huey wrote: > > http://marc.theaimsgroup.com/?l=linux-kernel&m=103248252713576&w=2 > > http://people.redhat.com/drepper/nptl-design.pdf > > I read some of this and some of it is exactly opposite of why > scheduler activations was made in the first place. They are > pushing all scheduling decisions and locking in to the kernel. > One of the points of scheduler activations is that the library > can make all scheduling decisions without need for having > the kernel involved. 
Please take these comments in the context of "what this paper means to FreeBSD", rather than as a criticism of the Linux implementation; I'll point out what I think are fallacies when I see them, but the main emphasis has to be "what does this mean to FreeBSD-arch"...

I find it incredibly amusing that in their 10 listed goals for the system, the original design goal that resulted in the idea of threads becoming widespread in the first place, the idea that there would be less *runtime* overhead for using threads instead of processes, seems to have been lost. Now threads have other goals, but hardware scaling and NUMA support didn't even make the top 5. I also fail to see how they have addressed the NUMA goal, except tangentially, through the elimination of the "helper thread".

FreeBSD must first consider that its "top ten goals" for a threads implementation are different from those of Linux, and see the Linux design in light of that.

Here are my comments; they are in order against the content of the paper, so it should be easy to read the comments linearly, with the paper in hand.

The argument about collaborative scheduling being a reason to abandon an N:M model for a 1:1 model is, I think, specious. It assumes certain things about the kernel implementation of the kernel threads scheduling. Specifically, it tries to duck the thread group affinity issue, and simplify the CPU affinity issue. Perhaps this is the reason that the original goal of reduced runtime overhead vis-a-vis processes has been abandoned: the use of explicit code in the scheduler to try to achieve this is an NP-complete problem. The thread group affinity can be addressed through the use of scheduler activations; but the CPU affinity for kernel threads -- and the CPU *negaffinity* for kernel threads which are members of the same thread group -- can not really be adequately addressed in the context of intentional scheduling and CPU migration.
In other words, they can not be addressed in the kernel scheduler, they *must* be addressed from a data-flow perspective (hence my occasional advocacy of per CPU scheduling queues, with queue migration being a relatively rare, and also a push-based event, rather than a relatively frequent, and also a pull-based event). By going to 1:1, they dodge this issue; but in so doing, they increase threads overhead to that of processes: threads become a mechanism for sharing some resources: the moral equivalent of the old rfork()/sfork() system calls for heap and descriptor space sharing, with separate stacks. The procfs issue is really a red herring; specifically, it should have been possible to change the procfs interface so that 100,000 PIDs had a log2(100,000)+1, O(1), lookup time - LL[1]. The kernel space overhead that they describe with regard to the user space threads scheduler context switch is predicated on the idea of a context switch for a segmentation register, %FS or %GS, which can not be set in user space, being a protected resource, and so requires a kernel call in order to provide access for the implementation of thread local storage chosen. I am not personally familiar with the FreeBSD scontext(2) call that has been discussed as an implementation detail with regard to some of the recent postings by Daniel Eischen, in which he implied Jonathan Mini will be implementing the call. Perhaps I should have been watching this more closely, so that I could comment more intelligently on what it's supposed to do for the FreeBSD implementation which can not be done without: the Linux people may have a valid argument here which FreeBSD should heed. However... such a call is necessary to deal with register windows on SPARC processors, in any case, and the implementation is moot by way of the fact that trading a user context switch plus a kernel call for a kernel context switch hardly reduces overhead.
I think the "the Linux scheduler is O(1), so we don't have to care" argument is invalid; O(1) means that for N entries, there need to be N elements traversed 1 time. What this really means is that, as the number of schedulable entities goes up, the time required goes up linearly, as opposed to going up exponentially; or, better, to *not* going up in the first place. The signal handling argument is rather specious. The first thing that should be pointed out in this regard is that the overhead involved in signal processing being funneled through a single thread presumes an implementation of a monitor thread, which is not necessarily a requirement. But even granting that argument for a naive implementation, signals are, in a threaded application, exceptional conditions: they are not intended to be used as IPC mechanisms, and, in fact, the POSIX definition of the interaction of signals and threads grants sufficient latitude that they are not in fact useful for this purpose in portable software. Instead, POSIX defines a large number of new facilities to be used for IPC in a threaded environment, such that signals are not necessary. The second thing that should be pointed out is that delivery of signals is to a particular *kernel* thread. While this means delivery to user space, it does not mean delivery to a specific user space thread, which may then be a bottleneck; rather it means delivery in order to communicate the signal to the user space scheduler -- from which it may then be appropriately dispatched. So while in the N:M case, if N >> M, the relative overhead may approach this overhead, the common implementation is to have M ~= the total number of simultaneous blocking contexts, plus one. So while it's true that N != M, it *isn't* true that N ~= M, and it isn't true that N >> M.
I think that FreeBSD can learn from the Linux analysis of the signal handling, but that it should *not* learn the implementation detail of the 1:1 relationship between kernel and user threads. The paper outlines almost all of the important issues in signal handling, but that does not necessarily mean that the solution chosen is the most appropriate solution for FreeBSD. The "helper thread" is a concept which first appeared in the SVR4 derived Solaris ~2.4 era, and was echoed into the SVR4.2 threads implementation. The need for this thread is an artifact of a limited kernel upcall capability, with the need for the user space to execute code on behalf of kernel events. I agree that a "helper thread" can be a significant bottleneck; but a design need not utilize a helper thread. The main purpose of such a thread is to intermediate in the creation and completion of other threads. The reason this is necessary is the fact that the kernel is permitted to block operations in times of low resources, and one of these operations so blocked may be the creation of new threads, etc. The problem which this is solving is the lack of an asynchronous means of posting a request to the kernel, yet continuing to run in user space, despite the fact that the request posted may not complete immediately. In other words, it is a call conversion mechanism intended to deal with the fact that certain calls are blocking calls which have no asynchronous counterparts for use in call conversion: NB: For the uninitiated: call conversion is a means of implementing a threads system, by trading a blocking call for a non-blocking call plus a context switch; obviously, this does not work, if there is no non-blocking equivalent for the blocking call in question. FreeBSD recently had to deal with this issue with the sendfile(2) system call, which has poor semantics relative to the use of socket descriptors that have been marked as non-blocking.
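As an illustration of the call conversion idea in the NB above, here is a minimal sketch in Python; it is not taken from any FreeBSD or Linux threads library. Generators stand in for user-space threads, a `yield` stands in for the user-space context switch, and the names `converted_recv`, `run`, `reader`, `writer` and `demo` are hypothetical. It assumes a platform where `socket.socketpair()` is available.

```python
import socket
from collections import deque


def converted_recv(sock, nbytes):
    """Call conversion for recv(): try the non-blocking call, and if it
    would block, yield to the user-space scheduler instead of sleeping
    in the kernel."""
    while True:
        try:
            return sock.recv(nbytes)
        except BlockingIOError:
            yield  # the "context switch" to another user-space thread


def reader(sock, log):
    # A user-space thread that wants to "block" until data arrives.
    data = yield from converted_recv(sock, 64)
    log.append(("reader", data))


def writer(sock, log):
    # A second user-space thread that produces the data.
    log.append(("writer", b"ping"))
    sock.send(b"ping")
    yield  # give the other threads a turn


def run(threads):
    """Round-robin user-space scheduler over generator 'threads'."""
    ready = deque(threads)
    while ready:
        thread = ready.popleft()
        try:
            next(thread)          # run until the thread yields
            ready.append(thread)  # still alive: requeue it
        except StopIteration:
            pass                  # thread finished


def demo():
    a, b = socket.socketpair()
    a.setblocking(False)  # the non-blocking equivalent must exist
    b.setblocking(False)
    log = []
    try:
        run([reader(a, log), writer(b, log)])
    finally:
        a.close()
        b.close()
    return log
```

The reader runs first, finds nothing to read, and gives up the CPU rather than blocking; the writer then runs, and the reader completes on its next turn. The whole scheme depends on `setblocking(False)` being available for the descriptor, which is exactly the property that is missing for calls with no non-blocking counterpart.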
Alternative ways of dealing with this problem are an asynchronous call gate (which is the most general possible mechanism, but needs a new system call entry point), and scheduler activations. In other words, getting rid of the "helper thread" does not require the 1:1 model adoption, as the paper implies. However, the paper *also* implies that it is necessary to go to this model to avoid the serialization (though it does not come out and say this explicitly); this implication is incorrect. Editorial Note: I will note for the record that FreeBSD is now considering an alternate system call entry point for future work, to permit the ABI to change over time for the 64 bit conversion, so that the prior ABI can be dropped at some point in the future. The desire to *NOT do this* was what led to the choice of scheduler activations instead of an async call gate mechanism in FreeBSD's KSE implementation. I would *STRONGLY* recommend that, should a new ABI system call entry point come to pass, it include an additional initial parameter for an async call context, which, if NULL, is then treated as a request for the call to be synchronous. In section 5.4, they argue traversal overhead. They argue that the list of threads must be traversed by the "helper thread" when a process exits, and that having the kernel do this "if process exist" [SIC - should be "exits"] is somehow less trouble -- but a traversal is a traversal. Eventually, they conclude that the list must remain for fork(2) handling in any case. FreeBSD could learn from a number of optimizations in this area; I personally do not like the generation count approach, and would want to use the space for a reverse pointer list linkage instead (as one example), but the point about the places where there is overhead is a valid one. The "futex" implementation is a good idea.
I have long been an advocate against the use of recursive locking, because, IMO, it makes programmers lazy, and leads to extra locking which should be avoided. It also has significantly higher overhead, compared to non-reentrant versions. I dislike locking overhead, particularly when it's gratuitous (IMO). FreeBSD could learn a lot here. I like the DTV optimization in the memory allocation arena. I've advocated a similar approach in the VFS, where the vnode and the per VFS data structure would be allocated at the same time, with the vnode ownership belonging to the VFS, rather than being a global system resource that's contended among all VFS's. The freelist on deallocation is an obvious optimization. They imply the existence of a high watermark facility for the freelist being needed, but don't specify whether or not this is actually implemented. IMO, this is mostly an issue for test programs, and for changing load characteristics for a single machine (e.g. an HTTP server that gets hit very hard, starts 100,000 threads, and then doesn't free the memory back to the system, and then is expected to act as a database server, instead, and needs to reassign resources -- particularly memory -- to the new use). In practice, this is probably not an issue, but it's probably a nice benchmark optimization that isn't really otherwise harmful for performance on real loads, so it's something to consider. As for the kernel improvements, FreeBSD could learn optimizations from them. I don't think that this is a critical area for 5.0, but it's probably something that needs to be addressed for 5.X, going forward. The forking optimizations are particularly salient, I think. The signal handling is, I think, self-serving: it's a strawman that justifies the design decisions they've already made.
As I pointed out already, signals are not the preferred IPC mechanism, in any sense, in the POSIX threads environment; I will go further: I claim that an interactive program which must deal with STOP/CONT will in fact almost never have more than 5-9 threads, based on the fact that it's interacting with a human who (as an average) can only keep 5-9 things in short term memory at a time... contexts above and beyond that in an interactive program are simply bad design, and even if we increase that number to 20, it's not a significant source of overhead. Most of the other changes are what I would class as housekeeping tasks. One exception is the use of "futex" wakeup in order to improve thread joins: FreeBSD should look closely at this. All in all, quite the interesting paper. -- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Sep 20 11:20: 3 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3C34137B401 for ; Fri, 20 Sep 2002 11:20:02 -0700 (PDT) Received: from falcon.mail.pas.earthlink.net (falcon.mail.pas.earthlink.net [207.217.120.74]) by mx1.FreeBSD.org (Postfix) with ESMTP id D25FB43E6E for ; Fri, 20 Sep 2002 11:20:01 -0700 (PDT) (envelope-from tlambert2@mindspring.com) Received: from pool0108.cvx21-bradley.dialup.earthlink.net ([209.179.192.108] helo=mindspring.com) by falcon.mail.pas.earthlink.net with esmtp (Exim 3.33 #1) id 17sSNL-0001pB-00; Fri, 20 Sep 2002 11:19:35 -0700 Message-ID: <3D8B6675.C116587F@mindspring.com> Date: Fri, 20 Sep 2002 11:18:29 -0700 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: "Bill Huey (Hui)" Cc: Daniel Eischen , freebsd-arch@FreeBSD.ORG Subject: Re: New Linux threading model References: <20020920031423.GA3380@gnuppy.monkey.org> <20020920082828.GA4207@gnuppy.monkey.org>
Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG "Bill Huey (Hui)" wrote: > They did some measurements, but I'm curious > how something like thread performance (context switching, blocking) > in libc_r measures against their 1:1 model. It should be simple > to write a test program to check it out and see what kind of > result they get. The correct approach is to run a number of instances of the program equal to the number of CPUs + 1, simultaneously, and take the total run time. The issue here is going to be that, on a quiescent system, you do not have contention for the CPU except from threads in the same process, if you run only one threaded test program. The result of running only one test program will therefore not show the effects of TLB shootdown that come from the threads all existing in the same scheduling contention domain with other processes on the system. This would lead to really good microbenchmark results, but really poor real-world performance. The need for "number of CPUs + 1" instances is to prevent the same masking of the effect of negaffinity, and the masking of the effect of contention on a naive affinity algorithm. In theory, a 1:1 implementation should degrade exponentially in contention, and this effect will be further increased by inter-CPU scheduling. The net effect of the test I have suggested should be most evident by comparing the number of inter-CPU and intra-CPU migrations which take place (assuming they have statistics counters for this).
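A rough sketch of the measurement procedure described above, written in Python purely for illustration. The function name `run_instances` and the `./thread_test` path in the usage note are hypothetical; the sketch only covers the "number of CPUs + 1 simultaneous instances, total run time" part and does not collect migration counters, which would need platform-specific statistics.

```python
import os
import subprocess
import sys
import time


def run_instances(argv, instances=None):
    """Start `instances` copies of the command `argv` at the same time
    and return (instances, seconds of wall-clock time until the last
    copy exits). Defaults to number of CPUs + 1 copies."""
    if instances is None:
        instances = (os.cpu_count() or 1) + 1
    start = time.perf_counter()
    # Launch everything first so the copies really do run concurrently.
    procs = [subprocess.Popen(argv) for _ in range(instances)]
    codes = [p.wait() for p in procs]
    elapsed = time.perf_counter() - start
    if any(codes):
        raise RuntimeError(f"instance failed, exit codes: {codes}")
    return instances, elapsed


if __name__ == "__main__":
    # Placeholder workload; substitute the threaded test program.
    count, seconds = run_instances([sys.executable, "-c", "pass"])
    print(f"{count} instances finished in {seconds:.3f}s")
```

In practice one would call something like `run_instances(["./thread_test"])` once per threads library under test and compare the totals, ideally alongside whatever per-CPU migration statistics the kernel exposes.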
-- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Sep 20 12:12:52 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6DA8E37B404 for ; Fri, 20 Sep 2002 12:12:50 -0700 (PDT) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0875943E42 for ; Fri, 20 Sep 2002 12:12:50 -0700 (PDT) (envelope-from baka@elvis.mu.org) Received: by elvis.mu.org (Postfix, from userid 1921) id BE52AAE162; Fri, 20 Sep 2002 12:12:44 -0700 (PDT) Date: Fri, 20 Sep 2002 12:12:44 -0700 From: Jon Mini To: Terry Lambert Cc: Daniel Eischen , Bill Huey , freebsd-arch@FreeBSD.ORG Subject: Re: New Linux threading model Message-ID: <20020920191244.GY24394@elvis.mu.org> References: <3D8B62DB.C27B7E07@mindspring.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3D8B62DB.C27B7E07@mindspring.com> User-Agent: Mutt/1.4i Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Terry Lambert [tlambert2@mindspring.com] wrote : > I am not personally familiar with the FreeBSD scontext(2) call > that has been discussed as an implementation detail with regard > to some of the recent postings by Daniel Eischen, in which he > implied Jonathan Mini will be implementing the call. It's not scontext(2), but setcontext(2) -- of Solaris fame. Currently, we have {get,set,swap}context(3), but being in userland causes some interesting race conditions. Really, these functions need to be atomic from the process's perspective, and since they have to call sigprocmask(2) anyways, the best solution is to just move them into the kernel. This is how Solaris does it, among others.
> Perhaps I > should have been watching this more closely, so that I could > comment more intelligently on what it's supposed to do for the > FreeBSD implementation which can not be done without: the Linux > people may have a valid argument here which FreeBSD should heed. > However... such a call is necessary to deal with register windows > on SPARC processors, in any case, and the implementation is moot > by way of the fact that trading a user context switch plus a > kernel call for a kernel context switch hardly reduces overhead. Under KSE, we needn't consult the kernel for thread context swaps, because we can enter a critical section and avoid the race conditions endemic with setcontext(2). Also, we don't modify the process signal mask when we swap thread contexts, so we don't need to call sigprocmask(2). > I think the "the Linux scheduler is O(1), so we don't have to > care" argument is invalid; O(1) means that for N entries, there > need to be N elements traversed 1 time. What this really means > is that, as the number of schedulable entitites goes up, the time > required goes up linearly, as opposed to going up exponentially; > or, better, to *not* going up in the first place. Terry? You must have misspoken here. O(N) is linear, O(1) is constant. > One exception is the use > "futex" wakeup in order to improve thread joins: FreeBSD should > look closely at this. "Futexes" are not new. We had this at Be, but we called them Bennaphores. 
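For readers who have not met the construct, the benaphore idea mentioned above (a kernel semaphore wrapped with an atomic counter, so that uncontended acquires never enter the kernel) can be sketched as follows. This is an illustration in Python, not BeOS, Linux or FreeBSD code. The class name `Benaphore` and the helper `hammer` are hypothetical. Python's standard library has no atomic integer, so the atomic add is emulated with a small lock, which defeats the performance benefit and only preserves the control flow; `threading.Semaphore` stands in for the kernel semaphore.

```python
import threading


class Benaphore:
    """Mutual-exclusion lock: an 'atomic' counter in front of a
    semaphore that is touched only when there is contention."""

    def __init__(self):
        self._count = 0
        self._guard = threading.Lock()      # emulates an atomic add
        self._sem = threading.Semaphore(0)  # stands in for the kernel sem
        self.slow_paths = 0                 # times acquire() had to wait

    def _fetch_add(self, delta):
        # Emulated atomic fetch-and-add: returns the previous value.
        with self._guard:
            old = self._count
            self._count += delta
            return old

    def acquire(self):
        if self._fetch_add(1) > 0:
            # Someone already holds the lock: fall back to the semaphore.
            with self._guard:
                self.slow_paths += 1
            self._sem.acquire()

    def release(self):
        if self._fetch_add(-1) > 1:
            # At least one waiter is queued (or about to be): wake one.
            self._sem.release()


def hammer(lock, threads=4, iterations=1000):
    """Increment a shared counter from several threads under `lock`
    and return the final value."""
    shared = {"n": 0}

    def worker():
        for _ in range(iterations):
            lock.acquire()
            shared["n"] += 1
            lock.release()

    workers = [threading.Thread(target=worker) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return shared["n"]
```

A single thread calling `acquire()` then `release()` never touches the semaphore, which is the uncontended fast path; only when the previous counter value shows another holder does the caller go to sleep on it. In a C implementation the counter would be a hardware atomic and the semaphore a real kernel object.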
-- Jonathan Mini http://www.freebsd.org/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Sep 20 12:40:15 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 25A7437B401 for ; Fri, 20 Sep 2002 12:40:13 -0700 (PDT) Received: from rwcrmhc53.attbi.com (rwcrmhc53.attbi.com [204.127.198.39]) by mx1.FreeBSD.org (Postfix) with ESMTP id C6AE443E42 for ; Fri, 20 Sep 2002 12:40:12 -0700 (PDT) (envelope-from julian@elischer.org) Received: from InterJet.elischer.org ([12.232.206.8]) by rwcrmhc53.attbi.com (InterMail vM.4.01.03.27 201-229-121-127-20010626) with ESMTP id <20020920194012.OODP8126.rwcrmhc53.attbi.com@InterJet.elischer.org>; Fri, 20 Sep 2002 19:40:12 +0000 Received: from localhost (localhost.elischer.org [127.0.0.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id MAA20258; Fri, 20 Sep 2002 12:30:53 -0700 (PDT) Date: Fri, 20 Sep 2002 12:30:52 -0700 (PDT) From: Julian Elischer To: Rik van Riel Cc: Bill Huey , freebsd-arch@freebsd.org Subject: Re: New Linux threading model In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop:
FreeBSD.ORG On Fri, 20 Sep 2002, Rik van Riel wrote: > On Fri, 20 Sep 2002, Bill Huey wrote: > > On Fri, Sep 20, 2002 at 12:08:38AM -0700, Julian Elischer wrote: > What's maybe more important about the O(1) scheduler is that it > doesn't try to recalculate the priority of all processes once > in a while, like the FreeBSD scheduler and the old Linux scheduler. > > There don't seem to be any O(n) loops left in or near this scheduler, > meaning that 1:1 threading with lots of threads becomes possible. The FreeBSD scheduler is moving towards a big rewrite but we want to change "one thing at a time" :-) in that area.. > > > > What they have decided to do is not a stupid move. But I disagree with > > > > It's a good political move on their part because of the orientation of > > their kernel community. Their kernel context switching time is very > > fast, 2x faster than NetBSD from what I saw, so it's probably a workable > > solution for them with something like their "futex" performance being > > the only funny question left unanswered. > > Futexes are very nice. In the uncontended case (should be the > normal case, if your semaphores are always contended you've got > worse problems) there is NO kernel overhead involved in grabbing > the lock ... you just do the same atomic operations involved with > grabbing a spinlock. We will probably have to implement futexes, at least we will have to implement the kernel side of it, so that Linux emulation can continue to work. Having done that, we'll probably make a similar userland side available for the threading libraries to use too. > > Only in the contended case will a futex fall back to sleeping in > kernel space. > > This kind of very low overhead locking might be useful for FreeBSD > too, if it isn't yet integrated into the KSE model. > > As for which threading model to use ...
I wouldn't worry about that > too much, I suspect either the Linux 1:1 model or the M:N model used > by KSE will work just fine for pretty much all applications. That is true and it will be a very interesting experiment to see how the corner-cases work out... > > cheers, > > Rik > -- > Bravely reimplemented by the knights who say "NIH". > > http://www.surriel.com/ http://distro.conectiva.com/ > > Spamtraps of the month: september@surriel.com trac@trac.org > > To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Sep 20 12:52:22 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6479837B401 for ; Fri, 20 Sep 2002 12:52:20 -0700 (PDT) Received: from 2-225.ctame701-1.telepar.net.br (2-225.ctame701-1.telepar.net.br [200.193.160.225]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3BEF343E42 for ; Fri, 20 Sep 2002 12:52:18 -0700 (PDT) (envelope-from riel@conectiva.com.br) Received: from localhost ([IPv6:::ffff:127.0.0.1]:44233 "EHLO localhost") by imladris.surriel.com with ESMTP id ; Fri, 20 Sep 2002 16:51:57 -0300 Date: Fri, 20 Sep 2002 16:51:56 -0300 (BRT) From: Rik van Riel X-X-Sender: riel@imladris.surriel.com To: Gary Thorpe Cc: freebsd-arch@freebsd.org Subject: Re: New Linux threading model In-Reply-To: <20020920171317.77976.qmail@web11207.mail.yahoo.com> Message-ID: X-spambait: aardvark@kernelnewbies.org X-spammeplease: aardvark@nl.linux.org MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Fri, 20 Sep 2002, Gary Thorpe wrote: > So what happens if a process has been sitting in the > queue waiting for a very long time: depending on the > scheduling algorithm, it may need to have its priority >
increased the longer it waits but this will not > happen...until it is scheduled? Can this lead to > starvation? Nope, the O(1) scheduler has some tricks to avoid starvation. The main thing here is that it uses 2 priority arrays, one for tasks that have time in their current time slice left (current) and one for tasks that have run out of time slice (expired). As long as the tasks on the expired queue aren't being starved yet, interactive tasks hop on and off the current priority array and the expired tasks just sit there ... ... but once the oldest task on the expired array has been sitting there for some time (default is 3 seconds, I think) the kernel will add newly woken up tasks to the expired array. It will keep running the tasks from the current array until the time slices are all gone and then it switches arrays, running tasks from the expired array. No starvation. > > There don't seem to be any O(n) loops left in or > > near this scheduler, > > meaning that 1:1 threading with lots of threads > > becomes possible. > > The maximum parallelism on a given SMP system can > never be more than the number of CPUs so wouldn't a 1 > : 1 model lead to unnecessary overhead? Your threads never block on IO ? Or page faults, for that matter ? regards, Rik -- Bravely reimplemented by the knights who say "NIH".
http://www.surriel.com/ http://distro.conectiva.com/ Spamtraps of the month: september@surriel.com trac@trac.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Sep 20 12:56:10 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 29E6D37B401 for ; Fri, 20 Sep 2002 12:56:09 -0700 (PDT) Received: from 2-225.ctame701-1.telepar.net.br (2-225.ctame701-1.telepar.net.br [200.193.160.225]) by mx1.FreeBSD.org (Postfix) with ESMTP id B61E843E65 for ; Fri, 20 Sep 2002 12:56:04 -0700 (PDT) (envelope-from riel@conectiva.com.br) Received: from localhost ([IPv6:::ffff:127.0.0.1]:33994 "EHLO localhost") by imladris.surriel.com with ESMTP id ; Fri, 20 Sep 2002 16:55:53 -0300 Date: Fri, 20 Sep 2002 16:55:52 -0300 (BRT) From: Rik van Riel X-X-Sender: riel@imladris.surriel.com To: Julian Elischer Cc: Bill Huey , Subject: Re: New Linux threading model In-Reply-To: Message-ID: X-spambait: aardvark@kernelnewbies.org X-spammeplease: aardvark@nl.linux.org MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Fri, 20 Sep 2002, Julian Elischer wrote: > On Fri, 20 Sep 2002, Rik van Riel wrote: > > There don't seem to be any O(n) loops left in or near this scheduler, > > meaning that 1:1 threading with lots of threads becomes possible. > > The FreeBSD scheduler is moving towards a big rewrite but we want to > change "one thing at a time" :-) in that area.. This is doable in a smallish number of steps, which don't even need to be done in this order: 1) per-cpu runqueues instead of a global one, which wants ... 2) ... 
load balancer between these per-cpu queues 3) two runqueue arrays (current and expired) instead of just one, which enables ... 4) ... event-driven priority recalculation, instead of recalculating the priority of each task separately These changes are probably small enough that they can be done without the risk of destabilising anything. Rik -- Bravely reimplemented by the knights who say "NIH". http://www.surriel.com/ http://distro.conectiva.com/ Spamtraps of the month: september@surriel.com trac@trac.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Sep 20 13: 0:18 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2188E37B404 for ; Fri, 20 Sep 2002 13:00:16 -0700 (PDT) Received: from sccrmhc03.attbi.com (sccrmhc03.attbi.com [204.127.202.63]) by mx1.FreeBSD.org (Postfix) with ESMTP id 5781143E65 for ; Fri, 20 Sep 2002 13:00:15 -0700 (PDT) (envelope-from julian@elischer.org) Received: from InterJet.elischer.org ([12.232.206.8]) by sccrmhc03.attbi.com (InterMail vM.4.01.03.27 201-229-121-127-20010626) with ESMTP id <20020920200014.MQHV28420.sccrmhc03.attbi.com@InterJet.elischer.org>; Fri, 20 Sep 2002 20:00:14 +0000 Received: from localhost (localhost.elischer.org [127.0.0.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id MAA20419; Fri, 20 Sep 2002 12:59:57 -0700 (PDT) Date: Fri, 20 Sep 2002 12:59:57 -0700 (PDT) From: Julian Elischer To: Rik van Riel Cc: Gary Thorpe , freebsd-arch@freebsd.org Subject: Re: New Linux threading model In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Fri, 20 Sep 2002, Rik van Riel wrote: > On Fri, 20 Sep 2002, Gary Thorpe
wrote: > > > So what happens if a process has been sitting in the > > queue waiting for a very long time: depending on the > > scheduling algorithm, it may need to have its priority > > increased the longer it waits but this will not > > happen...until it is scheduled? Can this lead to > > starvation? > > Nope, the O(1) scheduler has some tricks to avoid > starvation. The main thing here is that it uses > 2 priority arrays, one for tasks that have time in > their current time slice left (current) and one > for tasks that have run out of time slice (expired). > > As long as the tasks on the expired queue aren't > being starved yet, interactive tasks hop on and off > the current priority array and the expired tasks > just sit there ... > > ... but once the oldest task on the expired array > has been sitting there for some time (default is 3 > seconds, I think) the kernel will add newly woken > up tasks to the expired array. > > It will keep running the tasks from the current > array until the time slices are all gone and then > it switches arrays, running tasks from the expired > array. > > No starvation. > > > > There don't seem to be any O(n) loops left in or > > > near this scheduler, > > > meaning that 1:1 threading with lots of threads > > > becomes possible. > > > > The maximum parallelism on a given SMP system can > > never be more than the number of CPUs so wouldn't a 1 > > : 1 model lead to unnecessary overhead? > > Your threads never block on IO ? Or page faults, for that matter ? Ah yes, but in KSE, just the thread blocks and an upcall gives the CPU back to the userland scheduler to run another thread, should there be one ready. > > regards, > > Rik > -- > Bravely reimplemented by the knights who say "NIH".
> > http://www.surriel.com/ http://distro.conectiva.com/ > > Spamtraps of the month: september@surriel.com trac@trac.org From owner-freebsd-arch Fri Sep 20 13: 5:39 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 70A9137B401 for ; Fri, 20 Sep 2002 13:05:38 -0700 (PDT) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.FreeBSD.org (Postfix) with ESMTP id 314A343E65 for ; Fri, 20 Sep 2002 13:05:38 -0700 (PDT) (envelope-from baka@elvis.mu.org) Received: by elvis.mu.org (Postfix, from userid 1921) id 048ECAE03F; Fri, 20 Sep 2002 13:05:38 -0700 (PDT) Date: Fri, 20 Sep 2002 13:05:37 -0700 From: Jon Mini To: Julian Elischer Cc: Terry Lambert , Daniel Eischen , Bill Huey , freebsd-arch@FreeBSD.ORG Subject: Re: New Linux threading model Message-ID: <20020920200537.GZ24394@elvis.mu.org> References: <20020920191244.GY24394@elvis.mu.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4i Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Julian Elischer [julian@elischer.org] wrote : > > > One exception is the use of > > > "futex" wakeup in order to improve thread joins: FreeBSD should > > > look closely at this. > > > > "Futexes" are not new. We had this at Be, but we called them Bennaphores. > > > after "Ben"? Benoit Schillings. He's French, so I'm probably spelling his last name wrong. He came up with the idea, so we named it after him.
It addressed the very important issue that all semaphores in BeOS were kernel semaphores (BeOS used 1:1 threading) and 99% of the time (or worse) obtaining a semaphore didn't contend. So we wrapped the kernel sems with an atomic int. One nice thing about KSE is that all locking operations can be done in userland, and it saves a lot of this mess. -- Jonathan Mini http://www.freebsd.org/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Sep 20 13:16:11 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4E73037B401 for ; Fri, 20 Sep 2002 13:16:10 -0700 (PDT) Received: from 2-225.ctame701-1.telepar.net.br (2-225.ctame701-1.telepar.net.br [200.193.160.225]) by mx1.FreeBSD.org (Postfix) with ESMTP id EA00043E6A for ; Fri, 20 Sep 2002 13:16:06 -0700 (PDT) (envelope-from riel@conectiva.com.br) Received: from localhost ([IPv6:::ffff:127.0.0.1]:39119 "EHLO localhost") by imladris.surriel.com with ESMTP id ; Fri, 20 Sep 2002 17:15:50 -0300 Date: Fri, 20 Sep 2002 17:15:48 -0300 (BRT) From: Rik van Riel X-X-Sender: riel@imladris.surriel.com To: Terry Lambert Cc: Daniel Eischen , Bill Huey , Subject: Re: New Linux threading model In-Reply-To: <3D8B62DB.C27B7E07@mindspring.com> Message-ID: X-spambait: aardvark@kernelnewbies.org X-spammeplease: aardvark@nl.linux.org MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Fri, 20 Sep 2002, Terry Lambert wrote: > I think the "the Linux scheduler is O(1), so we don't have to > care" argument is invalid; O(1) means that for N entries, there > need to be N elements traversed 1 time. 
No, it means that for N schedulable entities, the system still only needs to traverse 1, Ingo's scheduler really is O(1). It's a pretty neat scheduler and it's probably possible to port this thing to FreeBSD in a few steps of manageable size. regards, Rik -- Bravely reimplemented by the knights who say "NIH". http://www.surriel.com/ http://distro.conectiva.com/ Spamtraps of the month: september@surriel.com trac@trac.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Sep 20 13:20:12 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 36BAD37B41E for ; Fri, 20 Sep 2002 13:20:10 -0700 (PDT) Received: from sccrmhc02.attbi.com (sccrmhc02.attbi.com [204.127.202.62]) by mx1.FreeBSD.org (Postfix) with ESMTP id A961E43E65 for ; Fri, 20 Sep 2002 13:20:09 -0700 (PDT) (envelope-from julian@elischer.org) Received: from InterJet.elischer.org ([12.232.206.8]) by sccrmhc02.attbi.com (InterMail vM.4.01.03.27 201-229-121-127-20010626) with ESMTP id <20020920202008.MJEU14454.sccrmhc02.attbi.com@InterJet.elischer.org>; Fri, 20 Sep 2002 20:20:08 +0000 Received: from localhost (localhost.elischer.org [127.0.0.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id NAA20498; Fri, 20 Sep 2002 13:08:26 -0700 (PDT) Date: Fri, 20 Sep 2002 13:08:25 -0700 (PDT) From: Julian Elischer To: Rik van Riel Cc: Bill Huey , freebsd-arch@freebsd.org Subject: Re: New Linux threading model In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Fri, 20 Sep 2002, Rik van Riel wrote: > On Fri, 20 Sep 2002, Julian Elischer wrote: > > On Fri, 20 Sep 2002, Rik van Riel wrote: > > > There don't seem to be any 
O(n) loops left in or near this scheduler, > > > meaning that 1:1 threading with lots of threads becomes possible. > > > > The FreeBSD scheduler is moving towards a big rewrite but we want to > > change "one thing at a time" :-) in that area.. > > This is doable in a smallish number of steps, which don't > even need to be done in this order: > > 1) per-cpu runqueues instead of a global one, which wants ... > > 2) ... load balancer between these per-cpu queues > > 3) two runqueue arrays (current and expired) instead of > just one, which enables ... > > 4) ... event-driven priority recalculation, instead of > recalculating the priority of each task separately > I didn't say it's not possible, it's just that the interaction between threads and KSEs on the run queue is very complicated in the current "interim" scheduler (compatible with the old process scheduler but with a huge "tumor" on the side of it to do something with threads) and there are not enough people who know the issues (I know of 3) for us to spare the time to do it now, as we are all busy on other things. The redesign can wait until we actually have some threads to schedule :-) > These changes are probably small enough that they can be done > without the risk of destabilising anything. That would be a new process scheduler.. we need a new THREAD scheduler.. i.e. you need to schedule threads in the kernel, while not allowing a process with a lot of threads to flood the system. This is non-trivial, but we have the tools needed to do it. We just haven't done so yet.. If anyone is looking for a grad-student project.. contact me on this one. I can give details :-) > Rik > -- > Bravely reimplemented by the knights who say "NIH".
> > http://www.surriel.com/ http://distro.conectiva.com/ > > Spamtraps of the month: september@surriel.com trac@trac.org > > To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Sep 20 13:53: 5 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id AAB8337B401; Fri, 20 Sep 2002 13:53:02 -0700 (PDT) Received: from snipe.mail.pas.earthlink.net (snipe.mail.pas.earthlink.net [207.217.120.62]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3D70543E3B; Fri, 20 Sep 2002 13:53:02 -0700 (PDT) (envelope-from tlambert2@mindspring.com) Received: from pool0248.cvx40-bradley.dialup.earthlink.net ([216.244.42.248] helo=mindspring.com) by snipe.mail.pas.earthlink.net with esmtp (Exim 3.33 #1) id 17sUln-0004J7-00; Fri, 20 Sep 2002 13:53:00 -0700 Message-ID: <3D8B8A63.9B3DE20B@mindspring.com> Date: Fri, 20 Sep 2002 13:51:47 -0700 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Jon Mini Cc: Daniel Eischen , Bill Huey , freebsd-arch@FreeBSD.ORG Subject: Re: New Linux threading model References: <3D8B62DB.C27B7E07@mindspring.com> <20020920191244.GY24394@elvis.mu.org> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Jon Mini wrote: > It's not scontext(2), but setcontext(2) -- of Solaris fame. Currnetly, > we have {get,set,swap}context(3), but being in userland causes some > interesting race conditions. Really, these functions needs to be > atomic from the process's perspective, and since they have to call > sigprocmask(2) anyways, the best solution is to just move them into > the kernel. This is how Solaris does it, among others. 
Well, it's *a* solution, anyway. ;^). The reason Solaris does it, though, is because it can't know about the existing register frames when it comes to a push, and so it has to put an explicit, rather than implicit, stall barrier in there to make sure. Otherwise, you would need to unwind the context switches in the reverse order they were originally made. The Keppel paper has details: http://citeseer.nj.nec.com/keppel91register.html Register Windows and User-Space Threads on the SPARC David Keppel Department of Computer Science and Engineering University of Washington > Under KSE, we needn't consult the kernel for thread context swaps, > because we can enter a critical section and avoid the race conditions > endemic with setcontext(2). Also, we don't modify the process signal > mask when we swap thread contexts, so we don't need to call > sigprocmask(2). Which kind of begs the question of why it needs to be there, or be called, which is what I was saying. I think there are legitimate reasons for having it, so it's not avoidable, like the Linux paper implies, but I don't agree with the reasons the Linux paper gives for why it's needed for N:M threads. Your argument here invalidates a couple of theirs, from the paper, but not all of them. > > I think the "the Linux scheduler is O(1), so we don't have to > > care" argument is invalid; O(1) means that for N entries, there > > need to be N elements traversed 1 time. What this really means > > is that, as the number of schedulable entities goes up, the time > > required goes up linearly, as opposed to going up exponentially; > > or, better, to *not* going up in the first place. > > Terry? You must have misspoken here. O(N) is linear, O(1) is constant. See Rik's posting; my N in this case is not the N in N:M, it's what Rik's calling 'n'. I've upcased it to make it visually distinct in my text; sorry if that confused things. Over the set of all processes, it *is* a linear algorithm.
Scheduling the next thing to run is not as interesting as scheduling the thing you are descheduling now so that it's run *again*. The distance that needs consideration is the distance between the times that it's scheduled. If you think about this in the context of my microbenchmarking comments, this should be more clear. > > One exception is the use of > > "futex" wakeup in order to improve thread joins: FreeBSD should > > look closely at this. > > "Futexes" are not new. We had this at Be, but we called them Bennaphores. I didn't mean looking closely at it as a new technology, I meant looking closely at it because the current FreeBSD recursion-able mutex implementation is really too heavyweight for the problem at hand. The "futex" (or "bennaphore" or whatever) implementation differs in that it has significantly lower overhead, with the cost being that you can't just regrab a lock, and expect it to be magically counted up and down. If you've ever programmed timer code in the Windows 95/98/NT/XP/2000 kernels, the timers basically run on whatever kernel thread is available to run on, rather than a specific thread (kernel threads only provide context). This basically means that you have to build non-reentrant semaphores on top of the kernel services that are already there, or you can grab a semaphore in a normal operation, have a timer fire, and, even though it's technically a separate context in theory, in practice you end up being allowed to grab a semaphore that is already grabbed by the kernel context that the timer is "borrowing" to run itself. Matt Day ran into this with the soft updates syncer in our port of the Heidemann stacking VFS code to Windows 95 (different soft updates implementation than Kirk's code; it predates Kirk's work by a couple of years). The upshot is that things you think are protected aren't really protected, under certain conditions that, while uncommon, are still possible.
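A minimal sketch of the "futex"/"bennaphore" idea discussed above, assuming C11 atomics and a POSIX semaphore standing in for the kernel semaphore. All names here (struct benaphore, benaphore_acquire, the kernel_calls counter) are made up for illustration; this is neither the BeOS nor the Linux implementation. It is only meant to show that the uncontended path is a single atomic operation in userland, and that the lock is not recursive: a holder that acquires it a second time would block on itself.

```c
#include <assert.h>
#include <errno.h>
#include <semaphore.h>
#include <stdatomic.h>

/* Hypothetical type: an atomic counter in front of a kernel semaphore. */
struct benaphore {
    atomic_int count;        /* threads currently holding or waiting for the lock */
    atomic_int kernel_calls; /* instrumentation only: slow-path entries */
    sem_t      sem;          /* stand-in for the kernel semaphore, starts at 0 */
};

int benaphore_init(struct benaphore *b)
{
    atomic_init(&b->count, 0);
    atomic_init(&b->kernel_calls, 0);
    return sem_init(&b->sem, 0, 0);   /* second 0: not shared between processes */
}

void benaphore_acquire(struct benaphore *b)
{
    /* fetch_add returns the previous value; 0 means the lock was free. */
    if (atomic_fetch_add(&b->count, 1) > 0) {
        /* Contended: only now do we go to the kernel and sleep. */
        atomic_fetch_add(&b->kernel_calls, 1);
        while (sem_wait(&b->sem) != 0 && errno == EINTR)
            ;   /* retry if a signal interrupted the wait */
    }
}

void benaphore_release(struct benaphore *b)
{
    int prev = atomic_fetch_sub(&b->count, 1);
    assert(prev > 0);   /* releasing an unheld lock is a caller bug */
    if (prev > 1) {
        /* Somebody is waiting: wake exactly one of them. */
        atomic_fetch_add(&b->kernel_calls, 1);
        sem_post(&b->sem);
    }
}
```

In the uncontended case an acquire/release pair leaves kernel_calls at 0; the semaphore is only touched when a second thread arrives while the lock is held.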
My personal preference is for the tradeoff that Linux made here, where they ate the code refactoring overhead implied by failure to permit recursive acquisition. -- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Sep 20 13:55: 3 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 19FB837B401 for ; Fri, 20 Sep 2002 13:55:02 -0700 (PDT) Received: from 2-225.ctame701-1.telepar.net.br (2-225.ctame701-1.telepar.net.br [200.193.160.225]) by mx1.FreeBSD.org (Postfix) with ESMTP id C7E6143E6E for ; Fri, 20 Sep 2002 13:55:00 -0700 (PDT) (envelope-from riel@conectiva.com.br) Received: from localhost ([IPv6:::ffff:127.0.0.1]:27866 "EHLO localhost") by imladris.surriel.com with ESMTP id ; Fri, 20 Sep 2002 17:54:45 -0300 Date: Fri, 20 Sep 2002 17:54:43 -0300 (BRT) From: Rik van Riel X-X-Sender: riel@imladris.surriel.com To: Julian Elischer Cc: Bill Huey , Subject: Re: New Linux threading model In-Reply-To: Message-ID: X-spambait: aardvark@kernelnewbies.org X-spammeplease: aardvark@nl.linux.org MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Fri, 20 Sep 2002, Julian Elischer wrote: > I didn't sat it's not possible, it's just that the interaction between > threads and KSEs on the run queue is very complicated in the current > "interim" scheduler (compatible with the old process scheduler but with > a huge "tumor" on the side of it to do something with threads) > ie. You need to schedule threads in the kernel, while not allowing > a process with a lot of threads to flood the system. Interesting problem, but that might be better done in a more generic way. Ie. 
first build a thread scheduler and then add support for generic resource containers that aren't tied to the thread<->process relation. Once you have that, you could substitute thread<->user for the default relationship and prevent users with many threads from flooding the CPU ;) Adding resource containers to a scheduler can be hard though, I still haven't found a pretty way of adding per-container (in my case I want to start with per-user) CPU time accounting to Ingo's O(1) scheduler. Sure, I've got several ugly ideas and one less ugly idea, but I haven't found anything nice yet... regards, Rik -- Bravely reimplemented by the knights who say "NIH". http://www.surriel.com/ http://distro.conectiva.com/ Spamtraps of the month: september@surriel.com trac@trac.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Sep 20 14: 9:44 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 69BB837B401 for ; Fri, 20 Sep 2002 14:09:38 -0700 (PDT) Received: from scaup.mail.pas.earthlink.net (scaup.mail.pas.earthlink.net [207.217.120.49]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1601143E3B for ; Fri, 20 Sep 2002 14:09:38 -0700 (PDT) (envelope-from tlambert2@mindspring.com) Received: from pool0248.cvx40-bradley.dialup.earthlink.net ([216.244.42.248] helo=mindspring.com) by scaup.mail.pas.earthlink.net with esmtp (Exim 3.33 #1) id 17sV1b-00064F-00; Fri, 20 Sep 2002 14:09:20 -0700 Message-ID: <3D8B8E35.EDAF4450@mindspring.com> Date: Fri, 20 Sep 2002 14:08:05 -0700 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Rik van Riel Cc: Julian Elischer , Bill Huey , freebsd-arch@freebsd.org Subject: Re: New Linux threading model References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG 
Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Rik van Riel wrote: > On Fri, 20 Sep 2002, Julian Elischer wrote: > > On Fri, 20 Sep 2002, Rik van Riel wrote: > > > There don't seem to be any O(n) loops left in or near this scheduler, > > > meaning that 1:1 threading with lots of threads becomes possible. > > > > The FreeBSD scheduler is moving towards a big rewrite but we want to > > change "one thing at a time" :-) in that area.. > > This is doable in a smallish number of steps, which don't > even need to be done in this order: > > 1) per-cpu runqueues instead of a global one, which wants ... Yes. > 2) ... load balancer between these per-cpu queues No. This is the problem with the Linux model. With an active load balancing algorithm, you end up with global contention for scheduler queue locks. In reality, you can get away from having *any* scheduler queue locks *at all*, for the normal case, and then only contend at the inter-CPU level for the CPUs using a push migration model. This basically depends on being able to do a "lockless list is empty test", which you can do with a simple pointer compare. The design is that the per CPU scheduler checks the push list for processes which it needs to account for in its scheduler, but which are not in the per CPU scheduler queue yet. If the pointer is non-NULL, then it locks access to the pointer, and drains the list into the local scheduler queue. For the push case, the process(es) to be migrated off the current CPU are selected, and then, based on a figure of merit which is only ever written by the CPU to which it applies, and which is atomically readable by all other CPUs, such that no locking is required, a "target" CPU is identified.
The pointer lock for that CPU is acquired (notice that this is not in the global lock contention domain; it only involves the two CPUs in question), and the preassembled list is added to the list of the target CPU. In the normal course of events, this changes a value from NULL to a list of processes to move into the target scheduler queue; worst case, the CPU pushing has to assign the contents of the protected pointer to the next-entry pointer in the last entry of the pushed list, and then set the protected pointer to the head of the pushed list. In practice, the number of processes being migrated will never be more than 1, unless you write a replacement scheduler which is migration happy. So the migrate away is one-behind the head of the scheduling queue, and the migrate-to is one ahead. On migration, an artificial inflation of the figure of merit can be handled -- or the value of the pointer can be examined, and assumed to add a constant weighting to the target CPU's figure of merit, if the pointer value is non-NULL. Thus, it is only ever required to lock when you are actively doing process migration, and process migration is rare (as it should be, particularly if one of your target architectures is NUMA, but in the general case on small CPU count shared memory multiprocessors, as well). This also permits preference weighting based on locality, for affinity on hyper-threaded CPUs, and negaffinity in CPU sets under the same circumstances. > 3) two runqueue arrays (current and expired) instead of > just one, which enables ... Not required. > 4) ... event-driven priority recalculation, instead of > recalculating the priority of each task separately This actually doesn't work. The worst case failure is under overload, which is exactly where you don't want it to be. The scheduling for the BSD scheduler, as was pointed out, takes time not run into the priority weighting.
A granularity of 3 seconds until the distinction between the two queues for enqueueing delayed jobs is realized is really gross. 8-(. > These changes are probably small enough that they can be done > without the risk of destabilising anything. That's certainly true, regardless of the implementation. -- Terry From owner-freebsd-arch Fri Sep 20 14:20:16 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3C3D037B401 for ; Fri, 20 Sep 2002 14:20:14 -0700 (PDT) Received: from sccrmhc03.attbi.com (sccrmhc03.attbi.com [204.127.202.63]) by mx1.FreeBSD.org (Postfix) with ESMTP id AEC3D43E42 for ; Fri, 20 Sep 2002 14:20:13 -0700 (PDT) (envelope-from julian@elischer.org) Received: from InterJet.elischer.org ([12.232.206.8]) by sccrmhc03.attbi.com (InterMail vM.4.01.03.27 201-229-121-127-20010626) with ESMTP id <20020920212012.PCLG28420.sccrmhc03.attbi.com@InterJet.elischer.org>; Fri, 20 Sep 2002 21:20:12 +0000 Received: from localhost (localhost.elischer.org [127.0.0.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id OAA20892; Fri, 20 Sep 2002 14:12:17 -0700 (PDT) Date: Fri, 20 Sep 2002 14:12:16 -0700 (PDT) From: Julian Elischer To: Rik van Riel Cc: Bill Huey , freebsd-arch@freebsd.org Subject: Re: New Linux threading model In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Fri, 20 Sep 2002, Rik van Riel wrote: > On Fri, 20 Sep 2002, Julian Elischer wrote: > > > I didn't say it's not possible, it's just that the interaction between > > threads and KSEs on the run queue is very complicated in the current > > "interim" scheduler (compatible with the
old process scheduler but with > > a huge "tumor" on the side of it to do something with threads) > > > ie. You need to schedule threads in the kernel, while not allowing > > a process with a lot of threads to flood the system. > > Interesting problem, but that might be better done in a more > generic way. Ie. first build a thread scheduler and then add > support for generic resource containers that aren't tied to > the thread<->process relation. > > Once you have that, you could substitute thread<->user for the > default relationship and prevent users with many threads from > flooding the CPU ;) > > Adding resource containers to a scheduler can be hard though, I > still haven't found a pretty way of adding per-container (in my > case I want to start with per-user) CPU time accounting to Ingo's > O(1) scheduler. Sure, I've got several ugly ideas and one less > ugly idea, but I haven't found anything nice yet... Yes this is one of the possible solutions.. FreeBSD has not tackled the problem yet. All that has happened is that I have refactored the proc struct in as flexible a way as I can (it's now 4 pieces) that can be reassembled in different combinations to produce every variant I have seen in the literature. Glob them ALL together to get UNIX. Treat proc and ksegrp as one, and kse and thread as one, and get MACH threads. Put proc and ksegrp together, and treat thread and kse as separate, and you get scheduler activations. Put the thread and KSE and KSEGRP together vs the proc, and you have Solaris LWP. What we now need is a "scheduler weenie" to start using these and playing with them to produce some good algorithms. What we have in place now is a 'simplest compatible scheduler' that handles UNIX processes as before, but can schedule threads independently. It's by NO MEANS optimal. > > regards, > > Rik > -- > Bravely reimplemented by the knights who say "NIH".
> > http://www.surriel.com/ http://distro.conectiva.com/ > > Spamtraps of the month: september@surriel.com trac@trac.org > > To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Sep 20 14:36:50 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id A925837B401 for ; Fri, 20 Sep 2002 14:36:48 -0700 (PDT) Received: from gnuppy.monkey.org (wsip68-15-8-100.sd.sd.cox.net [68.15.8.100]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3CED443E42 for ; Fri, 20 Sep 2002 14:36:48 -0700 (PDT) (envelope-from billh@gnuppy.monkey.org) Received: from billh by gnuppy.monkey.org with local (Exim 3.36 #1 (Debian)) id 17sVRt-0000WP-00; Fri, 20 Sep 2002 14:36:29 -0700 Date: Fri, 20 Sep 2002 14:36:29 -0700 To: Terry Lambert Cc: Rik van Riel , Julian Elischer , freebsd-arch@freebsd.org Subject: Re: New Linux threading model Message-ID: <20020920213629.GA1527@gnuppy.monkey.org> References: <3D8B8E35.EDAF4450@mindspring.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3D8B8E35.EDAF4450@mindspring.com> User-Agent: Mutt/1.4i From: Bill Huey (Hui) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Fri, Sep 20, 2002 at 02:08:05PM -0700, Terry Lambert wrote: > > 1) per-cpu runqueues instead of a global one, which wants ... > > Yes. > > > 2) ... load balancer between these per-cpu queues > > No. > > This is the problem with the Linux model. With an active load > balancing algorithm, you end up with global contention for > scheduler queue locks. That's not true, they're all per CPU run queue locks. 
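The push-migration model from Terry's message (quoted in this reply) can be sketched roughly as follows, assuming C11 atomics. The struct layout and the names cpu_init, push_tasks and drain_inbox are hypothetical, the spinlock is only a stand-in for whatever lock would guard the per-CPU pointer, and the figure-of-merit selection of the target CPU is left out. The sketch shows the two properties claimed: the owning CPU tests for pending work with one pointer compare and no lock, and a lock is taken only while a migration is actually in progress.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct task {
    int          id;
    struct task *next;
};

/* Hypothetical per-CPU scheduler state. */
struct cpu {
    struct task          *runq;        /* private queue, touched only by the owner */
    _Atomic(struct task *) inbox;      /* tasks pushed here by other CPUs */
    atomic_flag           inbox_busy;  /* spinlock guarding inbox */
    int                   lock_acquisitions; /* instrumentation only */
};

void cpu_init(struct cpu *c)
{
    c->runq = NULL;
    atomic_init(&c->inbox, NULL);
    atomic_flag_clear(&c->inbox_busy);
    c->lock_acquisitions = 0;
}

static void lock_inbox(struct cpu *c)
{
    while (atomic_flag_test_and_set_explicit(&c->inbox_busy, memory_order_acquire))
        ;   /* spin: contention involves only the pusher and the owner */
    c->lock_acquisitions++;
}

static void unlock_inbox(struct cpu *c)
{
    atomic_flag_clear_explicit(&c->inbox_busy, memory_order_release);
}

/* Pushing CPU: hand a preassembled list (head..tail) to the target CPU. */
void push_tasks(struct cpu *target, struct task *head, struct task *tail)
{
    assert(head != NULL && tail != NULL && tail->next == NULL);
    lock_inbox(target);
    /* Worst case the inbox is not empty: chain the old contents behind us. */
    tail->next = atomic_load_explicit(&target->inbox, memory_order_relaxed);
    atomic_store_explicit(&target->inbox, head, memory_order_release);
    unlock_inbox(target);
}

/* Owning CPU: called on each scheduling pass; returns tasks moved. */
int drain_inbox(struct cpu *self)
{
    /* Lockless "is the list empty" test: a single pointer compare. */
    if (atomic_load_explicit(&self->inbox, memory_order_acquire) == NULL)
        return 0;

    lock_inbox(self);
    struct task *list = atomic_load_explicit(&self->inbox, memory_order_relaxed);
    atomic_store_explicit(&self->inbox, NULL, memory_order_relaxed);
    unlock_inbox(self);

    int moved = 0;
    while (list != NULL) {           /* private from here on, no lock needed */
        struct task *t = list;
        list = t->next;
        t->next = self->runq;
        self->runq = t;
        moved++;
    }
    return moved;
}
```

With nothing pushed, drain_inbox returns without ever taking the lock, which is the normal case the message argues for.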
> In reality, you can get away from having *any* scheduler queue > locks *at all*, for the normal case, and then only contend at > the inter-CPU level for the CPUs using a push migration model. That's what Mingo's scheduler does. It only does a double lock when the load balance computation makes it do a CPU migrate; otherwise there's no contention and no per CPU locks are acquired. > So the migrate away is one-behind the head of the scheduling queue, > and the migrate-to is one ahead. On migration, an artificial > inflation of the figure of merit can be handled -- or the value > of the pointer can be examined, and assumed to add a constant > weighting to the target CPU's figure of merit, if the pointer value > is non-NULL. > > Thus, it is only ever required to lock when you are actively doing > process migration, and process migration is rare (as it should be, > particularly if one of your target architectures is NUMA, but in > the general case on small CPU count shared memory multiprocessors, > as well). > > This also permits preference weighting based on locality, for > affinity on hyper-threaded CPUs, and negaffinity in CPU sets under > the same circumstances. I don't normally ask this, but what the hell are you saying here ? > > 3) two runqueue arrays (current and expired) instead of > > just one, which enables ... > > Not required. That's part of Mingo's algorithm to avoid recalculation, if I understand it correctly. Not exactly sure, I'm stretching my knowledge of his algorithm here. > > 4) ... event-driven priority recalculation, instead of > > recalculating the priority of each task separately > > This actually doesn't work. The worst case failure is under > overload, which is exactly where you don't want it to be. What kind of overload ? He does a number of things to make sure that all processes behave properly by demoting priorities. > The scheduling for the BSD scheduler, as was pointed out, takes > time not run into the priority weighting.
A granularity of 3 > seconds until the disctinction between the two queues for > enqueueing delayed jobs is realized is really gross. 8-(. bill To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Sep 20 14:39:48 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id AC6EC37B401 for ; Fri, 20 Sep 2002 14:39:46 -0700 (PDT) Received: from 2-225.ctame701-1.telepar.net.br (2-225.ctame701-1.telepar.net.br [200.193.160.225]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6F79643E65 for ; Fri, 20 Sep 2002 14:39:45 -0700 (PDT) (envelope-from riel@conectiva.com.br) Received: from localhost ([IPv6:::ffff:127.0.0.1]:24296 "EHLO localhost") by imladris.surriel.com with ESMTP id ; Fri, 20 Sep 2002 18:39:30 -0300 Date: Fri, 20 Sep 2002 18:39:29 -0300 (BRT) From: Rik van Riel X-X-Sender: riel@imladris.surriel.com To: Terry Lambert Cc: Julian Elischer , Bill Huey , Subject: Re: New Linux threading model In-Reply-To: <3D8B8E35.EDAF4450@mindspring.com> Message-ID: X-spambait: aardvark@kernelnewbies.org X-spammeplease: aardvark@nl.linux.org MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Fri, 20 Sep 2002, Terry Lambert wrote: > Rik van Riel wrote: > > 1) per-cpu runqueues instead of a global one, which wants ... > Yes. > > > 2) ... load balancer between these per-cpu queues > > No. > > This is the problem with the Linux model. With an active load > balancing algorithm, you end up with global contention for > scheduler queue locks. I'm not saying you should balance all the time. It's enough to load balance once every eternity, say 1/4 of a second. Maybe a bit more often if you've got a really idle CPU. 
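As a rough illustration of "balance only once in a while, and lock only when something moves", here is a small single-threaded model in C. It is a sketch under stated assumptions, not the Linux load balancer: struct runqueue, balance_due and balance_once are invented names, the 250 ms interval comes from the "1/4 of a second" above, the 10 ms idle interval is an arbitrary guess, and the lock_acquisitions counter merely marks where a real kernel would take the two per-CPU queue locks.

```c
#include <assert.h>
#include <stdbool.h>

#define BUSY_INTERVAL_MS 250  /* "1/4 of a second" from the thread */
#define IDLE_INTERVAL_MS 10   /* assumption: an idle CPU looks more often */

/* Hypothetical per-CPU run queue summary. */
struct runqueue {
    int  nr_running;        /* runnable tasks queued on this CPU */
    long last_balance_ms;   /* when this CPU last attempted a balance */
    int  lock_acquisitions; /* instrumentation: times this queue was locked */
};

/* Is it time for this CPU to look at the other queues again? */
bool balance_due(const struct runqueue *rq, long now_ms)
{
    long interval = (rq->nr_running == 0) ? IDLE_INTERVAL_MS : BUSY_INTERVAL_MS;
    return now_ms - rq->last_balance_ms >= interval;
}

/* Pull tasks towards this_cpu if another queue is clearly busier.
 * Returns the number of tasks moved. */
int balance_once(struct runqueue rq[], int ncpu, int this_cpu, long now_ms)
{
    assert(ncpu > 0 && this_cpu >= 0 && this_cpu < ncpu);
    struct runqueue *self = &rq[this_cpu];

    if (!balance_due(self, now_ms))
        return 0;
    self->last_balance_ms = now_ms;

    /* Unlocked scan: a stale count only makes the choice slightly worse. */
    int busiest = this_cpu;
    for (int cpu = 0; cpu < ncpu; cpu++)
        if (rq[cpu].nr_running > rq[busiest].nr_running)
            busiest = cpu;

    int imbalance = rq[busiest].nr_running - self->nr_running;
    if (imbalance < 2)
        return 0;               /* close enough: no lock is taken at all */

    /* Only now would the two queues involved be locked (in a fixed order). */
    rq[busiest].lock_acquisitions++;
    self->lock_acquisitions++;

    int moved = imbalance / 2;
    rq[busiest].nr_running -= moved;
    self->nr_running += moved;
    return moved;
}
```

Queues that are neither the puller nor the busiest one are never locked, so a migration involves two CPUs rather than all of them.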
> > 3) two runqueue arrays (current and expired) instead of > > just one, which enables ... > > Not required. > > > 4) ... event-driven priority recalculation, instead of > > recalculating the priority of each task separately > > This actually doesn't work. The worst case failure is under > overload, which is exactly where you don't want it to be. What do you mean it doesn't work ? This algorithm is being used in practice and it works just fine. > The scheduling for the BSD scheduler, as was pointed out, takes > time not run into the priority weighting. The Linux O(1) scheduler uses "time not on the run queue" to determine process priority; this automatically scales when the system gets busier and busier. > A granularity of 3 seconds until the distinction between the two queues > for enqueueing delayed jobs is realized is really gross. 8-(. Yes, it's gross. However, if your system is so heavily overloaded that the sum of all timeslices of runnable processes gets larger than 3 seconds, there isn't much you can do about that. kind regards, Rik -- Bravely reimplemented by the knights who say "NIH".
http://www.surriel.com/ http://distro.conectiva.com/ Spamtraps of the month: september@surriel.com trac@trac.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Sep 20 17: 0:15 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 98CF837B406 for ; Fri, 20 Sep 2002 17:00:12 -0700 (PDT) Received: from rwcrmhc52.attbi.com (rwcrmhc52.attbi.com [216.148.227.88]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4687C43E65 for ; Fri, 20 Sep 2002 17:00:12 -0700 (PDT) (envelope-from julian@elischer.org) Received: from InterJet.elischer.org ([12.232.206.8]) by rwcrmhc52.attbi.com (InterMail vM.4.01.03.27 201-229-121-127-20010626) with ESMTP id <20020921000011.WYRL464.rwcrmhc52.attbi.com@InterJet.elischer.org>; Sat, 21 Sep 2002 00:00:11 +0000 Received: from localhost (localhost.elischer.org [127.0.0.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id QAA21633; Fri, 20 Sep 2002 16:55:09 -0700 (PDT) Date: Fri, 20 Sep 2002 16:55:08 -0700 (PDT) From: Julian Elischer To: Terry Lambert Cc: freebsd-arch@freebsd.org Subject: Re: New Linux threading model In-Reply-To: <3D8B8E35.EDAF4450@mindspring.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Ok, Terry, I've thought of the best way that we can use your particular talents. How about this.. We have a particular set of scheduling requirements: 1/ We want threads to have as much parallelism as possible given the hardware 2/ We want a particular process to be able to contain 'subentities' that can be scheduled with different policies. 
(we call them ksegroups)
3/ We want simple processes to behave exactly as now, and new processes
   to compete with traditional processes on a basis SIMILAR to how
   traditional processes compete with each other.
4/ Partly a corollary to 3: A threaded process can not overwhelm a
   system's scheduler with many threads. (We have a structure 'kse' that
   may come in handy for this, but if you think of a better way,
   consider it.)
5/ Improve handling of large numbers of threads.
6/ If possible, allow selection of scheduler algorithms at run time (*)

(*) I said IF POSSIBLE..

Given these constraints, can you go through the literature,
pick out relevant examples,
and report back to us with scheduling schemes that may work for us.
You are also welcome to make up your own schemes based on what you read
or feel.

I'm not saying this is how we'll go, just that it's time that
someone rescanned the literature, and we could certainly do with
a discussion on the topic. You may recruit as many "scheduler gurus"
as you wish to help you.

Of particular interest to you should be:
1/ READ THE CURRENT CODE (kern_switch.c and proc.h)
2/ The Mach and Chorus schedulers, if you can find info on them
3/ SAs
4/ Linux's new scheduler. (READ THE CODE)
5/ The Solaris scheduler re: LWPs

"Should you decide to accept this mission, the secretary will of course
deny any knowledge of your actions. This email will self destruct in
however long it takes you to hit the 'd' key."
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Sep 20 17:42:58 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id AB74037B401 for ; Fri, 20 Sep 2002 17:42:51 -0700 (PDT) Received: from avocet.mail.pas.earthlink.net (avocet.mail.pas.earthlink.net [207.217.120.50]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2B97F43E3B for ; Fri, 20 Sep 2002 17:42:51 -0700 (PDT) (envelope-from tlambert2@mindspring.com) Received: from pool0066.cvx40-bradley.dialup.earthlink.net ([216.244.42.66] helo=mindspring.com) by avocet.mail.pas.earthlink.net with esmtp (Exim 3.33 #1) id 17sYM0-0005bR-00; Fri, 20 Sep 2002 17:42:38 -0700 Message-ID: <3D8BBFFD.E1CDEAD5@mindspring.com> Date: Fri, 20 Sep 2002 17:40:29 -0700 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: "Bill Huey (Hui)" Cc: Rik van Riel , Julian Elischer , freebsd-arch@freebsd.org Subject: Re: New Linux threading model References: <3D8B8E35.EDAF4450@mindspring.com> <20020920213629.GA1527@gnuppy.monkey.org> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG "Bill Huey (Hui)" wrote: > > No. > > > > This is the problem with the Linux model. With an active load > > balancing algorithm, you end up with global contention for > > scheduler queue locks. > > That's not true, they're all per CPU run queue locks. The locks are in the global contention domain; basically, they have to be shared among all processors, unless they are in uncacheable pages. Any contended resources are going to be effectively global, at least if they are contended by multiple CPUs. 
> > In reality, you can get away from having *any* scheduler queue
> > locks *at all*, for the normal case, and then only contend at
> > the inter-CPU level for the CPUs using a push migration model.
> >
> That's what the Mingo's scheduler does. It only does a double lock
> when the load balance computation makes it do a CPU migrate, otherwise
> there's no contention and no per CPU locks are aquired.

This isn't true, if a double-lock is involved. The issue is
whether or not you need to acquire a lock on your local
scheduler queue in order to use it. You're saying that
you have to acquire the lock, which means that you effectively
have to do it in a globally contended space.

In the design I'm talking about, there is only a single lock,
and that lock is on a migration queue, into which you *push*
entries. This is different from the Linux scheduler, the
current FreeBSD scheduler, and the current FreeBSD scheduler with
Alfred's old (last year) affinity patches applied.

By only scheduling out of your own queue, and only pushing from
your own queue to a per CPU migration queue, you guarantee *one*
lock on the migration, and you guarantee not needing *any* locks
for your own queue access.

Even if you place the scheduler locks in uncontended pages, and
eat a factor of 5 access penalty for the barrier, having to get
your own scheduler lock -- or some other CPU's -- leaves you in
the position of having to eat a huge amount of overhead in the
common (non-migration) case.

> I don't normally ask this, but what the hell are you saying here ?
Here's pseudo-code:

    /* handle migration *to* us */
    IF !empty(my_migration_queue)
        LOCK(my_migration_queue)
        while( !empty(my_migration_queue))
            tmp = my_migration_queue
            my_migration_queue = tmp->next
            insert(my_scheduler_queue, tmp)
            /* recalculate my load figure of merit */
        /* update our figure of merit */
        UNLOCK(my_migration_queue)
    ENDIF

    /* get next scheduled process to run for us */
    next_run = head(my_scheduler_queue)
    remove(my_scheduler_queue, next_run)

    /*
     * compare our load figure of merit vs. all other CPUs to decide
     * whether to migrate a process from us to another CPU
     */
    /* ...find lowest figure of merit of all neighbors... */
    /* ...compare our figure of merit minus a low watermark to that... */
    IF should_migrate
        migrate = head(my_scheduler_queue)
        remove(my_scheduler_queue, migrate)
        LOCK(target_migration_queue)
        migrate->next = target_migration_queue
        target_migration_queue = migrate
        UNLOCK(target_migration_queue)
        /* update our figure of merit */
    ENDIF

    /* ...run "next_run" ... */

...the local CPU scheduler queue is not a contended resource. Ever.

Further, you are permitted any amount of complexity in your target
choice you want. This includes preferring "hyperthreaded" CPUs on
the same die for affinity's sake (to avoid cache shootdown), and
preferring other physical CPUs for negaffinity's sake (to avoid
stall barriers that affect more than one thread simultaneously, as
would be the case on a hyperthreaded CPU running a process with two
threads in a box with 2 real CPUs with 2 hyperthreaded "CPUs" each).

> > > 3) two runqueue arrays (current and expired) instead of
> > > just one, which enables ...
> >
> > Not required.
>
> That's part of mingo's algorithm to avoid recalculation if I understand
> it correctly. Not exactly sure, I'm stretching my knowledge of his
> algorithm here.

Yes, it's his recalculation avoidance, in order to make the
scheduler "O(1)".
The real effect, though, is to cause starvation to be
possible under heavy load, at which point things which would
have been scheduled preferentially are scheduled round-robin
in the other queue. That this only triggers when the other
queue head (what I would call "the full quantum queue", as
opposed to "the partial quantum queue") is waiting "too long"
is actually part of the problem, as I see things.

> > > 4) ... event-driver priority recalculation, instead of
> > > recalculating the priority of each task separately
> >
> > This actually doesn't work. The worst case failure is under
> > overload, which is exactly where you don't want it to be.
>
> What kind of overload ? he does a number of things to make sure that
> all processes behave properly by demoting priorities.

And by requeueing them into the full quantum queue, rather than
the "partial quantum queue", if the number of things which need
to run ends up exceeding an arbitrary pool retention time on the
"full quantum queue".

This basically results in mathematical artifacts, from treating
a step function as if it were, in the limit, equivalent to a
function which was continuous. 8-).

The main thing this will do is, when the load gets high enough
that this happens every time, the system will degrade to the
same behaviour as if a "partial quantum queue" didn't exist.
Or rather, that behaviour, plus an additional lock, plus an
empty queue examination, plus an unlock.

The other thing it will do is cause the artifacts, which will
result in unexpected behaviour near the boundary cases. The
worst case for this is ping-ponging between the queues for an
interactive process, e.g. a game or an MP3 player or an audio
encoding process or a CD or DVD burner, etc., while it's sitting
right at the boundary. This will not hurt it on average, but
it will toggle between getting full vs. partial quantums (for
example), which could make things, uh... "bursty". ;^).
This degraded behaviour will be exactly equal to a complete
lack of thread group affinity.

Actually, the "partial quantum" case can only ever be 66%
effective in any event, as a statistical probability of the
best case that still exhibits contention: its intent is really
to ensure preferential scheduling of processes which are
ready-to-run, and which have partial quantum remaining. This
actually fails, when the processes have partial quantum
remaining... but aren't ready to run. The assumption implicit
in this is that the average blocking operation will take no
more than 50% of the duration of a quantum in order to complete
in the background. Anything longer than that, and the scheduling
degrades, again, to round-robin.

This is actually the primary reasoning behind my design of a
test case for a benchmark, vs. a microbenchmark: the algorithm
that's there will work *very well* for an uncontended system
running a single multithreaded process, but degrade significantly
under all other situations (IMO).

I'm pretty sure that Peter and Matt will not let FreeBSD blindly
implement the same model, without at least running the numbers
on it, or the model I've suggested, or *any* model, for that
matter. FreeBSD is pretty technically conservative, which, at
times like this, is usually a good thing.
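
The push-migration pseudo-code earlier in this message can be turned into
a small runnable simulation. What follows is only a sketch under stated
assumptions, not kernel code: the names Cpu, drain_migrations, push_to and
schedule_once are hypothetical, the figure of merit is assumed to be the
plain length of the private queue, and the migration queue is FIFO rather
than the linked stack used in the pseudo-code. It illustrates the one
property being argued for: the private run queue is never locked, and the
only lock taken is the one on a migration queue.

```python
import threading
from collections import deque


class Cpu:
    """One simulated CPU: a private run queue plus a locked migration inbox."""

    def __init__(self, cpu_id):
        self.cpu_id = cpu_id
        self.run_queue = deque()        # touched only by the owning CPU: no lock
        self.migration_queue = deque()  # other CPUs push here: needs the lock
        self.migration_lock = threading.Lock()

    def load(self):
        # Assumed figure of merit: simply the length of the private queue.
        return len(self.run_queue)

    def drain_migrations(self):
        """Handle migration *to* us; lock only when work is waiting."""
        if not self.migration_queue:    # unlocked peek, as in the pseudo-code
            return
        with self.migration_lock:
            while self.migration_queue:
                self.run_queue.append(self.migration_queue.popleft())

    def push_to(self, target):
        """Move one task from our own queue to another CPU's migration queue."""
        task = self.run_queue.popleft()  # our own queue: still no lock
        with target.migration_lock:
            target.migration_queue.append(task)


def schedule_once(me, cpus, low_watermark=2):
    """One scheduler pass: drain inbox, pick next task, maybe push one away."""
    me.drain_migrations()
    next_run = me.run_queue.popleft() if me.run_queue else None
    others = [c for c in cpus if c is not me]
    if others and me.run_queue:
        target = min(others, key=Cpu.load)
        # Migrate only if we are busier than the target by more than the
        # low watermark, so small imbalances do not cause ping-ponging.
        if me.load() - low_watermark > target.load():
            me.push_to(target)
    return next_run
```

In this sketch a busy CPU decides on its own when to shed work, and an
idle CPU only ever looks at its own inbox; whether that beats a pull model
in practice is exactly what the benchmark discussion is about.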
-- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Sep 20 17:54:39 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id CF85437B401 for ; Fri, 20 Sep 2002 17:54:37 -0700 (PDT) Received: from avocet.mail.pas.earthlink.net (avocet.mail.pas.earthlink.net [207.217.120.50]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3FC8843E65 for ; Fri, 20 Sep 2002 17:54:37 -0700 (PDT) (envelope-from tlambert2@mindspring.com) Received: from pool0066.cvx40-bradley.dialup.earthlink.net ([216.244.42.66] helo=mindspring.com) by avocet.mail.pas.earthlink.net with esmtp (Exim 3.33 #1) id 17sYXT-0003qs-00; Fri, 20 Sep 2002 17:54:28 -0700 Message-ID: <3D8BC2E5.62B153E1@mindspring.com> Date: Fri, 20 Sep 2002 17:52:53 -0700 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Rik van Riel Cc: Julian Elischer , Bill Huey , freebsd-arch@freebsd.org Subject: Re: New Linux threading model References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Rik van Riel wrote: > > > 4) ... event-driver priority recalculation, instead of > > > recalculating the priority of each task separately > > > > This actually doesn't work. The worst case failure is under > > overload, which is exactly where you don't want it to be. > > What do you mean it doesn't work ? This algorithm is > being used in practice and it works just fine. I've gone into detail in another posting; this is really a *FreeBSD* architectural list, however. By "doesn't work", I mean "fails to yield the predicted results". 
If you need a graphic example, please write a Linux version of the
multiple contending processes benchmark I proposed, rather than
a single microbenchmark. It should be obvious, when the incremental
value of an additional CPU drops from ~75% to ~50%, and then when the
load gets high enough to trigger the requeueing onto the "full
quantum queue" instead of "the partial quantum queue", it drops
to ~25%.

> > The scheduling for the BSD scheduler, as was pointed out, takes
> > time not run into the priority weighting.
>
> The Linux O(1) scheduler uses "time not on the run queue" to
> determine process priority, this automatically scales when the
> system gets busier and busier.

So insertion into the scheduling queue is an average of O(N/2)?

;^).

> > A granularity of 3 seconds until the disctinction between the two queues
> > for enqueueing delayed jobs is realized is really gross. 8-(.
>
> Yes, it's gross. However, if your system so heavily overloaded
> that the sum of all timeslices of runnable processes gets larger
> than 3 seconds there isn't much you can do about that.
>
> kind regards,

The point is that the degraded case when that happens gains no thread
group affinity benefits, and you basically end up paying TLB shootdown
overhead statistically, based on your process's fraction of the total
number of threads on the CPU in question.

This same degradation will occur if most of your threads are doing disk
I/O (e.g. NFS server, with a lot of clients) or network I/O over a
loaded or high latency link, or during a DoS attack. It will happen at
precisely the worst time for it to happen.

Make a Linux version of the suggested threads benchmark, and run it on
old vs. new Linux threads, all other things being equal.
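
For reference, the two-array ("current" and "expired") run queue being
debated can be sketched roughly as follows. This is a simplified
illustration and not the Linux code: the class names, the choice of four
priority levels, and the caller supplying the recalculated priority are
all assumptions, and the starvation handling and interactivity bonuses
discussed in this thread are left out entirely.

```python
from collections import deque

NUM_PRIOS = 4  # assumed for the sketch; a real kernel uses many more levels


class PrioArray:
    """A fixed number of FIFO queues, one per priority (0 is best)."""

    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIOS)]
        self.count = 0

    def enqueue(self, task, prio):
        self.queues[prio].append(task)
        self.count += 1

    def dequeue_best(self):
        # The scan is bounded by NUM_PRIOS, not by the number of tasks,
        # which is where the "O(1)" claim comes from.
        for prio, queue in enumerate(self.queues):
            if queue:
                self.count -= 1
                return queue.popleft(), prio
        return None


class RunQueue:
    """Per-CPU run queue with a current array and an expired array."""

    def __init__(self):
        self.active = PrioArray()
        self.expired = PrioArray()

    def pick_next(self):
        if self.active.count == 0:
            # Everything has used its timeslice: swap the two arrays
            # instead of walking every task to recalculate priorities.
            self.active, self.expired = self.expired, self.active
        return self.active.dequeue_best()

    def timeslice_expired(self, task, new_prio):
        # Priority is recalculated once, at the event, for this task only.
        self.expired.enqueue(task, new_prio)
```

The sketch shows only why no per-task recalculation pass is needed; it
says nothing about how the scheme behaves under the overload cases argued
above.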
-- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Sep 20 18:59:30 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9592937B401 for ; Fri, 20 Sep 2002 18:59:29 -0700 (PDT) Received: from 2-225.ctame701-1.telepar.net.br (2-225.ctame701-1.telepar.net.br [200.193.160.225]) by mx1.FreeBSD.org (Postfix) with ESMTP id E37CA43E75 for ; Fri, 20 Sep 2002 18:59:27 -0700 (PDT) (envelope-from riel@conectiva.com.br) Received: from localhost ([IPv6:::ffff:127.0.0.1]:37036 "EHLO localhost") by imladris.surriel.com with ESMTP id ; Fri, 20 Sep 2002 22:59:05 -0300 Date: Fri, 20 Sep 2002 22:59:03 -0300 (BRT) From: Rik van Riel X-X-Sender: riel@imladris.surriel.com To: Terry Lambert Cc: "Bill Huey (Hui)" , Julian Elischer , Subject: Re: New Linux threading model In-Reply-To: <3D8BBFFD.E1CDEAD5@mindspring.com> Message-ID: X-spambait: aardvark@kernelnewbies.org X-spammeplease: aardvark@nl.linux.org MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Fri, 20 Sep 2002, Terry Lambert wrote: > In the design I'm talking about, there are is only a single lock, > and that lock is on a migration queue, into which you *push* > entries. This is different from both the Linux scheduler, and the > current FreeBSD scheduler, and the current FreeBSD scheduler with > Alfred's old (last year) affinity patches applied. Very nice design, IF the cost of contention is higher than the cost of having CPUs sit idle until the next rescheduling happens on one of the busier CPUs... Say that you have a moderately loaded box with, on average, about as many runnable tasks as you have CPUs. 
These tasks will, due to statistics, not always be evenly distributed across CPUs. Say that distribution is near-perfect and this results in only 1% CPU idle time. Your scheme would win ONLY if the lock contention in the pull model is responsible for more than that 1% CPU time. If we can keep the CPUs busier by pulling tasks onto idle CPUs and the locking overhead is less than 1%, it's a win. regards, Rik -- Bravely reimplemented by the knights who say "NIH". http://www.surriel.com/ http://distro.conectiva.com/ Spamtraps of the month: september@surriel.com trac@trac.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Sep 20 23:25:15 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6BAA837B401 for ; Fri, 20 Sep 2002 23:25:13 -0700 (PDT) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.FreeBSD.org (Postfix) with ESMTP id 252D943E42 for ; Fri, 20 Sep 2002 23:25:13 -0700 (PDT) (envelope-from baka@elvis.mu.org) Received: by elvis.mu.org (Postfix, from userid 1921) id F00ADAE163; Fri, 20 Sep 2002 23:25:12 -0700 (PDT) Date: Fri, 20 Sep 2002 23:25:12 -0700 From: Jon Mini To: Julian Elischer Cc: Terry Lambert , freebsd-arch@FreeBSD.ORG Subject: Re: New Linux threading model Message-ID: <20020921062512.GB24394@elvis.mu.org> References: <3D8B8E35.EDAF4450@mindspring.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4i Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Terry, I would also very much like to hear your thoughts on the best possible threading system for FreeBSD. 
I have read several of your messages on the subject, and I have somewhat of an idea of the kind of system you'd like to see us write, but a clear picture of the overall design is lacking. Please, write up a description of what you'd like to see. I'll ask questions until I think I've got it and then paraphrase the whole thing back at you, and we can attack if from the other direction (you can correct where I'm wrong). Deal? =) Julian Elischer [julian@elischer.org] wrote : > Ok, Terry, > > I've thought of the best way that we can use your particular talents. > > How about this.. > We have a particular set of scheduling requirements: > 1/ We want threads to have as much parallelism as possible given the > hardware > 2/ We want a particular process to be able to contain 'subentities' > that can be scheduled with different policies. (we call them ksegroups) > 3/ We want simple processes to behave exactly as now, and new processes > to compete with traditional processes on a basis SIMILAR to how > traditional processs compete with each other. > 4/ Partly a corrolary to 3: A threaded process can not overwhelm a > system's scheduler with many threads. (we have a structure 'kse' that > may come in handy for this, but if you thnk of a better way, > consider it.. > 5/ improve handling of large numbers of threads. > 6/ If possible allow selection of scheduler algorythms at run time (*) > > (*) I said IF POSSIBLE.. > > Given these restraints, can you go through the literature, > pick out relevent examples, > and report back to us with scheduling schemes that may work for us. > You are also welcome to make up your own schemes based on what you read > or feel. > > I'm not saying this is how we'll go, just that it's come time that > someone rescanned the literature again, and we could certainly do with > a discussion on the topic. You may recruit as many "scheduler gurus" > as you wish to help you. 
> > Of particular interest to you shuld be: > 1/ READ THE CURRENT CODE (kern_switch.c and proc.h) > 2/ The mach and chorus schedulers if you can find info on them > 3/ SAs > 4/ Linucks new scheduler. (READ THE CODE) > 5/ the Solaris scheduler re: LWPs > > > > > > "Should you decide to accept this mission, the secretary will of course > deny any knowledge of your actions. This email will self destruct in > however long it takes you to hit the 'd' key." > > > > > To Unsubscribe: send mail to majordomo@FreeBSD.org > with "unsubscribe freebsd-arch" in the body of the message -- Jonathan Mini http://www.freebsd.org/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Sep 20 23:53:19 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 588A237B401 for ; Fri, 20 Sep 2002 23:53:14 -0700 (PDT) Received: from pintail.mail.pas.earthlink.net (pintail.mail.pas.earthlink.net [207.217.120.122]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8231D43E7B for ; Fri, 20 Sep 2002 23:53:13 -0700 (PDT) (envelope-from tlambert2@mindspring.com) Received: from pool0018.cvx40-bradley.dialup.earthlink.net ([216.244.42.18] helo=mindspring.com) by pintail.mail.pas.earthlink.net with esmtp (Exim 3.33 #1) id 17se8U-0006Ug-00; Fri, 20 Sep 2002 23:53:03 -0700 Message-ID: <3D8C170E.A84E922B@mindspring.com> Date: Fri, 20 Sep 2002 23:51:58 -0700 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Rik van Riel Cc: "Bill Huey (Hui)" , Julian Elischer , freebsd-arch@freebsd.org Subject: Re: New Linux threading model References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: 
FreeBSD.ORG

Rik van Riel wrote:
> On Fri, 20 Sep 2002, Terry Lambert wrote:
> > In the design I'm talking about, there are is only a single lock,
> > and that lock is on a migration queue, into which you *push*
> > entries. This is different from both the Linux scheduler, and the
> > current FreeBSD scheduler, and the current FreeBSD scheduler with
> > Alfred's old (last year) affinity patches applied.
>
> Very nice design, IF the cost of contention is higher than
> the cost of having CPUs sit idle until the next rescheduling
> happens on one of the busier CPUs...
>
> Say that you have a moderately loaded box with, on average,
> about as many runnable tasks as you have CPUs. These tasks
> will, due to statistics, not always be evenly distributed
> across CPUs.

This will never happen in the real world, where the steady
state of system load (as a measure of the number of processes,
on average, in the ready-to-run state) is > 1. 8-). It also
makes a really serious assumption about the number of CPUs,
or a seriously invalid one about what the average load of a
desktop or internet server sits at.

But say we accept that, for the sake of argument, even though
we know that threads will drastically inflate the number of
things in "ready to run" state, even if we decide to call those
things something other than "processes".

> Say that distribution is near-perfect and this results in
> only 1% CPU idle time.
>
> Your scheme would win ONLY if the lock contention in the
> pull model is responsible for more than that 1% CPU time.

Not true.

There is a common myth in the x86 world: it's that shared memory
multiprocessors based on the x86 architecture stop scaling after
4 CPUs.

In point of fact, Sequent built 32-processor 486 and, later,
Pentium systems, which were shared memory multiprocessors. They
scaled significantly beyond that point, no problem. I used to
program applications on these beasts, and the 68K based ones
as well (i.e.
both Sequent Symmetry and Sequent Balance machines).

So what's the barrier in the SVR4 and Solaris -- and Linux, and
FreeBSD -- OS architectures that makes everyone believe and
propagate this myth?

It all comes down to contention for shared resources. And these
days, that boils down to, in increasing order of contention:

o TLB/L1 cache coherency
o L2 cache memory
o System memory
o The I/O bus(ses)

...it's really, really important to remember that the "I" in
"MESI" stands for "Invalid". 8-).

> If we can keep the CPUs busier by pulling tasks onto idle
> CPUs and the locking overhead is less than 1%, it's a win.

With respect, most modern systems are never CPU-bound these
days; when they *are* CPU bound, it's because there is a stall
barrier that is introduced by an I/O or main memory access.

Consider a very fast Intel system these days; it's common to
find machines that run at ~2.1GHz. The fastest front-side
memory bus in common use is 433MHz -- one fifth of that speed.

Now take a simple lock -- a compare and exchange; you are
talking about a factor of either 10 or 15 on the access, vs. the
instructions used in the access, because there is a barrier
that requires that MESI cache coherency is maintained on
the lock contents: 5 in, 5 out, and (maybe) another 5 in.

*Maybe*, if you limited yourself to non-shared-memory systems,
you can get the speed win that you wanted; that basically means
"no hyperthreading", or it means "NUMA".

Now let's "solve" the general SMP shared memory multiprocessor
scaling problem, in its entirety.

The normal way to solve this is: don't share the memory.

A different way of putting this is "A resource which is not
shared, is a resource which is not contended".

How do you achieve this virtual state, on real hardware, when
the real hardware *does* share resources?

The easy way to do this is to break up ownership of the system
resources so that the contention is minimized. This means:

o Eliminate locking from common non-failure code paths.
o Eliminate resource contention by assigning resources to
  per processor pools, which can be refilled from, and
  flushed to, contended system-wide pools -- *only* when
  necessary.
o Eliminate locking from failure code paths
o Eliminate cache discard (TLB shootdown, L1 and L2 cache
  invalidation, reloading of data from devices, if the data
  is already in memory)
o Assign data channels to specific CPUs, rather than running
  in virtual wire mode for interrupt processing, or transiently
  assigning interrupts to a single CPU simply because it has
  acquired a global lock
o Enforce cache locality (common code path coherency, cache
  coloring, etc.; Ingo Molnar, one of the authors of the paper
  cited earlier, has done this for the Tux in-kernel HTTP
  server, so that the common code path all fits in cache:
  TCP stack, web server, and all)

...in short, pretty much everything Sequent did to Dynix, reaching
a peak between 1992 and 1994 or so, and what SGI has been arguing
for inclusion in Linux, based on the patch sets they've provided.

It's all fine and good to argue CPU time, when there isn't a clock
multiplier, or when your CPU time is your highest contended resource.
But this is only true, even on CPU intensive applications, if those
applications are inherently parallelizable -- the kind of applications
where you can hook a bunch of toy computers together, and get the
same results that you would get from a real supercomputer.

Even then, it only really works if there is isolated locality of
reference for the problem subsets between the CPUs: if everyone is
writing data into the same page in memory, then it's going to be a
contended resource, and it's going to bottleneck the computations
in contention for the resource.

In the common case, main memory is generally a full order of
magnitude slower than L1 cache. And disk memory is a full order
of magnitude slower than that. And for any network application,
any disk I/O you do is going to contend with the network for I/O
bus time.
So, back on the "Subject:" line...

The benefit of separating the scheduler queues, without the
introduction of locks, except in the exceptional code paths,
and delaying that, if possible, to act as a smoothing function
for transient load spikes (e.g. make the figure of merit for
CPU load be a weighted moving average of the last N snapshots
of the instantaneous load) ...is not the be-all, end-all of
optimizations. On the other hand, it doesn't suck, and it's not
something you can so easily dismiss by claiming that the CPU load
is "only" 99%.

In other words, we are preventing objects in one ownership
domain from moving to another one, except under extraordinary
circumstances, because we know that that's where contention
arises.

Another part of the answer is going to a memory allocator that
uses per CPU pools of memory, which are allocated only rarely
from the system common pool, and which only rarely span the same
locality as that of another CPU... so that locks are not needed,
TLB shootdown doesn't occur, and the L2 and L1 cache contents
are rarely invalidated by the MMU, simply because they have been
accessed by a different CPU. Again: making changes in ownership
extraordinary.

Yeah, there are some obvious optimizations we can get on this, too.
One is to allocate the memory and other resources, as necessary,
to each CPU (real or "hyperreal"), and don't give it back when the
free list hits a high watermark. Instead, leave it there, until
the system pool hits a low watermark, and then signal the CPUs that
there is an insufficiency. This basically eliminates all the
thrashing until there is a shortage -- if there ever is -- of a
given contended resource.
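
The per-CPU pool policy just described can be sketched as a toy model.
This is an illustration under assumptions rather than an allocator:
SystemPool, CpuPool, the refill batch size and the integer unit counts
are all hypothetical, locking of the shared pool is omitted because the
sketch is single-threaded, and "signal the CPUs" is modelled as a direct
call that takes back whatever each CPU holds free.

```python
class SystemPool:
    """The contended, system-wide pool. Touched only on refill or shortage."""

    def __init__(self, total, low_watermark):
        self.free = total
        self.low_watermark = low_watermark
        self.cpu_pools = []

    def grab(self, count):
        """Hand out up to `count` units; on shortage, reclaim from the CPUs."""
        granted = min(count, self.free)
        self.free -= granted
        if self.free < self.low_watermark:
            self.reclaim()
        return granted

    def reclaim(self):
        # Stand-in for "signal the CPUs that there is an insufficiency".
        for pool in self.cpu_pools:
            self.free += pool.give_back()


class CpuPool:
    """Per-CPU free list. The common alloc/release path is purely local."""

    def __init__(self, system, refill=4):
        self.system = system
        self.refill = refill
        self.free = 0
        system.cpu_pools.append(self)

    def alloc(self):
        if self.free == 0:
            self.free += self.system.grab(self.refill)  # the rare, contended path
        if self.free == 0:
            return False
        self.free -= 1
        return True

    def release(self):
        # No high-watermark flush: freed units stay with this CPU.
        self.free += 1

    def give_back(self):
        returned, self.free = self.free, 0
        return returned
```

In this model a CPU that allocates and frees in a steady state never
touches the shared pool again after its first refill; the shared pool is
revisited only when some other CPU runs it below the low watermark.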
If you want to argue this, the thing to do is to write the code
to ensure tunable contention levels, and to gather statistics on
the real-world performance of the algorithm *under contention*;
because the chances of memory contention are more than 5 times
those of CPU load contention -- the average over the last 10 years
has been pretty steady at 10 times, actually, and has fluctuated as
high as 20 times. And the memory bus isn't the slowest point in
data flow through a computer, it's I/O.

So force the contention issue, and measure how the algorithm
you are defending degrades.

If you want to look at L2 contention directly, then a post-processor
that counts the use of the LOCK prefix would be useful -- it's most
useful, if all your mutex code gets inlined, so you can count real
occurrences, rather than referenced occurrences.

In most cases, if you are grabbing a lock, it's because you have
failed to nail the problem, or you are working around having to
rewrite all of the code. Neither one of these is actually damning,
but it's good to know when you're doing it.
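
The smoothed figure of merit suggested above (a weighted moving average
of the last N snapshots of the instantaneous load) could look something
like the following. The message does not say how the snapshots are
weighted, so the linear weights here, with the newest sample counting
most, are an assumption, as are the class and method names.

```python
from collections import deque


class LoadAverage:
    """Weighted moving average over the last N instantaneous load samples."""

    def __init__(self, n=4):
        self.samples = deque(maxlen=n)  # oldest sample first, newest last

    def record(self, instantaneous_load):
        self.samples.append(instantaneous_load)

    def figure_of_merit(self):
        if not self.samples:
            return 0.0
        # Assumed linear weights 1..k, so the newest sample weighs the most.
        weights = range(1, len(self.samples) + 1)
        total = sum(w * s for w, s in zip(weights, self.samples))
        return total / sum(weights)
```

With N = 3, a single spike of 6 runnable tasks on an otherwise idle CPU
shows up as a figure of 3.0 and then decays over the next samples, so a
migration decision based on it is damped rather than triggered by every
transient.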
-- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Sat Sep 21 2: 0:17 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2991237B401; Sat, 21 Sep 2002 02:00:15 -0700 (PDT) Received: from sccrmhc01.attbi.com (sccrmhc01.attbi.com [204.127.202.61]) by mx1.FreeBSD.org (Postfix) with ESMTP id 719C843E42; Sat, 21 Sep 2002 02:00:14 -0700 (PDT) (envelope-from julian@elischer.org) Received: from InterJet.elischer.org ([12.232.206.8]) by sccrmhc01.attbi.com (InterMail vM.4.01.03.27 201-229-121-127-20010626) with ESMTP id <20020921090013.CLQU8451.sccrmhc01.attbi.com@InterJet.elischer.org>; Sat, 21 Sep 2002 09:00:13 +0000 Received: from localhost (localhost.elischer.org [127.0.0.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id BAA23690; Sat, 21 Sep 2002 01:58:53 -0700 (PDT) Date: Sat, 21 Sep 2002 01:58:52 -0700 (PDT) From: Julian Elischer To: Jon Mini Cc: Terry Lambert , freebsd-arch@FreeBSD.ORG Subject: Re: New Linux threading model In-Reply-To: <20020921062512.GB24394@elvis.mu.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Fri, 20 Sep 2002, Jon Mini wrote: > Terry, > > I would also very much like to hear your thoughts on the best > possible threading system for FreeBSD. I have read several of your > messages on the subject, and I have somewhat of an idea of the kind > of system you'd like to see us write, but a clear picture of the > overall design is lacking. > Please, write up a description of what you'd like to see. 
I'll ask > questions until I think I've got it and then paraphrase the whole > thing back at you, and we can attack it from the other direction > (you can correct where I'm wrong). > > Deal? =) This doesn't mean that "what Terry says goes" but he's been complaining about the scheduler for so long that we might as well have him start the ball rolling on discussions on the topic :-) > > Julian Elischer [julian@elischer.org] wrote : > > > Ok, Terry, > > > > I've thought of the best way that we can use your particular talents. > > > > How about this.. > > We have a particular set of scheduling requirements: > > 1/ We want threads to have as much parallelism as possible given the > > hardware > > 2/ We want a particular process to be able to contain 'subentities' > > that can be scheduled with different policies. (we call them ksegroups) > > 3/ We want simple processes to behave exactly as now, and new processes > > to compete with traditional processes on a basis SIMILAR to how > > traditional processes compete with each other. > > 4/ Partly a corollary to 3: A threaded process can not overwhelm a > > system's scheduler with many threads. (we have a structure 'kse' that > > may come in handy for this, but if you think of a better way, > > consider it..) > > 5/ Improve handling of large numbers of threads. > > 6/ If possible allow selection of scheduler algorithms at run time (*) > > > > (*) I said IF POSSIBLE.. > > > > Given these constraints, can you go through the literature, > > pick out relevant examples, > > and report back to us with scheduling schemes that may work for us. > > You are also welcome to make up your own schemes based on what you read > > or feel. > > > > I'm not saying this is how we'll go, just that it's come time that > > someone rescanned the literature again, and we could certainly do with > > a discussion on the topic. You may recruit as many "scheduler gurus" > > as you wish to help you. 
> > > > Of particular interest to you should be: > > 1/ READ THE CURRENT CODE (kern_switch.c and proc.h) > > 2/ The mach and chorus schedulers if you can find info on them > > 3/ SAs > > 4/ Linucks new scheduler. (READ THE CODE) > > 5/ the Solaris scheduler re: LWPs > > > > > > > > > > > > "Should you decide to accept this mission, the secretary will of course > > deny any knowledge of your actions. This email will self destruct in > > however long it takes you to hit the 'd' key." > > > > > > > > > > To Unsubscribe: send mail to majordomo@FreeBSD.org > > with "unsubscribe freebsd-arch" in the body of the message > > -- > Jonathan Mini > http://www.freebsd.org/ > To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Sat Sep 21 12: 1:53 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8BA9B37B40C for ; Sat, 21 Sep 2002 12:01:52 -0700 (PDT) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.FreeBSD.org (Postfix) with ESMTP id 514EB43E77 for ; Sat, 21 Sep 2002 12:01:52 -0700 (PDT) (envelope-from baka@elvis.mu.org) Received: by elvis.mu.org (Postfix, from userid 1921) id 14FAFAE22C; Sat, 21 Sep 2002 12:01:52 -0700 (PDT) Date: Sat, 21 Sep 2002 12:01:52 -0700 From: Jon Mini To: Julian Elischer Cc: Terry Lambert , freebsd-arch@FreeBSD.ORG Subject: Re: New Linux threading model Message-ID: <20020921190151.GA46099@elvis.mu.org> References: <20020921062512.GB24394@elvis.mu.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4i Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Julian Elischer [julian@elischer.org] wrote : > This doesn't mean that "what terry says goes" > > but he's 
been complaining about the scheduler for so long > that we might as well have him start the ball rolling on discussions > on the topic :-) Well, of course. =) -- Jonathan Mini http://www.freebsd.org/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Sat Sep 21 19:10:44 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.FreeBSD.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 13EB637B401 for ; Sat, 21 Sep 2002 19:10:43 -0700 (PDT) Received: from web13404.mail.yahoo.com (web13404.mail.yahoo.com [216.136.175.62]) by mx1.FreeBSD.org (Postfix) with SMTP id CFF9C43E75 for ; Sat, 21 Sep 2002 19:10:42 -0700 (PDT) (envelope-from giffunip@yahoo.com) Message-ID: <20020922021042.73571.qmail@web13404.mail.yahoo.com> Received: from [216.72.7.24] by web13404.mail.yahoo.com via HTTP; Sun, 22 Sep 2002 04:10:42 CEST Date: Sun, 22 Sep 2002 04:10:42 +0200 (CEST) From: "Pedro F. Giffuni" Subject: Loadable Scheduler (was Re: New Linux threading model) To: freebsd-arch@FreeBSD.org MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: 8bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG I know everyone is pretty much busy on other things, but since the subject of schedulers keeps coming up every once in a while, I thought I should share this document from Microsoft Research that I found while looking for information on scheduler activations: http://www.research.microsoft.com/scripts/pubs/view.asp?TR_ID=MSR-TR-98-30 "Vassal: Loadable Scheduler Support for Multi-Policy Scheduling" _________ This paper presents Vassal, a system that enables applications to dynamically load and unload CPU scheduling policies into the operating system kernel, allowing multiple policies to be in effect 
simultaneously. With Vassal, applications can utilize scheduling algorithms tailored to their specific needs and general-purpose operating systems can support a wide variety of special-purpose scheduling policies without implementing each of them as a permanent feature of the operating system. We implemented Vassal in the Windows NT 4.0 kernel. Loaded schedulers coexist with the standard Windows NT scheduler, allowing most applications to continue being scheduled as before, even while specialized scheduling is employed for applications that request it. A loaded scheduler can dynamically choose to schedule threads in its class, or can delegate their scheduling to the native scheduler, exercising as much or as little control as needed. Thus, loaded schedulers can provide scheduling facilities and behaviors not otherwise available. Our initial prototype implementation of Vassal supports two concurrent scheduling policies: a single loaded scheduler and the native scheduler. The changes we made to Windows NT were minimal and they have essentially no impact on system behavior when loadable schedulers are not in use. Furthermore, loaded schedulers operate with essentially the same efficiency as the default scheduler. An added benefit of loadable schedulers is that they enable rapid prototyping of new scheduling algorithms by often removing the time-consuming reboot step from the traditional edit/compile/reboot/debug cycle. In addition to the Vassal infrastructure, we also describe a “proof of concept” loadable real-time scheduler and performance results. Published by Advanced Computing Systems Association as: Candea, G.M. and M.B. Jones, "Vassal: loadable scheduler support for multi-policy scheduling," Proceedings of the 2nd USENIX Windows NT Symposium, USENIX Assoc, Berkeley, CA, 1998, pp. 157-166. ___________ ps. I understand the source code (for Windows NT) is available by contacting the authors. 
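To make the delegation idea in the abstract concrete, here is a rough Python sketch of one loaded policy coexisting with a native one. It is not Vassal's interface, which the abstract does not spell out; every name here (`NativeScheduler`, `EarliestDeadlineScheduler`, `Dispatcher`, the `deadline` key) is hypothetical, threads are plain dicts, and the loaded policy is an arbitrary earliest-deadline-first stand-in. The dispatcher asks the loaded scheduler first and falls back to the native one whenever the loaded scheduler declines to pick a thread.

```python
from collections import deque


class NativeScheduler:
    """Stand-in for the built-in policy: plain FIFO."""

    def __init__(self):
        self.queue = deque()

    def enqueue(self, thread):
        self.queue.append(thread)

    def pick_next(self):
        return self.queue.popleft() if self.queue else None


class EarliestDeadlineScheduler:
    """Hypothetical loaded policy: run the thread with the earliest deadline."""

    def __init__(self):
        self.threads = []

    def enqueue(self, thread):
        self.threads.append(thread)

    def pick_next(self):
        if not self.threads:
            return None  # nothing of ours to run: delegate to the native policy
        chosen = min(self.threads, key=lambda t: t["deadline"])
        self.threads.remove(chosen)
        return chosen

    def drain(self):
        """Give up all queued threads, e.g. when being unloaded."""
        remaining, self.threads = self.threads, []
        return remaining


class Dispatcher:
    """Holds the native scheduler plus at most one loaded scheduler."""

    def __init__(self):
        self.native = NativeScheduler()
        self.loaded = None

    def load(self, scheduler):
        if self.loaded is not None:
            raise RuntimeError("only one loaded scheduler is supported")
        self.loaded = scheduler

    def unload(self):
        # Hand any threads still owned by the loaded policy back to the native one.
        if self.loaded is not None:
            for thread in self.loaded.drain():
                self.native.enqueue(thread)
            self.loaded = None

    def enqueue(self, thread):
        # Threads that request the special class carry a "deadline" key.
        if self.loaded is not None and thread.get("deadline") is not None:
            self.loaded.enqueue(thread)
        else:
            self.native.enqueue(thread)

    def pick_next(self):
        if self.loaded is not None:
            thread = self.loaded.pick_next()
            if thread is not None:
                return thread
        return self.native.pick_next()
```

With nothing loaded, every thread takes the native path unchanged, which is the property requirement 3/ in Julian's list asks for; a run-time selectable scheduler in the sense of 6/ would amount to `load()` and `unload()` being callable on a live system.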