From owner-freebsd-arch Mon Mar 24 4:59:41 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 06A6137B401 for ; Mon, 24 Mar 2003 04:59:38 -0800 (PST) Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id 311B643FB1 for ; Mon, 24 Mar 2003 04:59:37 -0800 (PST) (envelope-from phk@phk.freebsd.dk) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.8/8.12.8) with ESMTP id h2OCxZhV005239 for ; Mon, 24 Mar 2003 13:59:35 +0100 (CET) (envelope-from phk@phk.freebsd.dk) To: arch@freebsd.org Subject: moving GEOM around... From: Poul-Henning Kamp Date: Mon, 24 Mar 2003 13:59:35 +0100 Message-ID: <5238.1048510775@critter.freebsd.dk> X-Spam-Status: No, hits=0.0 required=5.0 tests=none version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG A number of people have suggested that the directory layout of GEOM sources should be changed. The main complaint seems to be that sys/geom contains both subdirectories (bde) and source files. I personally don't particularly care about that, and as a matter of fact wasn't even aware that was a rule, but if a significant number of people think this is wrong I'm willing to repo-copy things around and fix it, therefore this strawpoll: Option 1: No change Option 2: sys/ geom/ infra/ geom_io.c geom_event.c ... bsd/ geom_bsd.c mbr/ geom_mbr.c sunlabel/ geom_sunlabel.c gbde/ g_bde.c g_bde_crypt.c ... ... Option 3: sys/ geom/ infra/ geom_io.c geom_event.c ... class/ # contains methods implemented in a single # source file geom_bsd.c geom_mbr.c geom_sunlabel.c ... gbde/ # classes implemented in multiple source # files get a subdirectory of their own. Straw votes in private email please... I'll draw whatever concensus opinion I can from the emails I get, and then I'll send the proposal to cvs@ who may at that time shoot it down as unnecessary repo-bloat. Poul-Henning PS: I'm not inclined to entertain a long bikeshed on this issue, on the more general topic of source tree re-layout and the need for a democratic process for determing the location of all future files in our cvs repo. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Mar 24 5:10:11 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 51DAA37B401; Mon, 24 Mar 2003 05:10:09 -0800 (PST) Received: from k6.locore.ca (k6.locore.ca [198.96.117.170]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2952943F75; Mon, 24 Mar 2003 05:10:08 -0800 (PST) (envelope-from jake@k6.locore.ca) Received: from k6.locore.ca (localhost.locore.ca [127.0.0.1]) by k6.locore.ca (8.12.8/8.12.8) with ESMTP id h2ODFcxS078938; Mon, 24 Mar 2003 08:15:38 -0500 (EST) (envelope-from jake@k6.locore.ca) Received: (from jake@localhost) by k6.locore.ca (8.12.8/8.12.8/Submit) id h2ODFcnb078937; Mon, 24 Mar 2003 08:15:38 -0500 (EST) Date: Mon, 24 Mar 2003 08:15:38 -0500 From: Jake Burkholder To: John Baldwin Cc: arch@FreeBSD.ORG Subject: Re: Convert process at_foo events to eventhandlers Message-ID: <20030324081538.Y76446@locore.ca> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: ; from jhb@FreeBSD.ORG on Fri, Mar 21, 2003 at 03:19:29PM -0500 X-Spam-Status: No, hits=-29.0 required=5.0 tests=AWL,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES, REPLY_WITH_QUOTES,USER_AGENT_MUTT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Apparently, On Fri, Mar 21, 2003 at 03:19:29PM -0500, John Baldwin said words to the effect of; > I'd like to convert the process at_fork, at_exec, and at_exit > events to be regular eventhandlers instead. This way I get to > leverage the locking of the existing eventhandlers w/o having > to duplicate it in three other places. The patch to do this is > at http://www.FreeBSD.org/~jhb/patches/proc_event.patch. > Note that the old API (at_foo, rm_at_foo) has been removed as > I can not easily implement the rm_at_foo functionality using > eventhandlers since eventhandlers allow for multiple instances > of a function in a list and use cookies instead of using the > function pointer directly to remove events. There is precedent > for this in that at_shutdown() also died when at_shutdown() was > converted to an eventhandler. This patch also defines some > generic eventhandler priorities so that users of eventhandlers > don't always have to define new constants for priorities. > > Comments? Do it! Jake To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Mar 24 7:22:32 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4BFD637B401 for ; Mon, 24 Mar 2003 07:22:27 -0800 (PST) Received: from mail.nsu.ru (mx.nsu.ru [193.124.215.71]) by mx1.FreeBSD.org (Postfix) with ESMTP id D6AA343F3F for ; Mon, 24 Mar 2003 07:22:25 -0800 (PST) (envelope-from fjoe@iclub.nsu.ru) Received: from drweb by mail.nsu.ru with drweb-scanned (Exim 3.20 #1) id 18xTmH-0006BX-00 for arch@freebsd.org; Mon, 24 Mar 2003 21:22:21 +0600 Received: from iclub.nsu.ru ([193.124.215.97] ident=root) by mail.nsu.ru with esmtp (Exim 3.20 #1) id 18xTmG-0006BH-00 for arch@freebsd.org; Mon, 24 Mar 2003 21:22:20 +0600 Received: from iclub.nsu.ru (fjoe@localhost [127.0.0.1]) by iclub.nsu.ru (8.12.8/8.12.8) with ESMTP id h2OFM8j1094599 for ; Mon, 24 Mar 2003 21:22:08 +0600 (NS) (envelope-from fjoe@iclub.nsu.ru) Received: (from fjoe@localhost) by iclub.nsu.ru (8.12.8/8.12.8/Submit) id h2OFM7MZ094596 for arch@freebsd.org; Mon, 24 Mar 2003 21:22:07 +0600 (NS) Date: Mon, 24 Mar 2003 21:22:06 +0600 From: Max Khon To: arch@freebsd.org Subject: [fjoe@iclub.nsu.ru: Re: thread-safe realpath] Message-ID: <20030324212205.A94544@iclub.nsu.ru> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i X-Envelope-To: arch@freebsd.org X-Spam-Status: No, hits=-16.1 required=5.0 tests=EMAIL_ATTRIBUTION,QUOTED_EMAIL_TEXT,USER_AGENT_MUTT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG hi, there! I am forwarding this e-mail to arch@ because standards@ has been silent on this. ----- Forwarded message from Max Khon ----- Date: Fri, 21 Mar 2003 04:32:23 +0600 From: Max Khon To: standards@freebsd.org Subject: Re: thread-safe realpath hi, there! On Fri, Mar 21, 2003 at 03:38:21AM +0600, Max Khon wrote: > Constantin Svintsoff has submitted thread-safe realpath() implementation > (implementation that does not use chdir(2)). > The implementation is feature-compatible with FreeBSD implementation, i.e. > if the last component of specified path can't be stat'ed and there is no > trailing slash, realpath succeeds. > > I fixed a couple of bugs in it and would like to commit it to HEAD > if there will be no objections. > > Test program is attached. The test simply creates two threads and calls > realpath() in each. If the test is compiled with truepath() #if-0'ed > one of the assertions fail after some time (you may need to increase > number of iterations if you have very fast machine, mine is Athlon 850). > > Any comments are highly appreciated. > Please reply directly (I am not subscribed). I have also included realpath test from glibc 2.2.2. Tarball can be found here: http://people.freebsd.org/~fjoe/realpath.tar.gz /fjoe ----- End forwarded message ----- To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Mar 24 7:46:28 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 89F4E37B401 for ; Mon, 24 Mar 2003 07:46:25 -0800 (PST) Received: from numeri.campus.luth.se (numeri.campus.luth.se [130.240.197.103]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7774F43F3F for ; Mon, 24 Mar 2003 07:46:24 -0800 (PST) (envelope-from k@numeri.campus.luth.se) Received: from numeri.campus.luth.se (localhost [127.0.0.1]) by numeri.campus.luth.se (8.12.8/8.12.7) with ESMTP id h2OFkMho052694; Mon, 24 Mar 2003 16:46:22 +0100 (CET) (envelope-from k@numeri.campus.luth.se) Received: (from k@localhost) by numeri.campus.luth.se (8.12.8/8.12.7/Submit) id h2OFkMkN052669; Mon, 24 Mar 2003 16:46:22 +0100 (CET) Date: Mon, 24 Mar 2003 16:46:21 +0100 From: Johan Karlsson To: Max Khon Cc: arch@freebsd.org, Sheldon Hearn Subject: Re: [fjoe@iclub.nsu.ru: Re: thread-safe realpath] Message-ID: <20030324154621.GA82437@numeri.campus.luth.se> References: <20030324212205.A94544@iclub.nsu.ru> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030324212205.A94544@iclub.nsu.ru> User-Agent: Mutt/1.4i X-Spam-Status: No, hits=-33.1 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, QUOTE_TWICE_1,REFERENCES,REPLY_WITH_QUOTES,USER_AGENT_MUTT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Mon, Mar 24, 2003 at 21:22 (+0600) +0000, Max Khon wrote: > On Fri, Mar 21, 2003 at 03:38:21AM +0600, Max Khon wrote: > > > Constantin Svintsoff has submitted thread-safe realpath() implementation > > (implementation that does not use chdir(2)). > > The implementation is feature-compatible with FreeBSD implementation, i.e. > > if the last component of specified path can't be stat'ed and there is no > > trailing slash, realpath succeeds. > > > > I fixed a couple of bugs in it and would like to commit it to HEAD > > if there will be no objections. > > > > Test program is attached. The test simply creates two threads and calls > > realpath() in each. If the test is compiled with truepath() #if-0'ed > > one of the assertions fail after some time (you may need to increase > > number of iterations if you have very fast machine, mine is Athlon 850). > > > > Any comments are highly appreciated. How does this affect PR 12244? I've sent a patch to -audit for review a month ago and I'm about to commit that (just doing a final make universe). http://docs.freebsd.org/cgi/getmsg.cgi?fetch=0+0+archive/2003/freebsd-audit/20030209.freebsd-audit /Johan K > > Please reply directly (I am not subscribed). > > I have also included realpath test from glibc 2.2.2. > Tarball can be found here: > > http://people.freebsd.org/~fjoe/realpath.tar.gz > > /fjoe > > ----- End forwarded message ----- > > > To Unsubscribe: send mail to majordomo@FreeBSD.org > with "unsubscribe freebsd-arch" in the body of the message > -- Johan Karlsson mailto:k@numeri.campus.luth.se To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Mar 24 8:24: 0 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id BBB4A37B401 for ; Mon, 24 Mar 2003 08:23:54 -0800 (PST) Received: from smtp-relay.omnis.com (smtp-relay.omnis.com [216.239.128.27]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1799543FAF for ; Mon, 24 Mar 2003 08:23:52 -0800 (PST) (envelope-from wes@softweyr.com) Received: from softweyr.homeunix.net (66-91-236-204.san.rr.com [66.91.236.204]) by smtp-relay.omnis.com (Postfix) with ESMTP id A737A42F96; Mon, 24 Mar 2003 08:23:49 -0800 (PST) From: Wes Peters Organization: Softweyr To: freebsd-arch@freebsd.org Subject: Patch to protect process from pageout killing Date: Mon, 24 Mar 2003 08:23:48 -0800 User-Agent: KMail/1.5 MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200303240823.48262.wes@softweyr.com> X-Spam-Status: No, hits=-6.0 required=5.0 tests=PATCH_UNIFIED_DIFF,USER_AGENT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG As promised, here's the patch to protect a process from being killed when pageout is in memory shortage. This allows a process to specify that it is important enough to be skipped when pageout is looking for the largest process to kill. My needs are simple. We make a box that is a web proxy and runs from a memory disk, using flash for permanent storage. The flash is mounted only when a configuration write is needed, the box runs from the memory disk. We've experienced a problem at certain customer sites where bind will consume a lot (~30 MB) of ram and then pageout will kill the largest process, which is usually either named or squid. This pretty much kills the box. We'd much rather have pageout kill off some of the squid worker processes, we can recover from that. Is this a good approach to the problem? Feedback welcome. --- kern/kern_resource.c.orig Sun Mar 23 22:12:55 2003 +++ kern/kern_resource.c Sun Mar 23 22:14:17 2003 @@ -562,12 +562,12 @@ } switch (which) { - case RLIMIT_CPU: mtx_lock_spin(&sched_lock); p->p_cpulimit = limp->rlim_cur; mtx_unlock_spin(&sched_lock); break; + case RLIMIT_DATA: if (limp->rlim_cur > maxdsiz) limp->rlim_cur = maxdsiz; @@ -625,6 +625,15 @@ if (limp->rlim_max < 1) limp->rlim_max = 1; break; + + case RLIMIT_PROTECT: + mtx_lock_spin(&sched_lock); + if (limp->rlim_cur) + p->p_flag |= P_PROTECTED; + else + p->p_flag &= ~P_PROTECTED; + mtx_unlock_spin(&sched_lock); + break; } *alimp = *limp; return (0); --- sys/proc.h.orig Sun Mar 23 21:36:13 2003 +++ sys/proc.h Sun Mar 23 21:37:56 2003 @@ -629,6 +629,7 @@ #define P_EXEC 0x04000 /* Process called exec. */ #define P_THREADED 0x08000 /* Process is using threads. */ #define P_CONTINUED 0x10000 /* Proc has continued from a stopped state. */ +#define P_PROTECTED 0x20000 /* Do not kill on memory overcommit. */ /* flags that control how threads may be suspended for some reason */ #define P_STOPPED_SIG 0x20000 /* Stopped due to SIGSTOP/SIGTSTP */ --- sys/resource.h.orig Sun Mar 23 22:07:50 2003 +++ sys/resource.h Sun Mar 23 22:09:45 2003 @@ -92,8 +92,9 @@ #define RLIMIT_NOFILE 8 /* number of open files */ #define RLIMIT_SBSIZE 9 /* maximum size of all socket buffers */ #define RLIMIT_VMEM 10 /* virtual process size (inclusive of mmap) */ +#define RLIMIT_PROTECT 11 /* protect process from overcommit kill */ -#define RLIM_NLIMITS 11 /* number of resource limits */ +#define RLIM_NLIMITS 12 /* number of resource limits */ #define RLIM_INFINITY ((rlim_t)(((u_quad_t)1 << 63) - 1)) @@ -115,6 +116,7 @@ "nofile", "sbsize", "vmem", + "protect", }; #endif --- vm/vm_pageout.c.orig Sun Mar 23 21:38:19 2003 +++ vm/vm_pageout.c Sun Mar 23 21:40:15 2003 @@ -1184,9 +1184,10 @@ if (PROC_TRYLOCK(p) == 0) continue; /* - * if this is a system process, skip it + * If this is a system or protected process, skip it. */ if ((p->p_flag & P_SYSTEM) || (p->p_pid == 1) || + (p->p_flag & P_PROTECTED) || ((p->p_pid < 48) && (vm_swap_size != 0))) { PROC_UNLOCK(p); continue; -- Where am I, and what am I doing in this handbasket? Wes Peters wes@softweyr.com To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Mar 24 8:32: 2 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B474637B401 for ; Mon, 24 Mar 2003 08:31:57 -0800 (PST) Received: from mail.nsu.ru (mx.nsu.ru [193.124.215.71]) by mx1.FreeBSD.org (Postfix) with ESMTP id AF40643F85 for ; Mon, 24 Mar 2003 08:31:56 -0800 (PST) (envelope-from fjoe@iclub.nsu.ru) Received: from drweb by mail.nsu.ru with drweb-scanned (Exim 3.20 #1) id 18xUqh-0000pb-00; Mon, 24 Mar 2003 22:30:59 +0600 Received: from iclub.nsu.ru ([193.124.215.97] ident=root) by mail.nsu.ru with esmtp (Exim 3.20 #1) id 18xUqg-0000oO-00; Mon, 24 Mar 2003 22:30:58 +0600 Received: from iclub.nsu.ru (fjoe@localhost [127.0.0.1]) by iclub.nsu.ru (8.12.8/8.12.8) with ESMTP id h2OGUgj1096445; Mon, 24 Mar 2003 22:30:42 +0600 (NS) (envelope-from fjoe@iclub.nsu.ru) Received: (from fjoe@localhost) by iclub.nsu.ru (8.12.8/8.12.8/Submit) id h2OGUf0g096444; Mon, 24 Mar 2003 22:30:42 +0600 (NS) Date: Mon, 24 Mar 2003 22:30:41 +0600 From: Max Khon To: Johan Karlsson Cc: arch@freebsd.org, Sheldon Hearn Subject: Re: [fjoe@iclub.nsu.ru: Re: thread-safe realpath] Message-ID: <20030324223041.A96310@iclub.nsu.ru> References: <20030324212205.A94544@iclub.nsu.ru> <20030324154621.GA82437@numeri.campus.luth.se> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20030324154621.GA82437@numeri.campus.luth.se>; from k@numeri.campus.luth.se on Mon, Mar 24, 2003 at 04:46:21PM +0100 X-Envelope-To: k@numeri.campus.luth.se, arch@freebsd.org, sheldonh@starjuice.net X-Spam-Status: No, hits=-24.6 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, QUOTE_TWICE_1,REFERENCES,REPLY_WITH_QUOTES,USER_AGENT_MUTT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG hi, there! On Mon, Mar 24, 2003 at 04:46:21PM +0100, Johan Karlsson wrote: > > > Constantin Svintsoff has submitted thread-safe realpath() implementation > > > (implementation that does not use chdir(2)). > > > The implementation is feature-compatible with FreeBSD implementation, i.e. > > > if the last component of specified path can't be stat'ed and there is no > > > trailing slash, realpath succeeds. > > > > > > I fixed a couple of bugs in it and would like to commit it to HEAD > > > if there will be no objections. > > > > > > Test program is attached. The test simply creates two threads and calls > > > realpath() in each. If the test is compiled with truepath() #if-0'ed > > > one of the assertions fail after some time (you may need to increase > > > number of iterations if you have very fast machine, mine is Athlon 850). > > > > > > Any comments are highly appreciated. > > How does this affect PR 12244? > I've sent a patch to -audit for review a month ago and I'm about > to commit that (just doing a final make universe). > http://docs.freebsd.org/cgi/getmsg.cgi?fetch=0+0+archive/2003/freebsd-audit/20030209.freebsd-audit I do not think that this patch is needed if Constantin's version will be committed. I plan to do this in about a week. /fjoe To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Mar 24 8:36:30 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 319F337B401 for ; Mon, 24 Mar 2003 08:36:25 -0800 (PST) Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id 5070043FA3 for ; Mon, 24 Mar 2003 08:36:24 -0800 (PST) (envelope-from phk@phk.freebsd.dk) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.8/8.12.8) with ESMTP id h2OGaMhV007020; Mon, 24 Mar 2003 17:36:22 +0100 (CET) (envelope-from phk@phk.freebsd.dk) To: Wes Peters Cc: freebsd-arch@freebsd.org Subject: Re: Patch to protect process from pageout killing From: "Poul-Henning Kamp" In-Reply-To: Your message of "Mon, 24 Mar 2003 08:23:48 PST." <200303240823.48262.wes@softweyr.com> Date: Mon, 24 Mar 2003 17:36:22 +0100 Message-ID: <7019.1048523782@critter.freebsd.dk> X-Spam-Status: No, hits=-9.7 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REPLY_WITH_QUOTES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG In message <200303240823.48262.wes@softweyr.com>, Wes Peters writes: >As promised, here's the patch to protect a process from being killed when >pageout is in memory shortage. This allows a process to specify that it >is important enough to be skipped when pageout is looking for the largest >process to kill. > >My needs are simple. We make a box that is a web proxy and runs from a >memory disk, using flash for permanent storage. The flash is mounted >only when a configuration write is needed, the box runs from the memory >disk. We've experienced a problem at certain customer sites where bind >will consume a lot (~30 MB) of ram and then pageout will kill the largest >process, which is usually either named or squid. This pretty much kills >the box. We'd much rather have pageout kill off some of the squid worker >processes, we can recover from that. > >Is this a good approach to the problem? Feedback welcome. (Ignoring the white-space change) I can certainly see the point, but I'm not sure this is the way. I am not sure that we want to use the resource limits facility for booleans, some of the logic sourounding the suser checks may not hold tight. Also, doesn't this result in the flag being inerited with fork() and thereby negating the effect you are seeking for squid ? Poul-Henning > >--- kern/kern_resource.c.orig Sun Mar 23 22:12:55 2003 >+++ kern/kern_resource.c Sun Mar 23 22:14:17 2003 >@@ -562,12 +562,12 @@ > } > > switch (which) { >- > case RLIMIT_CPU: > mtx_lock_spin(&sched_lock); > p->p_cpulimit = limp->rlim_cur; > mtx_unlock_spin(&sched_lock); > break; >+ > case RLIMIT_DATA: > if (limp->rlim_cur > maxdsiz) > limp->rlim_cur = maxdsiz; >@@ -625,6 +625,15 @@ > if (limp->rlim_max < 1) > limp->rlim_max = 1; > break; >+ >+ case RLIMIT_PROTECT: >+ mtx_lock_spin(&sched_lock); >+ if (limp->rlim_cur) >+ p->p_flag |= P_PROTECTED; >+ else >+ p->p_flag &= ~P_PROTECTED; >+ mtx_unlock_spin(&sched_lock); >+ break; > } > *alimp = *limp; > return (0); >--- sys/proc.h.orig Sun Mar 23 21:36:13 2003 >+++ sys/proc.h Sun Mar 23 21:37:56 2003 >@@ -629,6 +629,7 @@ > #define P_EXEC 0x04000 /* Process called exec. */ > #define P_THREADED 0x08000 /* Process is using threads. */ > #define P_CONTINUED 0x10000 /* Proc has continued from a stopped state. >*/ >+#define P_PROTECTED 0x20000 /* Do not kill on memory overcommit. */ > > /* flags that control how threads may be suspended for some reason */ > #define P_STOPPED_SIG 0x20000 /* Stopped due to SIGSTOP/SIGTSTP */ >--- sys/resource.h.orig Sun Mar 23 22:07:50 2003 >+++ sys/resource.h Sun Mar 23 22:09:45 2003 >@@ -92,8 +92,9 @@ > #define RLIMIT_NOFILE 8 /* number of open files */ > #define RLIMIT_SBSIZE 9 /* maximum size of all socket buffers */ > #define RLIMIT_VMEM 10 /* virtual process size (inclusive of mmap) */ >+#define RLIMIT_PROTECT 11 /* protect process from overcommit kill */ > >-#define RLIM_NLIMITS 11 /* number of resource limits */ >+#define RLIM_NLIMITS 12 /* number of resource limits */ > > #define RLIM_INFINITY ((rlim_t)(((u_quad_t)1 << 63) - 1)) > >@@ -115,6 +116,7 @@ > "nofile", > "sbsize", > "vmem", >+ "protect", > }; > #endif > >--- vm/vm_pageout.c.orig Sun Mar 23 21:38:19 2003 >+++ vm/vm_pageout.c Sun Mar 23 21:40:15 2003 >@@ -1184,9 +1184,10 @@ > if (PROC_TRYLOCK(p) == 0) > continue; > /* >- * if this is a system process, skip it >+ * If this is a system or protected process, skip it. > */ > if ((p->p_flag & P_SYSTEM) || (p->p_pid == 1) || >+ (p->p_flag & P_PROTECTED) || > ((p->p_pid < 48) && (vm_swap_size != 0))) { > PROC_UNLOCK(p); > continue; > > >-- > > Where am I, and what am I doing in this handbasket? > >Wes Peters wes@softweyr.com > > -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Mar 24 11: 2:55 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3949037B4A4 for ; Mon, 24 Mar 2003 11:02:52 -0800 (PST) Received: from uitm.zenon.net (uitm.zenon.net [195.2.69.86]) by mx1.FreeBSD.org (Postfix) with ESMTP id 809E843FDF for ; Mon, 24 Mar 2003 11:02:45 -0800 (PST) (envelope-from uitm@zenon.net) From: Andrey Alekseyev Message-Id: <200303241902.h2OJ2a252708@uitm.zenon.net> Subject: Re: Patch to protect process from pageout killing In-Reply-To: <200303240823.48262.wes@softweyr.com> from Wes Peters at "Mar 24, 2003 08:23:48 am" To: Wes Peters Date: Mon, 24 Mar 2003 22:02:36 +0300 (MSK) Cc: freebsd-arch@freebsd.org X-Mailer: ELM [version 2.4ME+ PL61 (25)] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Spam-Status: No, hits=-3.3 required=5.0 tests=IN_REP_TO autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG > As promised, here's the patch to protect a process from being killed when > pageout is in memory shortage. This allows a process to specify that it Just in case, anyone gets interested, here is another one I made about two years ago for our own needs (mass web-hosting, etc.) http://www.blackflag.ru/patches/vm_pageout.c.diff :) Allows to specify "safe" process(es) names in a sysctl variable. Doesn't touch root processes (that's what I needed as well) and sends SIGKILL if process is not willing to terminate. I recall Matt Dillon had some very useful comments about the possibility of further development of such features. Like some preferences of what processes to kill first (some other criteria in addition to the process size). -- Andrey Alekseyev. Zenon N.S.P. Senior Unix systems administrator To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Mar 24 11: 9:12 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C10A537B408 for ; Mon, 24 Mar 2003 11:09:05 -0800 (PST) Received: from mail.speakeasy.net (mail17.speakeasy.net [216.254.0.217]) by mx1.FreeBSD.org (Postfix) with ESMTP id 27BA743FB1 for ; Mon, 24 Mar 2003 11:09:05 -0800 (PST) (envelope-from jhb@FreeBSD.org) Received: (qmail 19059 invoked from network); 24 Mar 2003 19:09:10 -0000 Received: from unknown (HELO server.baldwin.cx) ([216.27.160.63]) (envelope-sender ) by mail17.speakeasy.net (qmail-ldap-1.03) with DES-CBC3-SHA encrypted SMTP for ; 24 Mar 2003 19:09:10 -0000 Received: from laptop.baldwin.cx (gw1.twc.weather.com [216.133.140.1]) by server.baldwin.cx (8.12.8/8.12.8) with ESMTP id h2OJ92Ov093096; Mon, 24 Mar 2003 14:09:02 -0500 (EST) (envelope-from jhb@FreeBSD.org) Message-ID: X-Mailer: XFMail 1.5.4 on FreeBSD X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: <200303240823.48262.wes@softweyr.com> Date: Mon, 24 Mar 2003 14:09:02 -0500 (EST) From: John Baldwin To: Wes Peters Subject: RE: Patch to protect process from pageout killing Cc: freebsd-arch@freebsd.org X-Spam-Status: No, hits=-19.5 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REPLY_WITH_QUOTES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On 24-Mar-2003 Wes Peters wrote: > As promised, here's the patch to protect a process from being killed when > pageout is in memory shortage. This allows a process to specify that it > is important enough to be skipped when pageout is looking for the largest > process to kill. > > My needs are simple. We make a box that is a web proxy and runs from a > memory disk, using flash for permanent storage. The flash is mounted > only when a configuration write is needed, the box runs from the memory > disk. We've experienced a problem at certain customer sites where bind > will consume a lot (~30 MB) of ram and then pageout will kill the largest > process, which is usually either named or squid. This pretty much kills > the box. We'd much rather have pageout kill off some of the squid worker > processes, we can recover from that. > > Is this a good approach to the problem? Feedback welcome. I think that adopting the SIGDANGER approach would be better rather than rolling our own private interface. > @@ -625,6 +625,15 @@ > if (limp->rlim_max < 1) > limp->rlim_max = 1; > break; > + > + case RLIMIT_PROTECT: > + mtx_lock_spin(&sched_lock); > + if (limp->rlim_cur) > + p->p_flag |= P_PROTECTED; > + else > + p->p_flag &= ~P_PROTECTED; > + mtx_unlock_spin(&sched_lock); > + break; p_flag is protected by PROC_LOCK, not sched_lock. -- John Baldwin <>< http://www.FreeBSD.org/~jhb/ "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Mar 24 13:35:29 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0313E37B404 for ; Mon, 24 Mar 2003 13:35:26 -0800 (PST) Received: from dan.emsphone.com (dan.emsphone.com [199.67.51.101]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1E10143F75 for ; Mon, 24 Mar 2003 13:35:25 -0800 (PST) (envelope-from dan@dan.emsphone.com) Received: (from dan@localhost) by dan.emsphone.com (8.12.7/8.12.7) id h2OLZKPB045421; Mon, 24 Mar 2003 15:35:20 -0600 (CST) (envelope-from dan) Date: Mon, 24 Mar 2003 15:35:20 -0600 From: Dan Nelson To: Poul-Henning Kamp Cc: Wes Peters , freebsd-arch@FreeBSD.ORG Subject: Re: Patch to protect process from pageout killing Message-ID: <20030324213519.GA63147@dan.emsphone.com> References: <200303240823.48262.wes@softweyr.com> <7019.1048523782@critter.freebsd.dk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <7019.1048523782@critter.freebsd.dk> X-OS: FreeBSD 5.0-CURRENT X-message-flag: Outlook Error User-Agent: Mutt/1.5.4i X-Spam-Status: No, hits=-26.0 required=5.0 tests=IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES,REPLY_WITH_QUOTES, USER_AGENT_MUTT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG In the last episode (Mar 24), Poul-Henning Kamp said: > In message <200303240823.48262.wes@softweyr.com>, Wes Peters writes: > > As promised, here's the patch to protect a process from being > > killed when pageout is in memory shortage. This allows a process > > to specify that it is important enough to be skipped when pageout > > is looking for the largest process to kill. > > > > My needs are simple. We make a box that is a web proxy and runs > > from a memory disk, using flash for permanent storage. The flash > > is mounted only when a configuration write is needed, the box runs > > from the memory disk. We've experienced a problem at certain > > customer sites where bind will consume a lot (~30 MB) of ram and > > then pageout will kill the largest process, which is usually either > > named or squid. This pretty much kills the box. We'd much rather > > have pageout kill off some of the squid worker processes, we can > > recover from that. > > > > Is this a good approach to the problem? Feedback welcome. > > I can certainly see the point, but I'm not sure this is the way. > > I am not sure that we want to use the resource limits facility for > booleans, some of the logic sourounding the suser checks may not hold > tight. How about changing the kill logic to look at RLIMIT_RSS? The process exceeding its limit by the largest amount gets killed. That way you can exempt certain processes by raising their limit. Set named's limit to say 10MB, and when memory gets tight the system will see it's exceeding its quota by 20MB and kill it first. -- Dan Nelson dnelson@allantgroup.com To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Mar 24 17:23:13 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id EAF6937B404 for ; Mon, 24 Mar 2003 17:23:05 -0800 (PST) Received: from HAL9000.homeunix.com (12-233-57-131.client.attbi.com [12.233.57.131]) by mx1.FreeBSD.org (Postfix) with ESMTP id ED38A43FAF for ; Mon, 24 Mar 2003 17:23:04 -0800 (PST) (envelope-from das@FreeBSD.ORG) Received: from HAL9000.homeunix.com (localhost [127.0.0.1]) by HAL9000.homeunix.com (8.12.6/8.12.5) with ESMTP id h2P1N4ah004584; Mon, 24 Mar 2003 17:23:04 -0800 (PST) (envelope-from das@FreeBSD.ORG) Received: (from das@localhost) by HAL9000.homeunix.com (8.12.6/8.12.5/Submit) id h2P1N3Ua004583; Mon, 24 Mar 2003 17:23:03 -0800 (PST) (envelope-from das@FreeBSD.ORG) Date: Mon, 24 Mar 2003 17:23:03 -0800 From: David Schultz To: Wes Peters Cc: freebsd-arch@FreeBSD.ORG Subject: Re: Patch to protect process from pageout killing Message-ID: <20030325012303.GA4406@HAL9000.homeunix.com> Mail-Followup-To: Wes Peters , freebsd-arch@FreeBSD.ORG References: <200303240823.48262.wes@softweyr.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200303240823.48262.wes@softweyr.com> X-Spam-Status: No, hits=-19.6 required=5.0 tests=IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES,REPLY_WITH_QUOTES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Thus spake Wes Peters : > As promised, here's the patch to protect a process from being killed when > pageout is in memory shortage. This allows a process to specify that it > is important enough to be skipped when pageout is looking for the largest > process to kill. > > My needs are simple. We make a box that is a web proxy and runs from a > memory disk, using flash for permanent storage. The flash is mounted > only when a configuration write is needed, the box runs from the memory > disk. We've experienced a problem at certain customer sites where bind > will consume a lot (~30 MB) of ram and then pageout will kill the largest > process, which is usually either named or squid. This pretty much kills > the box. We'd much rather have pageout kill off some of the squid worker > processes, we can recover from that. Very nice. Inheritance of this attribute seems to be a contentious issue. Making inheritance tunable might be a good idea. You wouldn't be able to piggyback on rlimit, though. There's a significant userland component of this as well, although that's probably a job for another day. It basically consists of making it possible to specify that certain standard system daemons should have this attribute. > + (p->p_flag & P_PROTECTED) || > ((p->p_pid < 48) && (vm_swap_size != 0))) { > PROC_UNLOCK(p); > continue; The pid < 48 magic can probably go away, while you're at it. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Mar 24 17:28:52 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 58EA037B401 for ; Mon, 24 Mar 2003 17:28:50 -0800 (PST) Received: from HAL9000.homeunix.com (12-233-57-131.client.attbi.com [12.233.57.131]) by mx1.FreeBSD.org (Postfix) with ESMTP id 92F3643FA3 for ; Mon, 24 Mar 2003 17:28:49 -0800 (PST) (envelope-from das@FreeBSD.ORG) Received: from HAL9000.homeunix.com (localhost [127.0.0.1]) by HAL9000.homeunix.com (8.12.6/8.12.5) with ESMTP id h2P1Siah004601; Mon, 24 Mar 2003 17:28:44 -0800 (PST) (envelope-from das@FreeBSD.ORG) Received: (from das@localhost) by HAL9000.homeunix.com (8.12.6/8.12.5/Submit) id h2P1SiZ1004600; Mon, 24 Mar 2003 17:28:44 -0800 (PST) (envelope-from das@FreeBSD.ORG) Date: Mon, 24 Mar 2003 17:28:44 -0800 From: David Schultz To: Dan Nelson Cc: Poul-Henning Kamp , Wes Peters , freebsd-arch@FreeBSD.ORG Subject: Re: Patch to protect process from pageout killing Message-ID: <20030325012844.GB4406@HAL9000.homeunix.com> Mail-Followup-To: Dan Nelson , Poul-Henning Kamp , Wes Peters , freebsd-arch@FreeBSD.ORG References: <200303240823.48262.wes@softweyr.com> <7019.1048523782@critter.freebsd.dk> <20030324213519.GA63147@dan.emsphone.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030324213519.GA63147@dan.emsphone.com> X-Spam-Status: No, hits=-19.6 required=5.0 tests=AWL,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES, REPLY_WITH_QUOTES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Thus spake Dan Nelson : > In the last episode (Mar 24), Poul-Henning Kamp said: > > In message <200303240823.48262.wes@softweyr.com>, Wes Peters writes: > > > As promised, here's the patch to protect a process from being > > > killed when pageout is in memory shortage. This allows a process > > > to specify that it is important enough to be skipped when pageout > > > is looking for the largest process to kill. > > > > > > My needs are simple. We make a box that is a web proxy and runs > > > from a memory disk, using flash for permanent storage. The flash > > > is mounted only when a configuration write is needed, the box runs > > > from the memory disk. We've experienced a problem at certain > > > customer sites where bind will consume a lot (~30 MB) of ram and > > > then pageout will kill the largest process, which is usually either > > > named or squid. This pretty much kills the box. We'd much rather > > > have pageout kill off some of the squid worker processes, we can > > > recover from that. > > > > > > Is this a good approach to the problem? Feedback welcome. > > > > I can certainly see the point, but I'm not sure this is the way. > > > > I am not sure that we want to use the resource limits facility for > > booleans, some of the logic sourounding the suser checks may not hold > > tight. > > How about changing the kill logic to look at RLIMIT_RSS? The process > exceeding its limit by the largest amount gets killed. That way you > can exempt certain processes by raising their limit. Set named's limit > to say 10MB, and when memory gets tight the system will see it's > exceeding its quota by 20MB and kill it first. I think that's overengineering the problem. First of all, it means that on any system where RLIMIT_RSS is unlimited by default, the machine now deadlocks when it runs out of memory. Second, it is only marginally useful to go as far as specifying priorities and quotas and such on process killability. Most of the time, people can divide the processes on thier system into two categories: critical and killable. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Mar 24 17:52:47 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6B15137B404; Mon, 24 Mar 2003 17:52:44 -0800 (PST) Received: from mail01.stbernard.com (mail01.stbernard.com [64.154.93.162]) by mx1.FreeBSD.org (Postfix) with ESMTP id 356EB43F75; Mon, 24 Mar 2003 17:52:41 -0800 (PST) (envelope-from wes@softweyr.com) Received: from salty.rapid.stbernard.com ([192.168.4.61]) by mail01.stbernard.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 24 Mar 2003 17:52:40 -0800 From: Wes Peters Organization: Softweyr.com To: John Baldwin Subject: Re: Patch to protect process from pageout killing Date: Mon, 24 Mar 2003 17:52:40 -0800 User-Agent: KMail/1.5 Cc: freebsd-arch@freebsd.org References: In-Reply-To: X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200303241752.40245.wes@softweyr.com> X-OriginalArrivalTime: 25 Mar 2003 01:52:40.0618 (UTC) FILETIME=[3818F0A0:01C2F271] X-Spam-Status: No, hits=-26.0 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, QUOTE_TWICE_1,RCVD_IN_UNCONFIRMED_DSBL,REFERENCES, REPLY_WITH_QUOTES,USER_AGENT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Monday 24 March 2003 11:09, John Baldwin wrote: > On 24-Mar-2003 Wes Peters wrote: > > As promised, here's the patch to protect a process from being > > killed when pageout is in memory shortage. This allows a process > > to specify that it is important enough to be skipped when pageout > > is looking for the largest process to kill. > > > > My needs are simple. We make a box that is a web proxy and runs > > from a memory disk, using flash for permanent storage. The flash > > is mounted only when a configuration write is needed, the box runs > > from the memory disk. We've experienced a problem at certain > > customer sites where bind will consume a lot (~30 MB) of ram and > > then pageout will kill the largest process, which is usually either > > named or squid. This pretty much kills the box. We'd much rather > > have pageout kill off some of the squid worker processes, we can > > recover from that. > > > > Is this a good approach to the problem? Feedback welcome. > > I think that adopting the SIGDANGER approach would be better rather > than rolling our own private interface. It's not clear to me the SIGDANGER interface allows me to say "go elsewhere bub, I'm really important." In this case, that is essential. I think even in the general FreeBSD case you can make a point for a setting like this in, say, named. The SIGDANGER interface worries me in general, partly because it's a signal and partly because it complicates the design of EVERYTHING just to handle it. I guess a lot depends on the implementation details of how SIGDANGER and the default handlers are designed, but nothing I saw last week gave me a warm fuzzy about that. > > @@ -625,6 +625,15 @@ > > if (limp->rlim_max < 1) > > limp->rlim_max = 1; > > break; > > + > > + case RLIMIT_PROTECT: > > + mtx_lock_spin(&sched_lock); > > + if (limp->rlim_cur) > > + p->p_flag |= P_PROTECTED; > > + else > > + p->p_flag &= ~P_PROTECTED; > > + mtx_unlock_spin(&sched_lock); > > + break; > > p_flag is protected by PROC_LOCK, not sched_lock. Gurk! Will fix. -- "Where am I, and what am I doing in this handbasket?" Wes Peters wes@softweyr.com To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Mar 24 17:59:42 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6556537B401 for ; Mon, 24 Mar 2003 17:59:39 -0800 (PST) Received: from mail01.stbernard.com (mail01.stbernard.com [64.154.93.162]) by mx1.FreeBSD.org (Postfix) with ESMTP id E442E43F85 for ; Mon, 24 Mar 2003 17:59:37 -0800 (PST) (envelope-from wes@softweyr.com) Received: from salty.rapid.stbernard.com ([192.168.4.61]) by mail01.stbernard.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 24 Mar 2003 17:59:36 -0800 From: Wes Peters Organization: Softweyr.com To: Dan Nelson , Poul-Henning Kamp Subject: Re: Patch to protect process from pageout killing Date: Mon, 24 Mar 2003 17:59:36 -0800 User-Agent: KMail/1.5 Cc: freebsd-arch@FreeBSD.ORG References: <200303240823.48262.wes@softweyr.com> <7019.1048523782@critter.freebsd.dk> <20030324213519.GA63147@dan.emsphone.com> In-Reply-To: <20030324213519.GA63147@dan.emsphone.com> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200303241759.36410.wes@softweyr.com> X-OriginalArrivalTime: 25 Mar 2003 01:59:36.0680 (UTC) FILETIME=[3016F680:01C2F272] X-Spam-Status: No, hits=-25.7 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, RCVD_IN_UNCONFIRMED_DSBL,REFERENCES,REPLY_WITH_QUOTES, USER_AGENT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Monday 24 March 2003 13:35, Dan Nelson wrote: > In the last episode (Mar 24), Poul-Henning Kamp said: > > In message <200303240823.48262.wes@softweyr.com>, Wes Peters writes: > > > As promised, here's the patch to protect a process from being > > > killed when pageout is in memory shortage. This allows a process > > > to specify that it is important enough to be skipped when pageout > > > is looking for the largest process to kill. > > > > > > My needs are simple. We make a box that is a web proxy and runs > > > from a memory disk, using flash for permanent storage. The flash > > > is mounted only when a configuration write is needed, the box > > > runs from the memory disk. We've experienced a problem at > > > certain customer sites where bind will consume a lot (~30 MB) of > > > ram and then pageout will kill the largest process, which is > > > usually either named or squid. This pretty much kills the box. > > > We'd much rather have pageout kill off some of the squid worker > > > processes, we can recover from that. > > > > > > Is this a good approach to the problem? Feedback welcome. > > > > I can certainly see the point, but I'm not sure this is the way. > > > > I am not sure that we want to use the resource limits facility for > > booleans, some of the logic sourounding the suser checks may not > > hold tight. > > How about changing the kill logic to look at RLIMIT_RSS? The process > exceeding its limit by the largest amount gets killed. That way you > can exempt certain processes by raising their limit. Set named's > limit to say 10MB, and when memory gets tight the system will see > it's exceeding its quota by 20MB and kill it first. Mostly because it's not possible to predict what named's RSS will be in any particular customer installation. The ones that raised this issue were at 32MB and stable, and took about 9 days to get there. We don't want named (or squid) to die under ANY circumstances; if the box can't run both named and squid it's effectively a brick. On the other hand, we have lots (hundreds) of other smaller processes running, any one of which is expendable and can be recovered from. Yeah, better ability to adapt to the (memory) load would be perhaps a better way to do this, but I really hate the idea of dumping named on it's head and restarting 9 days of learning just because we're getting hammered by people checking the weather and traffic before heading home. -- "Where am I, and what am I doing in this handbasket?" Wes Peters wes@softweyr.com To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Mar 24 18: 5:44 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 75D1037B404 for ; Mon, 24 Mar 2003 18:05:41 -0800 (PST) Received: from mail01.stbernard.com (mail01.stbernard.com [64.154.93.162]) by mx1.FreeBSD.org (Postfix) with ESMTP id 00AFA43F93 for ; Mon, 24 Mar 2003 18:05:40 -0800 (PST) (envelope-from wes@softweyr.com) Received: from salty.rapid.stbernard.com ([192.168.4.61]) by mail01.stbernard.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 24 Mar 2003 18:05:38 -0800 From: Wes Peters Organization: Softweyr.com To: "Poul-Henning Kamp" Subject: Re: Patch to protect process from pageout killing Date: Mon, 24 Mar 2003 18:05:38 -0800 User-Agent: KMail/1.5 Cc: freebsd-arch@freebsd.org References: <7019.1048523782@critter.freebsd.dk> In-Reply-To: <7019.1048523782@critter.freebsd.dk> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200303241805.38175.wes@softweyr.com> X-OriginalArrivalTime: 25 Mar 2003 02:05:38.0430 (UTC) FILETIME=[07B5A1E0:01C2F273] X-Spam-Status: No, hits=-25.6 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, RCVD_IN_UNCONFIRMED_DSBL,REFERENCES,REPLY_WITH_QUOTES, USER_AGENT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Monday 24 March 2003 08:36, Poul-Henning Kamp wrote: > In message <200303240823.48262.wes@softweyr.com>, Wes Peters writes: > >As promised, here's the patch to protect a process from being killed > > when pageout is in memory shortage. This allows a process to > > specify that it is important enough to be skipped when pageout is > > looking for the largest process to kill. > > > >My needs are simple. We make a box that is a web proxy and runs > > from a memory disk, using flash for permanent storage. The flash > > is mounted only when a configuration write is needed, the box runs > > from the memory disk. We've experienced a problem at certain > > customer sites where bind will consume a lot (~30 MB) of ram and > > then pageout will kill the largest process, which is usually either > > named or squid. This pretty much kills the box. We'd much rather > > have pageout kill off some of the squid worker processes, we can > > recover from that. > > > >Is this a good approach to the problem? Feedback welcome. > > (Ignoring the white-space change) OK, I put them back so the function will be inconsistent again. ;^) They accidentally got shuffled when I move my implementation from just below RLIMIT_CPU (from which it obviously and erroneously heavily borrowed) to put it in numerical order. > I can certainly see the point, but I'm not sure this is the way. > > I am not sure that we want to use the resource limits facility for > booleans, some of the logic sourounding the suser checks may not > hold tight. I had concerns about that as well. In the original (4.4 roughly) implementation I used madvise as the interface, but the madvise interface has changed greatly. It didn't seem worthwhile adding a syscall for this task, so I looked for another reasonable protected interface to ab(use). I'm 100% open to suggestions on the API. > Also, doesn't this result in the flag being inerited with fork() and > thereby negating the effect you are seeking for squid ? I looked through all the places in kern_fork.c where p2->p_flag gets set and didn't see anything that looked like it would inherit P_PROTECTED from p1->p_flag. Did I miss something? I'm obviously a bit of a neophyte in this part of the kernel. -- "Where am I, and what am I doing in this handbasket?" Wes Peters wes@softweyr.com To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Mar 24 18:12:59 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 896A337B401; Mon, 24 Mar 2003 18:12:56 -0800 (PST) Received: from mail01.stbernard.com (mail01.stbernard.com [64.154.93.162]) by mx1.FreeBSD.org (Postfix) with ESMTP id D548543F3F; Mon, 24 Mar 2003 18:12:55 -0800 (PST) (envelope-from wes@softweyr.com) Received: from salty.rapid.stbernard.com ([192.168.4.61]) by mail01.stbernard.com with Microsoft SMTPSVC(5.0.2195.5329); Mon, 24 Mar 2003 18:12:55 -0800 From: Wes Peters Organization: Softweyr.com To: David Schultz Subject: Re: Patch to protect process from pageout killing Date: Mon, 24 Mar 2003 18:12:55 -0800 User-Agent: KMail/1.5 Cc: freebsd-arch@FreeBSD.ORG References: <200303240823.48262.wes@softweyr.com> <20030325012303.GA4406@HAL9000.homeunix.com> In-Reply-To: <20030325012303.GA4406@HAL9000.homeunix.com> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200303241812.55290.wes@softweyr.com> X-OriginalArrivalTime: 25 Mar 2003 02:12:55.0571 (UTC) FILETIME=[0C440E30:01C2F274] X-Spam-Status: No, hits=-25.8 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, QUOTE_TWICE_1,RCVD_IN_UNCONFIRMED_DSBL,REFERENCES, REPLY_WITH_QUOTES,USER_AGENT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Monday 24 March 2003 17:23, David Schultz wrote: > Thus spake Wes Peters : > > As promised, here's the patch to protect a process from being > > killed when pageout is in memory shortage. This allows a process > > to specify that it is important enough to be skipped when pageout > > is looking for the largest process to kill. > > > > My needs are simple. We make a box that is a web proxy and runs > > from a memory disk, using flash for permanent storage. The flash > > is mounted only when a configuration write is needed, the box runs > > from the memory disk. We've experienced a problem at certain > > customer sites where bind will consume a lot (~30 MB) of ram and > > then pageout will kill the largest process, which is usually either > > named or squid. This pretty much kills the box. We'd much rather > > have pageout kill off some of the squid worker processes, we can > > recover from that. > > Very nice. Inheritance of this attribute seems to be a > contentious issue. Making inheritance tunable might be a good > idea. You wouldn't be able to piggyback on rlimit, though. Actually inheritance was unintentional, I'm waiting for feedback on what I should've done to make it not inherited. Any help you can offer will be appreciated. > There's a significant userland component of this as well, although > that's probably a job for another day. It basically consists of > making it possible to specify that certain standard system daemons > should have this attribute. Yup. > > + (p->p_flag & P_PROTECTED) || > > ((p->p_pid < 48) && (vm_swap_size != 0))) { > > PROC_UNLOCK(p); > > continue; > > The pid < 48 magic can probably go away, while you're at it. I'd be happy to -- that 48 makes me nervous -- if a couple more Really Smart(tm) guys say it's OK. ;^) -- "Where am I, and what am I doing in this handbasket?" Wes Peters wes@softweyr.com To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Mar 24 19:24:26 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E097E37B43A for ; Mon, 24 Mar 2003 19:24:21 -0800 (PST) Received: from dan.emsphone.com (dan.emsphone.com [199.67.51.101]) by mx1.FreeBSD.org (Postfix) with ESMTP id 31CF343FB1 for ; Mon, 24 Mar 2003 19:24:21 -0800 (PST) (envelope-from dan@dan.emsphone.com) Received: (from dan@localhost) by dan.emsphone.com (8.12.7/8.12.7) id h2P3OKhi096231; Mon, 24 Mar 2003 21:24:20 -0600 (CST) (envelope-from dan) Date: Mon, 24 Mar 2003 21:24:20 -0600 From: Dan Nelson To: Poul-Henning Kamp , Wes Peters , freebsd-arch@FreeBSD.ORG Subject: Re: Patch to protect process from pageout killing Message-ID: <20030325032420.GA22424@dan.emsphone.com> References: <200303240823.48262.wes@softweyr.com> <7019.1048523782@critter.freebsd.dk> <20030324213519.GA63147@dan.emsphone.com> <20030325012844.GB4406@HAL9000.homeunix.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030325012844.GB4406@HAL9000.homeunix.com> X-OS: FreeBSD 5.0-CURRENT X-message-flag: Outlook Error User-Agent: Mutt/1.5.4i X-Spam-Status: No, hits=-26.0 required=5.0 tests=IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES,REPLY_WITH_QUOTES, USER_AGENT_MUTT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG In the last episode (Mar 24), David Schultz said: > Thus spake Dan Nelson : > > How about changing the kill logic to look at RLIMIT_RSS? The > > process exceeding its limit by the largest amount gets killed. > > That way you can exempt certain processes by raising their limit. > > Set named's limit to say 10MB, and when memory gets tight the > > system will see it's exceeding its quota by 20MB and kill it first. > > I think that's overengineering the problem. First of all, it means > that on any system where RLIMIT_RSS is unlimited by default, the > machine now deadlocks when it runs out of memory. Second, it is only > marginally useful to go as far as specifying priorities and quotas > and such on process killability. Most of the time, people can divide > the processes on thier system into two categories: critical and > killable. RSS overcommit would be the first sort priority. If nothing is over its limit, you fall back on the old "biggest process dies" rule. Set the critical processes at an infinite RSS, set the killable ones at a reasonable RSS, set your cannon fodder processes at 0. The default RSS is infinity so you get classic behaviour. In the embedded server case, there's no swap space so the RSS limit isn't even being used anyway. There is still the inheritance problem, though, so a RSS=inf daemon would have to manually set the rlimit to 0 after forking a killable process. -- Dan Nelson dnelson@allantgroup.com To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Mar 24 20:22:56 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B9B0D37B404; Mon, 24 Mar 2003 20:22:52 -0800 (PST) Received: from smtp1.server.rpi.edu (smtp1.server.rpi.edu [128.113.2.1]) by mx1.FreeBSD.org (Postfix) with ESMTP id 74C7D43F85; Mon, 24 Mar 2003 20:22:49 -0800 (PST) (envelope-from drosih@rpi.edu) Received: from [128.113.24.47] (gilead.netel.rpi.edu [128.113.24.47]) by smtp1.server.rpi.edu (8.12.8/8.12.7) with ESMTP id h2P4MiBg001437; Mon, 24 Mar 2003 23:22:44 -0500 Mime-Version: 1.0 X-Sender: drosih@mail.rpi.edu Message-Id: In-Reply-To: <20030325012844.GB4406@HAL9000.homeunix.com> References: <200303240823.48262.wes@softweyr.com> <7019.1048523782@critter.freebsd.dk> <20030324213519.GA63147@dan.emsphone.com> <20030325012844.GB4406@HAL9000.homeunix.com> Date: Mon, 24 Mar 2003 23:22:43 -0500 To: David Schultz , Dan Nelson From: Garance A Drosihn Subject: Re: Patch to protect process from pageout killing Cc: Poul-Henning Kamp , Wes Peters , freebsd-arch@FreeBSD.ORG Content-Type: text/plain; charset="us-ascii" ; format="flowed" X-Scanned-By: MIMEDefang 2.28 X-Spam-Status: No, hits=-24.5 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REFERENCES,REPLY_WITH_QUOTES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG At 5:28 PM -0800 3/24/03, David Schultz wrote: > Second, it is only marginally useful to go as far as specifying >priorities and quotas and such on process killability. Most of >the time, people can divide the processes on thier system into >two categories: critical and killable. While that's probably true "most of the time", I think we'd want to encourage three categories. critical, less-critical (killable), and kill-me-first. That's what SIGDANGER provides, and in some situations that third category is very desirable. -- Garance Alistair Drosehn = gad@gilead.netel.rpi.edu Senior Systems Programmer or gad@freebsd.org Rensselaer Polytechnic Institute or drosih@rpi.edu To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Mar 24 20:53:36 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 55CF337B401; Mon, 24 Mar 2003 20:53:33 -0800 (PST) Received: from smtp2.server.rpi.edu (smtp2.server.rpi.edu [128.113.2.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id 05DF843F3F; Mon, 24 Mar 2003 20:53:32 -0800 (PST) (envelope-from drosih@rpi.edu) Received: from [128.113.24.47] (gilead.netel.rpi.edu [128.113.24.47]) by smtp2.server.rpi.edu (8.12.8/8.12.7) with ESMTP id h2P4rUn6014574; Mon, 24 Mar 2003 23:53:30 -0500 Mime-Version: 1.0 X-Sender: drosih@mail.rpi.edu Message-Id: In-Reply-To: <200303241752.40245.wes@softweyr.com> References: <200303241752.40245.wes@softweyr.com> Date: Mon, 24 Mar 2003 23:53:29 -0500 To: Wes Peters , John Baldwin From: Garance A Drosihn Subject: Re: Patch to protect process from pageout killing Cc: freebsd-arch@FreeBSD.ORG Content-Type: text/plain; charset="us-ascii" ; format="flowed" X-Scanned-By: MIMEDefang 2.28 X-Spam-Status: No, hits=-25.3 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REFERENCES,REPLY_WITH_QUOTES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG At 5:52 PM -0800 3/24/03, Wes Peters wrote: >On Monday 24 March 2003 11:09, John Baldwin wrote: > > I think that adopting the SIGDANGER approach would be better > > rather than rolling our own private interface. > >It's not clear to me the SIGDANGER interface allows me to say >"go elsewhere bub, I'm really important." In this case, that >is essential. I think even in the general FreeBSD case you can >make a point for a setting like this in, say, named. Please check out the descriptions I posted previously. SIGDANGER (as implemented by AIX) explicitly provides two things. The process gets to decide which one they (the process) wants: 1) signal me at the first sign of trouble, and I'll free up some virtual memory (possibly by exit()-ing). 2) do not ever kill me to free up memory. I think that we could improve upon the AIX implementation if we wanted to, but I think people are so used to having problems with AIX that they hate the idea of SIGDANGER as soon as they see the letters AIX. Having used AIX for more than ten years now, I can sympathize with that, but in the specific case of SIGDANGER there is an idea that can work quite well. (reference on sigdanger was at: http://nscp.upenn.edu/aix4.3html/aixbman/baseadmn/pag_space_under.htm ) >The SIGDANGER interface worries me in general, partly because it's >a signal and partly because it complicates the design of EVERYTHING >just to handle it. I guess a lot depends on the implementation >details of how SIGDANGER and the default handlers are designed, >but nothing I saw last week gave me a warm fuzzy about that. I don't know enough about the lower-level implementation details, but I did think the recent discussion on the src-committers list did include a number of good ideas. I am horribly over-committed with things that I've promised to do (including stuff for my real- world job...), so I can't look into SIGDANGER ideas right now, but I'm more than happy to try to explain how it should work. -- Garance Alistair Drosehn = gad@gilead.netel.rpi.edu Senior Systems Programmer or gad@freebsd.org Rensselaer Polytechnic Institute or drosih@rpi.edu To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Mar 24 22:10:44 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C1ABE37B401; Mon, 24 Mar 2003 22:10:38 -0800 (PST) Received: from smtp-relay.omnis.com (smtp-relay.omnis.com [216.239.128.27]) by mx1.FreeBSD.org (Postfix) with ESMTP id 19FAC43F75; Mon, 24 Mar 2003 22:10:38 -0800 (PST) (envelope-from wes@softweyr.com) Received: from softweyr.homeunix.net (66-91-236-204.san.rr.com [66.91.236.204]) by smtp-relay.omnis.com (Postfix) with ESMTP id D929F4336E; Mon, 24 Mar 2003 22:10:35 -0800 (PST) From: Wes Peters Organization: Softweyr To: Garance A Drosihn , John Baldwin Subject: Re: Patch to protect process from pageout killing Date: Mon, 24 Mar 2003 22:10:32 -0800 User-Agent: KMail/1.5 Cc: freebsd-arch@FreeBSD.ORG References: <200303241752.40245.wes@softweyr.com> In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200303242210.32055.wes@softweyr.com> X-Spam-Status: No, hits=-16.1 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REFERENCES,REPLY_WITH_QUOTES,USER_AGENT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Monday 24 March 2003 20:53, Garance A Drosihn wrote: > At 5:52 PM -0800 3/24/03, Wes Peters wrote: > > > >It's not clear to me the SIGDANGER interface allows me to say > >"go elsewhere bub, I'm really important." In this case, that > >is essential. I think even in the general FreeBSD case you can > >make a point for a setting like this in, say, named. > > Please check out the descriptions I posted previously. SIGDANGER > (as implemented by AIX) explicitly provides two things. The process > gets to decide which one they (the process) wants: > > 1) signal me at the first sign of trouble, and I'll free > up some virtual memory (possibly by exit()-ing). > 2) do not ever kill me to free up memory. The current situation, leave me alone until you're really hurting, then just kill me quickly, should not only be an option but be the default. Is that covered? As the default? I.e. if I don't specify any handling of SIGDANGER at all, does it continue to work as now? I guess my biggest worry about SIGDANGER is that minds much brighter than yours or mine share my worries about it. Relying on signal delivery is just not in my nature. > I think that we could improve upon the AIX implementation if we > wanted to, but I think people are so used to having problems with > AIX that they hate the idea of SIGDANGER as soon as they see the > letters AIX. Yeah, that's pretty much my knee-jerk reaction. I haven't really used AIX since about 3.2.5, but it was just fugly in those days. > Having used AIX for more than ten years now, I can > sympathize with that, but in the specific case of SIGDANGER there > is an idea that can work quite well. > > (reference on sigdanger was at: > http://nscp.upenn.edu/aix4.3html/aixbman/baseadmn/pag_space_under.htm > ) > > >The SIGDANGER interface worries me in general, partly because it's > >a signal and partly because it complicates the design of EVERYTHING > >just to handle it. I guess a lot depends on the implementation > >details of how SIGDANGER and the default handlers are designed, > >but nothing I saw last week gave me a warm fuzzy about that. > > I don't know enough about the lower-level implementation details, > but I did think the recent discussion on the src-committers list > did include a number of good ideas. I am horribly over-committed > with things that I've promised to do (including stuff for my real- > world job...), so I can't look into SIGDANGER ideas right now, but > I'm more than happy to try to explain how it should work. Some of the explanations were reasonable enough to erase all of my objections EXCEPT "it's a signal." Do we have signal delivery to multi-threaded processes worked out enough to rely on SIGDANGER for such a critical function? If so, it's news to me, but that doesn't mean it's not done... -- Where am I, and what am I doing in this handbasket? Wes Peters wes@softweyr.com To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Mar 24 22:53:10 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E81AF37B401; Mon, 24 Mar 2003 22:53:04 -0800 (PST) Received: from smtp3.server.rpi.edu (smtp3.server.rpi.edu [128.113.2.3]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0D72E43F93; Mon, 24 Mar 2003 22:53:04 -0800 (PST) (envelope-from drosih@rpi.edu) Received: from [128.113.24.47] (gilead.netel.rpi.edu [128.113.24.47]) by smtp3.server.rpi.edu (8.12.8/8.12.7) with ESMTP id h2P6r2QA023930; Tue, 25 Mar 2003 01:53:02 -0500 Mime-Version: 1.0 X-Sender: drosih@mail.rpi.edu Message-Id: In-Reply-To: <200303242210.32055.wes@softweyr.com> References: <200303241752.40245.wes@softweyr.com> <200303242210.32055.wes@softweyr.com> Date: Tue, 25 Mar 2003 01:53:01 -0500 To: Wes Peters , John Baldwin From: Garance A Drosihn Subject: Re: Patch to protect process from pageout killing Cc: freebsd-arch@FreeBSD.ORG Content-Type: text/plain; charset="us-ascii" ; format="flowed" X-Scanned-By: MIMEDefang 2.28 X-Spam-Status: No, hits=-25.9 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, QUOTE_TWICE_1,REFERENCES,REPLY_WITH_QUOTES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG At 10:10 PM -0800 3/24/03, Wes Peters wrote: >On Monday 24 March 2003 20:53, Garance A Drosihn wrote: >> At 5:52 PM -0800 3/24/03, Wes Peters wrote: >> > >> >It's not clear to me the SIGDANGER interface allows me to say >> >"go elsewhere bub, I'm really important." In this case, that >> >is essential. I think even in the general FreeBSD case you can >> >make a point for a setting like this in, say, named. >> >> Please check out the descriptions I posted previously. SIGDANGER >> (as implemented by AIX) explicitly provides two things. The process >> gets to decide which one they (the process) wants: >> >> 1) signal me at the first sign of trouble, and I'll free >> up some virtual memory (possibly by exit()-ing). >> 2) do not ever kill me to free up memory. > >The current situation, leave me alone until you're really hurting, >then just kill me quickly, should not only be an option but be the >default. Is that covered? As the default? I.e. if I don't >specify any handling of SIGDANGER at all, does it continue to >work as now? Yes. >I guess my biggest worry about SIGDANGER is that minds much brighter >than yours or mine share my worries about it. Relying on signal >delivery is just not in my nature. Actually, I think the biggest complaint with SIGDANGER (as AIX does it), is that you *must* recompile programs to add the signal-handler, or SIGDANGER does you absolutely no good. This leads to the argument "what if I don't have the source to some program that should not be killed?". Or, for that matter, "what if I don't have the source to a program which I know should be among the first to die?" This is an area where I think we could do better than the AIX implementation, although "doing better" does imply "more work"... I think we want to come up with something so people don't have to go changing every program to add a signal handler, but the decision would usually be left to the system-admin. > > I don't know enough about the lower-level implementation details, >> but I did think the recent discussion on the src-committers list >> did include a number of good ideas. I am horribly over-committed >> with things that I've promised to do (including stuff for my real- >> world job...), so I can't look into SIGDANGER ideas right now, but >> I'm more than happy to try to explain how it should work. > >Some of the explanations were reasonable enough to erase all of my >objections EXCEPT "it's a signal." Do we have signal delivery to >multi-threaded processes worked out enough to rely on SIGDANGER >for such a critical function? If so, it's news to me, but that >doesn't mean it's not done... Well, most signal handlers for SIGDANGER are very simple, so they should tend to work even if signal-handling in general is iffy. They either: static void ignore_danger(int signo) { /* Just return, thus telling the kernel "Do Not Kill Me" */ } or static void we_are_really_nice(int signo) { /* System is running out of VM -- so we will disappear! */ exit(1); } Well, those are the two kinds I have written. I guess the second one could be a lot more complicated. Really it should set a flag and then let some other main-processing-loop do the exit() call. I don't know what that means for multi-threaded programs under freebsd, but since you don't *have* to add a signal-handler to every program, it might be that most system administrators will be able to solve their low-memory issues even if signal-handling did not work reliably for all programs. -- Garance Alistair Drosehn = gad@gilead.netel.rpi.edu Senior Systems Programmer or gad@freebsd.org Rensselaer Polytechnic Institute or drosih@rpi.edu To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Mon Mar 24 23:53:52 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3E17D37B401 for ; Mon, 24 Mar 2003 23:53:48 -0800 (PST) Received: from HAL9000.homeunix.com (12-233-57-131.client.attbi.com [12.233.57.131]) by mx1.FreeBSD.org (Postfix) with ESMTP id 75FCD43F85 for ; Mon, 24 Mar 2003 23:53:47 -0800 (PST) (envelope-from das@FreeBSD.ORG) Received: from HAL9000.homeunix.com (localhost [127.0.0.1]) by HAL9000.homeunix.com (8.12.6/8.12.5) with ESMTP id h2P7rhah005535; Mon, 24 Mar 2003 23:53:43 -0800 (PST) (envelope-from das@FreeBSD.ORG) Received: (from das@localhost) by HAL9000.homeunix.com (8.12.6/8.12.5/Submit) id h2P7rgQl005534; Mon, 24 Mar 2003 23:53:42 -0800 (PST) (envelope-from das@FreeBSD.ORG) Date: Mon, 24 Mar 2003 23:53:42 -0800 From: David Schultz To: Garance A Drosihn Cc: Dan Nelson , Poul-Henning Kamp , Wes Peters , freebsd-arch@FreeBSD.ORG Subject: Re: Patch to protect process from pageout killing Message-ID: <20030325075342.GA5450@HAL9000.homeunix.com> Mail-Followup-To: Garance A Drosihn , Dan Nelson , Poul-Henning Kamp , Wes Peters , freebsd-arch@FreeBSD.ORG References: <200303240823.48262.wes@softweyr.com> <7019.1048523782@critter.freebsd.dk> <20030324213519.GA63147@dan.emsphone.com> <20030325012844.GB4406@HAL9000.homeunix.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Spam-Status: No, hits=-19.6 required=5.0 tests=AWL,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES, REPLY_WITH_QUOTES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Thus spake Garance A Drosihn : > At 5:28 PM -0800 3/24/03, David Schultz wrote: > > Second, it is only marginally useful to go as far as specifying > >priorities and quotas and such on process killability. Most of > >the time, people can divide the processes on thier system into > >two categories: critical and killable. > > While that's probably true "most of the time", I think we'd want to > encourage three categories. critical, less-critical (killable), > and kill-me-first. That's what SIGDANGER provides, and in some > situations that third category is very desirable. Yes, I think the SIGDANGER idea makes sense. Essentially what you'd want is a higher threshhold above the ``red alert---start killing things'' threshhold where you can do smart things like send SIGDANGERs without worrying about running completely out of memory. But I'm trying to impress on people that SIGDANGER is orthogonal to what Wes is trying to do, before the whole thing gets bogged down in discussions again and nothing ever happens. Here's an example of what I mean in verbose pseudocode with fudged constants: if (free VM < 64 pages) { /* This is the part Wes is working on. */ kill big processes EXCEPT the ones that are so important that there's no point in running the system without them; } else if (free VM < 256 pages) { /* * It takes additional memory to do this, but we're * hoping some processes will cooperate and the * shortage will go away. */ start warning processes with SIGDANGER; } To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Mar 25 0:26:10 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1306037B401; Tue, 25 Mar 2003 00:26:06 -0800 (PST) Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id 401E843FA3; Tue, 25 Mar 2003 00:26:05 -0800 (PST) (envelope-from phk@phk.freebsd.dk) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.8/8.12.8) with ESMTP id h2P8PrhV014383; Tue, 25 Mar 2003 09:25:53 +0100 (CET) (envelope-from phk@phk.freebsd.dk) To: David Schultz Cc: Garance A Drosihn , Dan Nelson , Wes Peters , freebsd-arch@FreeBSD.ORG Subject: Re: Patch to protect process from pageout killing From: "Poul-Henning Kamp" In-Reply-To: Your message of "Mon, 24 Mar 2003 23:53:42 PST." <20030325075342.GA5450@HAL9000.homeunix.com> Date: Tue, 25 Mar 2003 09:25:53 +0100 Message-ID: <14382.1048580753@critter.freebsd.dk> X-Spam-Status: No, hits=-6.5 required=5.0 tests=AWL,IN_REP_TO autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG >But I'm trying to impress on people that SIGDANGER is >orthogonal to what Wes is trying to do, before the whole thing >gets bogged down in discussions again and nothing ever happens. >Here's an example of what I mean in verbose pseudocode with >fudged constants: If we are going to do this, we should do it right. Doing it right means that we should also be sharing enough information with userland, so that userland can adapt. Take a simple example: It makes sense for a program like fsck to use all the RAM it can get hold off as cache, but it does not make sense for the cache to be paged out. As I see it, there is a need for several mechanisms: 1. A mechanism to export to userland enough information about the current RAM availability, so that phkmalloc and application specific code can make intelligent choices before things go bad. 2. A mechanism to alert userland to the fact that things _have_ gone bad. 3. A mechanism to influence the "Who do we kill ?" decision once things have gone from bad to worse. To tackle them from behind: Wes has a proposal for #3 which is a per-process flag which says "I'm sacred". I think that is a sound principle since that is usually exactly what people want: Do Not Kill This Process. Certain processes already enjoy special protection, pid==1 most notably, this would just be a way to make the same protection available to other processes. I'm not happy about using the resourcelimit code for booleans, and I don't think the flag should be inherited, but otherwise I'm for the idea. We have the SIGDANGER proposal for #2, but I think we need to have two severities: "Out of RAM" and "Out of VM". A program like fsck would start to recycle cached sectors once we're out of RAM. But I have not seen anybody come up with a good proposal for #1, and that is where the main benefit would be derived: It would allow processes to be good citizens and adjust to the present situation. Traditionally userland code is totally oblivious to the overall system circumstances, the most notable exception is sendmail which for ages have monitored the loadavg and backe off accordingly. I think all daemons, and even some non-daemon programs, can benefit from being aware of more of the systems situation: phkmalloc would automatically shed the cache and go into "hinting" mode if there were any pageing activity. Daemons like named can shed caches. Long running daemons could even go through a garbage collect to reduce their memory footprint (using realloc() to reduce fragmentation). Bgfsck can shed all cache and take a nap. Sort can use smaller buckets. The signals in #2 could be used as a cheap substitute for this, but we would need to add complementary "All Clear" signals to get processes out of "contingency mode". I have often wondered about making a single page of "kernel info" which would be read-only mapped into all processes, (my main agenda is really evil timekeeping), but it would also be the perfect place for information like: "N free pages in system" "N pages of swap used" "N pages paged out during the last 1/15/60 seconds" "N pages paged in during the last 1/15/60 seconds" ... And with cheap access to that information, processes could much easier taylor their behaviour. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Mar 25 0:43: 4 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 7A82837B401; Tue, 25 Mar 2003 00:43:01 -0800 (PST) Received: from ns1.xcllnt.net (209-128-86-226.bayarea.net [209.128.86.226]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0272543F3F; Tue, 25 Mar 2003 00:43:00 -0800 (PST) (envelope-from marcel@xcllnt.net) Received: from dhcp01.pn.xcllnt.net (dhcp01.pn.xcllnt.net [192.168.4.201]) by ns1.xcllnt.net (8.12.8/8.12.8) with ESMTP id h2P8gmKu015264; Tue, 25 Mar 2003 00:42:48 -0800 (PST) (envelope-from marcel@piii.pn.xcllnt.net) Received: from dhcp01.pn.xcllnt.net (localhost [127.0.0.1]) by dhcp01.pn.xcllnt.net (8.12.8/8.12.8) with ESMTP id h2P8glvn017322; Tue, 25 Mar 2003 00:42:47 -0800 (PST) (envelope-from marcel@dhcp01.pn.xcllnt.net) Received: (from marcel@localhost) by dhcp01.pn.xcllnt.net (8.12.8/8.12.8/Submit) id h2P8glDU017321; Tue, 25 Mar 2003 00:42:47 -0800 (PST) Date: Tue, 25 Mar 2003 00:42:47 -0800 From: Marcel Moolenaar To: Poul-Henning Kamp Cc: David Schultz , Garance A Drosihn , Dan Nelson , Wes Peters , freebsd-arch@FreeBSD.ORG Subject: Re: Patch to protect process from pageout killing Message-ID: <20030325084247.GA17195@dhcp01.pn.xcllnt.net> References: <20030325075342.GA5450@HAL9000.homeunix.com> <14382.1048580753@critter.freebsd.dk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <14382.1048580753@critter.freebsd.dk> User-Agent: Mutt/1.5.3i X-Spam-Status: No, hits=-30.9 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REFERENCES,REPLY_WITH_QUOTES,USER_AGENT_MUTT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Tue, Mar 25, 2003 at 09:25:53AM +0100, Poul-Henning Kamp wrote: > > 3. A mechanism to influence the "Who do we kill ?" decision once > things have gone from bad to worse. > > To tackle them from behind: > > Wes has a proposal for #3 which is a per-process flag which says > "I'm sacred". I think that is a sound principle since that is > usually exactly what people want: Do Not Kill This Process. > > Certain processes already enjoy special protection, pid==1 most > notably, this would just be a way to make the same protection > available to other processes. I'm not happy about using the > resourcelimit code for booleans, and I don't think the flag > should be inherited, but otherwise I'm for the idea. JFYI: On ia64 there are 12 bits in the ELF header reserved for OS specific flags. A very natural way to flag a process as being sacred is by flagging the ELF executable. You could use brandelf for that. -- Marcel Moolenaar USPA: A-39004 marcel@xcllnt.net To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Mar 25 0:48:45 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5475437B401; Tue, 25 Mar 2003 00:48:43 -0800 (PST) Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id B161843F3F; Tue, 25 Mar 2003 00:48:41 -0800 (PST) (envelope-from phk@phk.freebsd.dk) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.8/8.12.8) with ESMTP id h2P8mXhV014595; Tue, 25 Mar 2003 09:48:33 +0100 (CET) (envelope-from phk@phk.freebsd.dk) To: Marcel Moolenaar Cc: David Schultz , Garance A Drosihn , Dan Nelson , Wes Peters , freebsd-arch@FreeBSD.ORG Subject: Re: Patch to protect process from pageout killing From: "Poul-Henning Kamp" In-Reply-To: Your message of "Tue, 25 Mar 2003 00:42:47 PST." <20030325084247.GA17195@dhcp01.pn.xcllnt.net> Date: Tue, 25 Mar 2003 09:48:33 +0100 Message-ID: <14594.1048582113@critter.freebsd.dk> X-Spam-Status: No, hits=-7.1 required=5.0 tests=AWL,IN_REP_TO,QUOTED_EMAIL_TEXT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG In message <20030325084247.GA17195@dhcp01.pn.xcllnt.net>, Marcel Moolenaar writ es: >> To tackle them from behind: >> >> Wes has a proposal for #3 which is a per-process flag which says >> "I'm sacred". I think that is a sound principle since that is >> usually exactly what people want: Do Not Kill This Process. >> >> Certain processes already enjoy special protection, pid==1 most >> notably, this would just be a way to make the same protection >> available to other processes. I'm not happy about using the >> resourcelimit code for booleans, and I don't think the flag >> should be inherited, but otherwise I'm for the idea. > >JFYI: On ia64 there are 12 bits in the ELF header reserved for OS >specific flags. A very natural way to flag a process as being sacred >is by flagging the ELF executable. You could use brandelf for that. Many years ago, we had a local hack so you could specify the nice(2) that a given program would be executed at (relative to the parent process) in the a.out file. This allowed us to keep games open during the day because we could argue that running at -20 they used only resources not otherwise claimed. Other operating systems have much more expressive facilities for putting attributes on a program. In some cases this is being held stronly against them. I think, but am not sure, that we can now introduce practically any policy we might like with MAC. (NB: deliberate rwatson-trigger) How the flags/attributes gets to be set on the wanted subset of processes is by no means uninteresting, but until something pays attention to the flag... -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Mar 25 2:45: 5 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2683037B401 for ; Tue, 25 Mar 2003 02:45:00 -0800 (PST) Received: from HAL9000.homeunix.com (12-233-57-131.client.attbi.com [12.233.57.131]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7758A43FD7 for ; Tue, 25 Mar 2003 02:44:59 -0800 (PST) (envelope-from das@FreeBSD.org) Received: from HAL9000.homeunix.com (localhost [127.0.0.1]) by HAL9000.homeunix.com (8.12.6/8.12.5) with ESMTP id h2PAitah006067; Tue, 25 Mar 2003 02:44:55 -0800 (PST) (envelope-from das@FreeBSD.org) Received: (from das@localhost) by HAL9000.homeunix.com (8.12.6/8.12.5/Submit) id h2PAisFv006066; Tue, 25 Mar 2003 02:44:54 -0800 (PST) (envelope-from das@FreeBSD.org) Date: Tue, 25 Mar 2003 02:44:54 -0800 From: David Schultz To: Poul-Henning Kamp Cc: Garance A Drosihn , Dan Nelson , Wes Peters , freebsd-arch@FreeBSD.org Subject: Re: Patch to protect process from pageout killing Message-ID: <20030325104454.GA5934@HAL9000.homeunix.com> Mail-Followup-To: Poul-Henning Kamp , Garance A Drosihn , Dan Nelson , Wes Peters , freebsd-arch@FreeBSD.org References: <20030325075342.GA5450@HAL9000.homeunix.com> <14382.1048580753@critter.freebsd.dk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <14382.1048580753@critter.freebsd.dk> X-Spam-Status: No, hits=-19.6 required=5.0 tests=IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES,REPLY_WITH_QUOTES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Thus spake Poul-Henning Kamp : > As I see it, there is a need for several mechanisms: > > 1. A mechanism to export to userland enough information about the > current RAM availability, so that phkmalloc and application > specific code can make intelligent choices before things go bad. > > 2. A mechanism to alert userland to the fact that things _have_ gone > bad. > > 3. A mechanism to influence the "Who do we kill ?" decision once > things have gone from bad to worse. I completely agree, and in my last email I attempted to address the fact that #2 and #3 are distinct, and to say that people shouldn't be complaining about Wes's solution to #3 because it doesn't address #2. For #1 and #2, we could have a SIGVM (your terminology from the *last* time this came up) to notify processes about material changes in global resource availability. Applications could then look at that "kernel info" page upon receiving the signal and take appropriate action. I think the hardest part is getting applications to use a proprietary facility. (For example, look at how few people are using kqueue for all of its advantages.) Certainly it could be added to base system programs, but it would be most useful for applications such as postgresql and apache. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Mar 25 8:35: 0 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0060537B404 for ; Tue, 25 Mar 2003 08:34:53 -0800 (PST) Received: from mail.speakeasy.net (mail11.speakeasy.net [216.254.0.211]) by mx1.FreeBSD.org (Postfix) with ESMTP id 661E843F3F for ; Tue, 25 Mar 2003 08:34:53 -0800 (PST) (envelope-from jhb@FreeBSD.org) Received: (qmail 18913 invoked from network); 25 Mar 2003 16:34:56 -0000 Received: from unknown (HELO server.baldwin.cx) ([216.27.160.63]) (envelope-sender ) by mail11.speakeasy.net (qmail-ldap-1.03) with DES-CBC3-SHA encrypted SMTP for ; 25 Mar 2003 16:34:56 -0000 Received: from laptop.baldwin.cx (gw1.twc.weather.com [216.133.140.1]) by server.baldwin.cx (8.12.8/8.12.8) with ESMTP id h2PGYoOv096194; Tue, 25 Mar 2003 11:34:50 -0500 (EST) (envelope-from jhb@FreeBSD.org) Message-ID: X-Mailer: XFMail 1.5.4 on FreeBSD X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: <200303241805.38175.wes@softweyr.com> Date: Tue, 25 Mar 2003 11:34:50 -0500 (EST) From: John Baldwin To: Wes Peters Subject: Re: Patch to protect process from pageout killing Cc: freebsd-arch@freebsd.org, Poul-Henning Kamp X-Spam-Status: No, hits=-19.5 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REPLY_WITH_QUOTES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On 25-Mar-2003 Wes Peters wrote: > On Monday 24 March 2003 08:36, Poul-Henning Kamp wrote: >> Also, doesn't this result in the flag being inerited with fork() and >> thereby negating the effect you are seeking for squid ? > > I looked through all the places in kern_fork.c where p2->p_flag gets set > and didn't see anything that looked like it would inherit P_PROTECTED > from p1->p_flag. Did I miss something? I'm obviously a bit of a > neophyte in this part of the kernel. rlimit's are inherited. However, due to a "feature" bug in your patch, the P_PROTECTED flag doesn't get turned on when the rlimit is inherited in fork1(). -- John Baldwin <>< http://www.FreeBSD.org/~jhb/ "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Mar 25 9: 6:47 2003 Delivered-To: freebsd-arch@freebsd.org Received: by hub.freebsd.org (Postfix, from userid 683) id 368B137B401; Tue, 25 Mar 2003 09:06:44 -0800 (PST) Date: Tue, 25 Mar 2003 09:06:43 -0800 From: Eivind Eklund To: David Schultz Cc: Garance A Drosihn , Dan Nelson , Poul-Henning Kamp , Wes Peters , freebsd-arch@FreeBSD.ORG Subject: Re: Patch to protect process from pageout killing Message-ID: <20030325090643.F20745@FreeBSD.org> References: <200303240823.48262.wes@softweyr.com> <7019.1048523782@critter.freebsd.dk> <20030324213519.GA63147@dan.emsphone.com> <20030325012844.GB4406@HAL9000.homeunix.com> <20030325075342.GA5450@HAL9000.homeunix.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <20030325075342.GA5450@HAL9000.homeunix.com>; from das@FreeBSD.ORG on Mon, Mar 24, 2003 at 11:53:42PM -0800 X-Spam-Status: No, hits=-32.5 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES, REPLY_WITH_QUOTES,USER_AGENT_MUTT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Mon, Mar 24, 2003 at 11:53:42PM -0800, David Schultz wrote: > Yes, I think the SIGDANGER idea makes sense. Essentially what > you'd want is a higher threshhold above the ``red alert---start > killing things'' threshhold where you can do smart things like > send SIGDANGERs without worrying about running completely out of > memory. But I'm trying to impress on people that SIGDANGER is > orthogonal to what Wes is trying to do, before the whole thing > gets bogged down in discussions again and nothing ever happens. > Here's an example of what I mean in verbose pseudocode with > fudged constants: > > if (free VM < 64 pages) { > /* This is the part Wes is working on. */ > kill big processes EXCEPT the ones that > are so important that there's no point > in running the system without them; > } else if (free VM < 256 pages) { > /* > * It takes additional memory to do this, but we're > * hoping some processes will cooperate and the > * shortage will go away. > */ > start warning processes with SIGDANGER; > } As far as I understand, this problem was covered by the SIGDANGER proposal, by having processes with a SIGDANGER handler not be killed in the (free VM < 64 pages) case, at least until there are no processes without a SIGDANGER handler. The pseudo-code becomes something like (with 64, 128 and 256 being arbitrary constants) if (free VM < 256 pages) { send SIGDANGER to all processes } while (free VM < 128 pages && we have processes without SIGDANGER handler) { kill "worst" process without SIGDANGER handler } while (free VM < 64 pages) { /* * Only goes here if we are out of processes without * SIGDANGER handler */ kill "worst" process } As you see, SIGDANGER says that the process wants to decide for itself if it is important (kept until 64) or want to die/free up resources at 256. I'm not 100% happy with the SIGDANGER API, for the following reasons: (1) There are cases it does not cover. I can implement a process that is not really significant, but does caching and can easily free up memory. However, even though it frees up memory, it should not get special priority in the killing sequence. The most extreme example of this is if we add SIGDANGER awareness to phkmalloc - in that case, all newly compiled programs (that use libc and malloc) would gain priority, while all old binaries would be prioritized lower. (2) The use an API instead of an external configuration option (e.g. a sysctl with a list of protected PIDs) makes it impossible to use this without having recompiling software. It also means that priority is determined at the time of software implementation, not when the software is deployed, unless there are special options in the software to change behaviour. And these options are likely to appear, which basically sucks. However, I still feel that we *should* support the SIGDANGER API. We need an API to cover the case where a program has resources that it can easily release, and the API should be cross-platform. By supporting the SIGDANGER API on FreeBSD too, that API becomes aboutr 4x as "legitimate" as it is today. If we implement another API, both SIGDANGER and that API will be seen as less legitimate than SIGDANGER is today, unless that API is *much* better than SIGDANGER. Thus, we lower the chance of ever getting a true cross-platform solution. I feel there also is room for a separate solution that lets the administrator determine processes to keep - but this should not block for implementation of SIGDANGER with the standard semantics, and that is IMO what would be most important to have. Also note that a SIGDANGER implementation might automatically be picked up by autoconf for already existing programs, giving an immediate benefit. Eivind, who realize he has no vote until he has patches. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Mar 25 9:22:31 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C63A437B401; Tue, 25 Mar 2003 09:22:27 -0800 (PST) Received: from mail.tcoip.com.br (erato.tco.net.br [200.220.254.10]) by mx1.FreeBSD.org (Postfix) with ESMTP id EF33643FAF; Tue, 25 Mar 2003 09:22:16 -0800 (PST) (envelope-from dcs@tcoip.com.br) Received: from tcoip.com.br ([10.0.2.6]) by mail.tcoip.com.br (8.11.6/8.11.6) with ESMTP id h2PHLf923653; Tue, 25 Mar 2003 14:21:41 -0300 Message-ID: <3E809024.1050303@tcoip.com.br> Date: Tue, 25 Mar 2003 14:21:40 -0300 From: "Daniel C. Sobral" User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3b) Gecko/20030214 X-Accept-Language: en-us, en, pt-br, ja MIME-Version: 1.0 To: Poul-Henning Kamp Cc: David Schultz , Garance A Drosihn , Dan Nelson , Wes Peters , freebsd-arch@FreeBSD.ORG Subject: Re: Patch to protect process from pageout killing References: <14382.1048580753@critter.freebsd.dk> In-Reply-To: <14382.1048580753@critter.freebsd.dk> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, hits=-19.2 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, QUOTE_TWICE_1,REFERENCES,REPLY_WITH_QUOTES, USER_AGENT_MOZILLA_UA autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Poul-Henning Kamp wrote: > If we are going to do this, we should do it right. > > Doing it right means that we should also be sharing enough information > with userland, so that userland can adapt. > > Take a simple example: It makes sense for a program like fsck to > use all the RAM it can get hold off as cache, but it does not make > sense for the cache to be paged out. > > As I see it, there is a need for several mechanisms: SIGDANGER actually takes care of two of these steps: > > 1. A mechanism to export to userland enough information about the > current RAM availability, so that phkmalloc and application > specific code can make intelligent choices before things go bad. > > 2. A mechanism to alert userland to the fact that things _have_ gone > bad. SIGDANGER is sent to processes at threshold #1, alerting them the situation has become serious. > 3. A mechanism to influence the "Who do we kill ?" decision once > things have gone from bad to worse. When the situation becomes so critical that the system cannot proceed without first killing something, it will only kill a process which has registered SIGDANGER if there are no other suitable process. -- Daniel C. Sobral (8-DCS) Gerencia de Operacoes Divisao de Comunicacao de Dados Coordenacao de Seguranca TCO Fones: 55-61-313-7654/Cel: 55-61-9618-0904 E-mail: Daniel.Capo@tco.net.br Daniel.Sobral@tcoip.com.br dcs@tcoip.com.br Outros: dcs@newsguy.com dcs@freebsd.org capo@notorious.bsdconspiracy.net If you put your supper dish to your ear you can hear the sounds of a restaurant. -- Snoopy To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Mar 25 18:59:46 2003 Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D4A9F37B401 for ; Tue, 25 Mar 2003 18:59:41 -0800 (PST) Received: from mail.chesapeake.net (chesapeake.net [205.130.220.14]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1402943F3F for ; Tue, 25 Mar 2003 18:59:41 -0800 (PST) (envelope-from jroberson@chesapeake.net) Received: from localhost (jroberson@localhost) by mail.chesapeake.net (8.11.6/8.11.6) with ESMTP id h2Q2xeT43472 for ; Tue, 25 Mar 2003 21:59:40 -0500 (EST) (envelope-from jroberson@chesapeake.net) Date: Tue, 25 Mar 2003 21:59:40 -0500 (EST) From: Jeff Roberson To: arch@freebsd.org Subject: 1:1 Threading implementation. Message-ID: <20030325214028.K64602-100000@mail.chesapeake.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=0.0 required=5.0 tests=none version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG I realize that many people have strong feelings on this topic. I'm asking everyone up front to try not to devolve this thread into a bikeshed. Thanks to the foundation provided by Julian, David Xu, Mini, Dan Eischen, and everyone else who has participated with KSE and libpthread development Mini and I have developed a 1:1 threading implementation. This code works in parallel with KSE and does not break it in any way. It actually helps bring M:N threading closer by testing out shared bits. I have successfully run mozilla 1.2.1 using this threading package. It still has some bugs and some incomplete corners but we're very close to being able to commit this. I'm going to post a link to the kernel portion of this code at the end of this mail. The library will come later. What this means is that for every pthread in an application there is one KSE and thread. There is also only one ksegroup per proc in this model. Since the kernel knows about all threads it handles all scheduling decisions and all signal delivery. I have followed the POSIX spec while implementing the signal code. I would really appreciate review from anyone who is intimately familiar with signals and threads. Included in this is an implementation of sigwait(), sigtimedwait(), and sigwaitinfo(). The user land mutexes are supported by kernel code. Uncontested acquires and releases are done entirely in application space using atomic instructions. Once there is contention the library falls back to system calls to handle the locks. There are no per lock kernel resources allocated. There is a user space safe atomic cmpset function that has been defined for x86 only at the moment. New architectures require only this function and the *context apis to run this threading package. There is no arch specific code in user space. The condition variables and other blocking situations are handled with sig*wait*() and a new signal, SIGTHR. There are many reasons that we went with a signal here. If anyone cares to know them, you may ask. There are only 4 system calls for threading. thr_create, thr_self, thr_exit, and thr_kill. The rest of the functionality is implemented in a library that has been heavily hacked up from the original libc_r. The reason we're doing this in parallel with the M:N effort is so that we can have reasonable threading sooner. As I stated before, this project is complimentary to KSE and does not prohibit it from working. I also think that the performance will be better or comparable in the majority of real applications. The kernel bits are available at http://www.chesapeake.net/~jroberson/thr.diff I'd like to get the signal code commited asap. It's the majority of the patch and I often have to resolve conflicts. There have been no regressions in KSE or non threaded applications with this signal code. Cheers, Jeff To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch@FreeBSD.ORG Tue Mar 25 19:52:13 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 657D537B401 for ; Tue, 25 Mar 2003 19:52:13 -0800 (PST) Received: from mail.chesapeake.net (chesapeake.net [205.130.220.14]) by mx1.FreeBSD.org (Postfix) with ESMTP id 93FF843F75 for ; Tue, 25 Mar 2003 19:52:12 -0800 (PST) (envelope-from jroberson@chesapeake.net) Received: from localhost (jroberson@localhost) by mail.chesapeake.net (8.11.6/8.11.6) with ESMTP id h2Q3qCC65884 for ; Tue, 25 Mar 2003 22:52:12 -0500 (EST) (envelope-from jroberson@chesapeake.net) Date: Tue, 25 Mar 2003 22:52:12 -0500 (EST) From: Jeff Roberson To: arch@FreeBSD.ORG In-Reply-To: <20030325214028.K64602-100000@mail.chesapeake.net> Message-ID: <20030325224156.F64602-100000@mail.chesapeake.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-1.7 required=5.0 tests=AWL,IN_REP_TO autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Subject: Re: 1:1 Threading implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 03:52:13 -0000 X-List-Received-Date: Wed, 26 Mar 2003 03:52:13 -0000 I pooched the patch. It's updated at the same web address. http://www.chesapeake.net/~jroberson/thr.diff Cheers, Jeff From owner-freebsd-arch@FreeBSD.ORG Tue Mar 25 23:00:33 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1468537B412 for ; Tue, 25 Mar 2003 23:00:33 -0800 (PST) Received: from exchhz01.viatech.com.cn (ip-167-164-97-218.anlai.com [218.97.164.167]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2BEC343F3F for ; Tue, 25 Mar 2003 23:00:29 -0800 (PST) (envelope-from davidxu@freebsd.org) Received: from davidw2k (ip-240-1-168-192.rev.dyxnet.com [192.168.1.240]) by exchhz01.viatech.com.cn with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2650.21) id HLDQN57S; Wed, 26 Mar 2003 14:46:32 +0800 Message-ID: <00f101c2f365$8de4e530$f001a8c0@davidw2k> From: "David Xu" To: "Jeff Roberson" , References: <20030325224156.F64602-100000@mail.chesapeake.net> Date: Wed, 26 Mar 2003 15:01:41 +0800 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.50.4807.1700 X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4910.0300 X-Spam-Status: No, hits=-10.1 required=5.0 tests=AWL,QUOTED_EMAIL_TEXT,REFERENCES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Subject: Re: 1:1 Threading implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 07:00:34 -0000 X-List-Received-Date: Wed, 26 Mar 2003 07:00:34 -0000 I am reading the code, although not very understand your idea, but I found a problem, if a thread exits, some signals taken by the thread will be lost even the signal originally is not for the thread. David Xu ----- Original Message -----=20 From: "Jeff Roberson" To: Sent: Wednesday, March 26, 2003 11:52 AM Subject: Re: 1:1 Threading implementation. > I pooched the patch. It's updated at the same web address. >=20 > http://www.chesapeake.net/~jroberson/thr.diff >=20 > Cheers, > Jeff >=20 > _______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to = "freebsd-arch-unsubscribe@freebsd.org" From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 00:37:01 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id CD37537B404 for ; Wed, 26 Mar 2003 00:37:01 -0800 (PST) Received: from mail.chesapeake.net (chesapeake.net [205.130.220.14]) by mx1.FreeBSD.org (Postfix) with ESMTP id DEE3D43FB1 for ; Wed, 26 Mar 2003 00:37:00 -0800 (PST) (envelope-from jroberson@chesapeake.net) Received: from localhost (jroberson@localhost) by mail.chesapeake.net (8.11.6/8.11.6) with ESMTP id h2Q8avX78331; Wed, 26 Mar 2003 03:36:57 -0500 (EST) (envelope-from jroberson@chesapeake.net) Date: Wed, 26 Mar 2003 03:36:57 -0500 (EST) From: Jeff Roberson To: Julian Elischer In-Reply-To: Message-ID: <20030326031245.O64602-100000@mail.chesapeake.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-13.9 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, QUOTE_TWICE_1,REPLY_WITH_QUOTES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: kse@elischer.org Subject: Re: 1:1 Threading implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 08:37:03 -0000 X-List-Received-Date: Wed, 26 Mar 2003 08:37:03 -0000 On Wed, 26 Mar 2003, Julian Elischer wrote: > On Tue, 25 Mar 2003, Jeff Roberson wrote: > > > Thanks to the foundation provided by Julian, David Xu, Mini, Dan Eischen, > > and everyone else who has participated with KSE and libpthread development > > Mini and I have developed a 1:1 threading implementation. This code works > > in parallel with KSE and does not break it in any way. It actually helps > > bring M:N threading closer by testing out shared bits. > > The current design was done specifically so that the 'component parts > could be recombined in different groupings to give different threading > models. This was one of the models considered when the group > discussed it. I'm glad that it is working.. Yep, that was a good design goal. > > > > I have successfully run mozilla 1.2.1 using this threading package. It > > still has some bugs and some incomplete corners but we're very close to > > being able to commit this. I'm going to post a link to the kernel portion > > of this code at the end of this mail. The library will come later. > > I wondered what was going on there.. There's been a trmendous silence in > the userland side of things. Well, I wasn't doing userland stuff until three days ago. I think mini has just been very busy with work. I suspect that you're going to need to start doing userland work or find someone to do it if you want to get it done soon. > > > > What this means is that for every pthread in an application there is one > > KSE and thread. There is also only one ksegroup per proc in this model. > > Since the kernel knows about all threads it handles all scheduling > > decisions and all signal delivery. I have followed the POSIX spec while > > implementing the signal code. I would really appreciate review from > > anyone who is intimately familiar with signals and threads. Included in > > this is an implementation of sigwait(), sigtimedwait(), and sigwaitinfo(). > > Wouldn't it have been easier to have one KSEGRP+KSE+thread per user > thread? Having one ksegrp and many KSEs requires changing the kernel > code where doing it the other way you could do it without making any > changes. I don't understand? There are relatively minor changes to the kernel to support this. Since nice is a property of the process, it makes sense that there is only one ksegrp per process. I'm starting to think that the ksegrp was overkill in general. > Specifically since My plan is to make the "KSE' structure go away.. > (by which I mean it is only going to be visible within the particular > thread_scheduler that uses it and that externally > the only structures visible would be : > proc, ksegrp(subproc?) thread and upcall. For M:N I really think this should be proc, thread, and upcall. For 1:1 I only need proc and thread. > The KSE would be allocated only by a call into the scheduler and is part > of the "scheduler specific private data". > > i.e. on creation of a new process, shced_newproc() is called > and a KSE is added in there is the scheduler in question wants to use > KSEs. If it doesn't, no KSE would be added, but it's still possible that Yes, I think we need more sched hooks here as well. Having only sched_fork() makes things sort of gross. We'll have to hook this all up later. > some scheduler specific storage might be added. In the case > of a new upcall being declared (kse_create() (to be renamed)) > sched_make_threaded() is called which adds KSEs to the KSEGRP > (I was going to change it to be called a subprocess). > KSEs are an accounting aid for the scheduler. A differnt scheduler may > decide to put threads themselves onto the run queues which would > make KSEs un-needed. (for example) > > > > > The user land mutexes are supported by kernel code. Uncontested acquires > > and releases are done entirely in application space using atomic > > instructions. Once there is contention the library falls back to system > > calls to handle the locks. There are no per lock kernel resources > > allocated. There is a user space safe atomic cmpset function that has > > been defined for x86 only at the moment. New architectures require only > > this function and the *context apis to run this threading package. There > > is no arch specific code in user space. > > This was discussed recently as being the highlight of someone's > threading model (I think Linux but I am not sure who's). Yes, linux was discussing this. It's a pretty common trick. Even NT does it but apparently NT allocates kernel resources for user locks. I was pretty pleased that I got away without any per lock allocations. > > > > The condition variables and other blocking situations are handled with > > sig*wait*() and a new signal, SIGTHR. There are many reasons that we went > > with a signal here. If anyone cares to know them, you may ask. > > > > There are only 4 system calls for threading. thr_create, thr_self, > > thr_exit, and thr_kill. The rest of the functionality is implemented in a > > library that has been heavily hacked up from the original libc_r. > > > > The reason we're doing this in parallel with the M:N effort is so that we > > can have reasonable threading sooner. As I stated before, this project is > > complimentary to KSE and does not prohibit it from working. I also think > > that the performance will be better or comparable in the majority of real > > applications. > > My only comment is that since mini is supposed to be doing the > M:N library, isn't this a bit of a distraction? I'll let him comment on this. > > > > The kernel bits are available at > > http://www.chesapeake.net/~jroberson/thr.diff > > Please explain what this means: > - mask = td->td_proc->p_sigmask; > + mask = td->td_sigmask; > > > how can you have a per thread mask? > Signals are masked for the entire process.. > How do you keep them in sync with each other? As per POSIX each thread has a signal mask. There is a per process sigaction but per thread mask and pending. This has to be the case even for M:N although some of it is hidden by the UTS. libc_r even keeps per thread pending and mask bits. > - if (p1->p_flag & P_THREADED) { > + if (p1->p_flag & P_THREADED || p1->p_numthreads > 1) { > > If you are running threads, please set the P_THREADED flag. > if you wnat do differentiate between upcalling threads and 1:1 > threads, please use some auxhilliary flag. I'd rather not have a flag. The > 1 check is used only in places where we have to suspend multiple threads or go to single threading etc. Processes in the 1:1 threading model aren't so special as they are with KSE. They don't need to be treated specially except when we're trying to funnel them down etc. > You should be creating a new KSEGRP (subproc) per thread. > I think you will find that if you do, things will fall out easier > and you won't break the next KSE changes. I don't understand what I may break? > > > > I'd like to get the signal code commited asap. It's the majority of the > > patch and I often have to resolve conflicts. There have been no > > regressions in KSE or non threaded applications with this signal code. > > I'm not against having a separate 1:1 thread capability, but > all this work could have been well spent getting M:N threads > better supported and even getting it to > be able to run in 1:1 mode a s a byproduct.. I don't think M:N is the way to go. After looking things over and considering where it is theoretically faster I do not think it is a worthwhile pursuit. First off, it is many months away from being even beta quality. I think the UTS is far more complicated than you may realize. There are all sorts of synchronization issues that it was able to avoid before since only one thread could run at any time and there essentially was no preemption. It now also has to deal with effecient scheduling decisions in a M:N model that it didn't have to worry about before. Aside from that, there are numerous problems with the kernel not being able to identify individual threads of execution. Debugging, scheduling, profiling, ktrace are all more difficult in a m:n environment. I think it is going to contribute to less effecient scheduling decisions over all. I have already wrestled with this in ULE. I feel that this is an overwhelming amount of complexity. Because of this it will be buggy. Sun claims that they still have open tickets on their M:N while their new 1:1 implementation is totally bug free. How long have they been doing m:n? I don't think that with our limited resources we're going to be able to do better. Furthermore, m:n's basic advantage is less overhead from staying out of the kernel. Also, less per thread resources. I think this is bogus for a couple of reasons. First, if your application has more threads than cpus it is written incorrectly. For people who are doing thread pools instead of event driven IO models they will encounter the same overhead with M:N as 1:1. I'm not sure what applications are entirely compute and have more threads than cpus. These are the only ones which really theoretically benefit. I don't think our threading model should be designed to optimize poorly thought out applications. Furthermore, the amount of work done per slice has been growing with processor speeds. Slice time is adjusted for user experience and so it remains constant. This means that the constraints are different from when this architecture started to come about many (10 or so?) years ago. Trying to optimize context switches between threads just doesn't make sense when you do so much work per slice. Then if you look at the number of system calls and shenanigans a UTS must do to make proper scheduling decisions it doesn't look like such an advantage. I feel that the overhead of all the layers comes close to the savings from doing some of it without entering the kernel. In short, even if it is marginally faster, it doesn't seem like it is worth the effort and risk. I don't want to discourage you from trying but this is why I stopped working on KSE proper and pursued the 1:1 model. Cheers, Jeff From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 00:37:47 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 51FEF37B404; Wed, 26 Mar 2003 00:37:47 -0800 (PST) Received: from mail.chesapeake.net (chesapeake.net [205.130.220.14]) by mx1.FreeBSD.org (Postfix) with ESMTP id 87F7B43FA3; Wed, 26 Mar 2003 00:37:46 -0800 (PST) (envelope-from jroberson@chesapeake.net) Received: from localhost (jroberson@localhost) by mail.chesapeake.net (8.11.6/8.11.6) with ESMTP id h2Q8bkR78633; Wed, 26 Mar 2003 03:37:46 -0500 (EST) (envelope-from jroberson@chesapeake.net) Date: Wed, 26 Mar 2003 03:37:46 -0500 (EST) From: Jeff Roberson To: David Xu In-Reply-To: <00f101c2f365$8de4e530$f001a8c0@davidw2k> Message-ID: <20030326033727.Q64602-100000@mail.chesapeake.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-15.1 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REPLY_WITH_QUOTES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org Subject: Re: 1:1 Threading implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 08:37:50 -0000 X-List-Received-Date: Wed, 26 Mar 2003 08:37:50 -0000 On Wed, 26 Mar 2003, David Xu wrote: > I am reading the code, although not very understand > your idea, but I found a problem, if a thread exits, > some signals taken by the thread will be lost even > the signal originally is not for the thread. > > David Xu You're absolutely right. Thanks. I'll fix this. Cheers, Jeff > ----- Original Message ----- > From: "Jeff Roberson" > To: > Sent: Wednesday, March 26, 2003 11:52 AM > Subject: Re: 1:1 Threading implementation. > > > > I pooched the patch. It's updated at the same web address. > > > > http://www.chesapeake.net/~jroberson/thr.diff > > > > Cheers, > > Jeff > > > > _______________________________________________ > > freebsd-arch@freebsd.org mailing list > > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" > From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 01:18:29 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 705D737B404 for ; Wed, 26 Mar 2003 01:18:29 -0800 (PST) Received: from skynet.stack.nl (skynet.stack.nl [131.155.140.225]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8B30243F93 for ; Wed, 26 Mar 2003 01:18:28 -0800 (PST) (envelope-from marcolz@stack.nl) Received: by skynet.stack.nl (Postfix, from userid 65534) id 0D43B3E32; Wed, 26 Mar 2003 10:19:01 +0100 (CET) Received: from turtle.stack.nl (turtle.stack.nl [2001:610:1108:5010::132]) by skynet.stack.nl (Postfix) with ESMTP id B85843E2D; Wed, 26 Mar 2003 10:19:00 +0100 (CET) Received: by turtle.stack.nl (Postfix, from userid 333) id DE2A11CC2D; Wed, 26 Mar 2003 10:18:26 +0100 (CET) Date: Wed, 26 Mar 2003 10:18:26 +0100 From: Marc Olzheim To: Jeff Roberson Message-ID: <20030326091826.GA79113@stack.nl> References: <20030326031245.O64602-100000@mail.chesapeake.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030326031245.O64602-100000@mail.chesapeake.net> X-Operating-System: FreeBSD turtle.stack.nl 5.0-CURRENT FreeBSD 5.0-CURRENT X-URL: http://www.stack.nl/~marcolz/ User-Agent: Mutt/1.5.4i X-Spam-Status: No, hits=-32.5 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES, REPLY_WITH_QUOTES,USER_AGENT_MUTT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: kse@elischer.org cc: Julian Elischer Subject: Re: 1:1 Threading implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 09:18:30 -0000 On Wed, Mar 26, 2003 at 03:36:57AM -0500, Jeff Roberson wrote: > First, if your application has more threads than cpus it is written > incorrectly. For people who are doing thread pools instead of event > driven IO models they will encounter the same overhead with M:N as 1:1. > I'm not sure what applications are entirely compute and have more threads > than cpus. These are the only ones which really theoretically benefit. I > don't think our threading model should be designed to optimize poorly > thought out applications. Might I suggest that there are 'nice' C++ ways of using thread-classes where both the usual C++ dogmas of readability and reuseability make you easily end up with more threads than cpus... I think that from a userland's point of view, most programmers shouldn't be caring less about how many cpus the machine has their core is running on. With this (not limited to) C++ model in mind, the M:N way would be a great thing to have. Zlo From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 01:23:31 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D8B8D37B404 for ; Wed, 26 Mar 2003 01:23:31 -0800 (PST) Received: from mail.chesapeake.net (chesapeake.net [205.130.220.14]) by mx1.FreeBSD.org (Postfix) with ESMTP id 15BE643FA3 for ; Wed, 26 Mar 2003 01:23:31 -0800 (PST) (envelope-from jroberson@chesapeake.net) Received: from localhost (jroberson@localhost) by mail.chesapeake.net (8.11.6/8.11.6) with ESMTP id h2Q9NLi96163; Wed, 26 Mar 2003 04:23:21 -0500 (EST) (envelope-from jroberson@chesapeake.net) Date: Wed, 26 Mar 2003 04:23:21 -0500 (EST) From: Jeff Roberson To: Marc Olzheim In-Reply-To: <20030326091826.GA79113@stack.nl> Message-ID: <20030326042114.H64602-100000@mail.chesapeake.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-16.0 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REPLY_WITH_QUOTES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: kse@elischer.org cc: Julian Elischer Subject: Re: 1:1 Threading implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 09:23:32 -0000 On Wed, 26 Mar 2003, Marc Olzheim wrote: > On Wed, Mar 26, 2003 at 03:36:57AM -0500, Jeff Roberson wrote: > > First, if your application has more threads than cpus it is written > > incorrectly. For people who are doing thread pools instead of event > > driven IO models they will encounter the same overhead with M:N as 1:1. > > I'm not sure what applications are entirely compute and have more threads > > than cpus. These are the only ones which really theoretically benefit. I > > don't think our threading model should be designed to optimize poorly > > thought out applications. > > Might I suggest that there are 'nice' C++ ways of using thread-classes > where both the usual C++ dogmas of readability and reuseability make you > easily end up with more threads than cpus... > I think that from a userland's point of view, most programmers shouldn't > be caring less about how many cpus the machine has their core is running > on. Sure, but in these cases you're not likely to be using them in performance critical code. Which means you're not likely to be using all of the cpu.. Which means you're going to have to go block in the kernel anyway. And so, really what we're talking about is wasted memory here. Not even many cpu cycles. I think people who actually care about performance don't want the M:N overhead. 1:1 will be faster for them. For the rest, well, they didn't care about performance and so why should we work so hard to make it marginally faster for them? > With this (not limited to) C++ model in mind, the M:N way would be a > great thing to have. > > Zlo > From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 01:33:59 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9542F37B404 for ; Wed, 26 Mar 2003 01:33:59 -0800 (PST) Received: from park.rambler.ru (park.rambler.ru [81.19.64.101]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2663843F85 for ; Wed, 26 Mar 2003 01:33:58 -0800 (PST) (envelope-from is@rambler-co.ru) Received: from is.park.rambler.ru (is.park.rambler.ru [81.19.64.102]) by park.rambler.ru (8.12.6/8.12.6) with ESMTP id h2Q9XqmF056582; Wed, 26 Mar 2003 12:33:52 +0300 (MSK) Date: Wed, 26 Mar 2003 12:33:52 +0300 (MSK) From: Igor Sysoev X-Sender: is@is To: Jeff Roberson In-Reply-To: <20030325214028.K64602-100000@mail.chesapeake.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-25.3 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REPLY_WITH_QUOTES,USER_AGENT_PINE autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@FreeBSD.ORG Subject: Re: 1:1 Threading implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 09:34:00 -0000 On Tue, 25 Mar 2003, Jeff Roberson wrote: > I realize that many people have strong feelings on this topic. I'm asking > everyone up front to try not to devolve this thread into a bikeshed. > > Thanks to the foundation provided by Julian, David Xu, Mini, Dan Eischen, > and everyone else who has participated with KSE and libpthread development > Mini and I have developed a 1:1 threading implementation. This code works > in parallel with KSE and does not break it in any way. It actually helps > bring M:N threading closer by testing out shared bits. I'm very glad to see two kind of the kernel supported threads in FreeBSD. > The condition variables and other blocking situations are handled with > sig*wait*() and a new signal, SIGTHR. There are many reasons that we went > with a signal here. If anyone cares to know them, you may ask. I ask :) > There are only 4 system calls for threading. thr_create, thr_self, > thr_exit, and thr_kill. The rest of the functionality is implemented in a > library that has been heavily hacked up from the original libc_r. I think thr_create() should have a optional capability to create a thread's stack. This allow to save one syscall because otherwise you need to call mmap() or malloc()/sbrk() before the_create(). I think that thr_self() should be implemented in the user land. It's used in pthread_getspecific(), pthread_setspecific(), and gcc3's __thread attribute and can be used very often and should be very cheap. Solaris uses gs register on x86 and %g7 register on Sparc. Linux also uses gs register on x86, other platforms implementation details can be found here - http://people.redhat.com/drepper/tls.pdf Win32 and OS/2 use fs register. As far as I know FreeBSD 4.x uses gs to proc in a kernel and 5.x uses fs for some per-CPU data in a kernel. I think we should use one of these register to point to the thread specific data in the user land. > I'd like to get the signal code commited asap. It's the majority of the > patch and I often have to resolve conflicts. There have been no > regressions in KSE or non threaded applications with this signal code. Did this signal code supports siginfo ? FreeBSD 4.x fills zeros in the most siginfo's fileds. Igor Sysoev http://sysoev.ru/en/ From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 01:49:20 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 226DB37B404 for ; Wed, 26 Mar 2003 01:49:20 -0800 (PST) Received: from heron.mail.pas.earthlink.net (heron.mail.pas.earthlink.net [207.217.120.189]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2A5EE43F3F for ; Wed, 26 Mar 2003 01:49:19 -0800 (PST) (envelope-from tlambert2@mindspring.com) Received: from pool0122.cvx21-bradley.dialup.earthlink.net ([209.179.192.122] helo=mindspring.com) by heron.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128) (Exim 3.33 #1) id 18y7Wt-0001PX-00; Wed, 26 Mar 2003 01:49:08 -0800 Message-ID: <3E817735.A388A41C@mindspring.com> Date: Wed, 26 Mar 2003 01:47:33 -0800 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Jeff Roberson References: <20030326031245.O64602-100000@mail.chesapeake.net> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a4d0af189ff8cbe69aee10b1ddebe4d101350badd9bab72f9c350badd9bab72f9c350badd9bab72f9c X-Spam-Status: No, hits=-21.7 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,QUOTED_EMAIL_TEXT,QUOTE_TWICE_1, RCVD_IN_OSIRUSOFT_COM,REFERENCES,REPLY_WITH_QUOTES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: kse@elischer.org cc: Julian Elischer Subject: Re: 1:1 Threading implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 09:49:22 -0000 Jeff Roberson wrote: > Well, I wasn't doing userland stuff until three days ago. I think mini > has just been very busy with work. I suspect that you're going to need > to start doing userland work or find someone to do it if you want to get > it done soon. In theory, the library API will be identical to the pthreads standard, and will require no changes to programs written to that standard. Most threaded programs these days are written to the standard. Some threaded programs make invalid assumptions about rescheduling following an involuntary context switch, or ability to make particular blocking calls. The first will still be a problem (e.g. Netscape's Java/JavaScript GIF rendering engine is probably still serializing requests due to non-thread reentrancy). The second should not be an issue, either with your implementation of the 1:1, or Jon Mini's implemenetation of the N:M model. > > Wouldn't it have been easier to have one KSEGRP+KSE+thread per user > > thread? Having one ksegrp and many KSEs requires changing the kernel > > code where doing it the other way you could do it without making any > > changes. > > I don't understand? There are relatively minor changes to the kernel to > support this. Since nice is a property of the process, it makes sense > that there is only one ksegrp per process. I'm starting to think that the > ksegrp was overkill in general. The KSEGRP is, effectively, a virtual processor interface, and was/is intended for use by the scheduler to ensure CPU affinity for individual threads, and CPU negaffinity for multiple threads within a process. In other words, according to the published design documents, it's a scheduler artifact. Personally, I've never seen the need for virtual processors, but then I've always advocated "intentional start/intentional migration" for the scheduler model (one of my arguments for a per CPU migration queue, and a push-model, rather than a pull-model for redistribution of an unbalanced load). In a scheduler model where a sheduler *pulls* work, either from another CPU ready-to-run queue, or from a single ready-to-run queue that is global to the system (in either case, requiring locks in the scheduler path, potentially highly contended locks), the idea of a KSEGRP/"virtual processor" is necessary for globally migratable and contendable "bookkeeping" objects. So in the current scheduler implementations, KSEGRP is necessary; in the 1:1 model, it's necessary, if only to ensure negaffinity (4 CPU system, process with 4 threads, ensure each thread gets its own CPU, and does not migrate away from it). It's also minorly useful to distinguish PTHREAD_SCOPE_SYSTEM priority groups, when running multiple threads on a single CPU (either in the common single CPU case, or in the less common SMP case), as a means of avoiding priority inversion deadlocks. I would like to see this done differently, which would get rid of KSEGRP, but would add a scheduler architecture dependency, which I think can't be gotten rid of easily. It's a tradeoff (as usual). > > i.e. on creation of a new process, shced_newproc() is called > > and a KSE is added in there is the scheduler in question wants to use > > KSEs. If it doesn't, no KSE would be added, but it's still possible that > > Yes, I think we need more sched hooks here as well. Having only > sched_fork() makes things sort of gross. We'll have to hook this all up > later. You could also take this idea much further. Specifically, SVR4 flags system calls as "non-blocking", "blocking", and "potentially blocking". By doing this, they can lazy-bind context creation for blocking operations on "blocking" and "potentially blocking" calls, and avoid it altogether on "non-blocking" and sometimes avoid it on "potentially blocking" calls. This can result in a significant overhead savings, if the kernel implementation evolves, but the user space implementation remains fixed. It's good to decouple these things from each other (IMO). > > This was discussed recently as being the highlight of someone's > > threading model (I think Linux but I am not sure who's). > > Yes, linux was discussing this. It's a pretty common trick. Even NT does > it but apparently NT allocates kernel resources for user locks. I was > pretty pleased that I got away without any per lock allocations. Everyone does this. Novell did it back in 1993. Sun's turnstiles are based on the tradeoff between spinning and waiting, and how many times you have to do that before it's worth crossing the protection domain, and blocking. When we did this in 1993 (Novell's implementation was primarily by Dave Hefner, who now works for Microsoft, I believe), we ended up with 20,000 times the transcation per second performance of Tuxedo, which was the commercial record holder up to that point. > > > The reason we're doing this in parallel with the M:N effort is so that we > > > can have reasonable threading sooner. As I stated before, this project is > > > complimentary to KSE and does not prohibit it from working. I also think > > > that the performance will be better or comparable in the majority of real > > > applications. > > > > My only comment is that since mini is supposed to be doing the > > M:N library, isn't this a bit of a distraction? > > I'll let him comment on this. I'll stick my nose in: I think it's a good idea, since TPTB have recently made noises on a couple of FreeBSD lists about "rapidly approaching deadlines for the KSE work". Consider it insurance on your investment, people. > > You should be creating a new KSEGRP (subproc) per thread. > > I think you will find that if you do, things will fall out easier > > and you won't break the next KSE changes. > > I don't understand what I may break? See above for KSEGRP reasoning. I think it's representative, but, if you have time, you may want to read the documentation for the KSE project. If other people want to comment or correct my own comments in this regard (I have been largely an observer, since after the second threads meeting where my async call gate idea was brutally .. uh, "laid to rest" ;^)), they should feel free to do so. > > I'm not against having a separate 1:1 thread capability, but > > all this work could have been well spent getting M:N threads > > better supported and even getting it to > > be able to run in 1:1 mode a s a byproduct.. > > I don't think M:N is the way to go. After looking things over and > considering where it is theoretically faster I do not think it is a > worthwhile pursuit. > > First off, it is many months away from being even beta quality. I think > the UTS is far more complicated than you may realize. There are all sorts > of synchronization issues that it was able to avoid before since only one > thread could run at any time and there essentially was no preemption. It > now also has to deal with effecient scheduling decisions in a M:N model > that it didn't have to worry about before. I would not recommend abandoning the idea, personally. There is a huge -- and I mean *huge* -- amount of literature that likes the N:M model. There is also the fact that affinity and quantum are very hard to maintain on a system with a heterogeneous load. In other words, 1:1 looks good if the only thing you are running is a single multithreaded proces, but looks a *lot* less good when you start running real-world code instead of fictitious benchmarks that try to make your threading look good (e.g. measuring only thread context switches, with no process context switch stall barriers, etc.). > I feel that this is an overwhelming amount of complexity. Because of this > it will be buggy. Sun claims that they still have open tickets on their > M:N while their new 1:1 implementation is totally bug free. How long have > they been doing m:n? I don't think that with our limited resources we're > going to be able to do better. You can't schedule resources. They will work on what they want to, and let anything they don't like just sit there and "rot". The Sun claims are really specious, IMO. They have it working, but how does it compare to, say multiple processes that are sharing descriptor tables, and not much else, in a work-to-do model? I can tell you from personal experience with such a model, that it *VASTLY* outperforms a 1:1 kernel threading model, even if you end up running multiple state-machine instances on multiple CPUs. We got more than a 120X increase in NetWare for UNIX, simply by changing the client dispatch streams MUX to dispatch to worker processes instead of threads, in LIFO instead of FIFO order, simply because it ensured that the process pages you cared about were more likely to be in core. 1:1 threading is useful for one thing, and one thing only: SMP scalability of single image processes. And it's not the best at doing that. > Furthermore, m:n's basic advantage is less overhead from staying out of > the kernel. No, actually, it's the ability to fully utilize a quantum, and to not have to make a decision between one of your own threads and some other process, off the run queue, when making a decision in the scheduler about what to run next. If you favor the threads in your own process, then you potentially starve other processes. If you favor neither, and treat them like processes, you get none of the supposed context switch benefits that were supposedly going to result from using threads instead of processes in the first place. > First, if your application has more threads than cpus it is written > incorrectly. This depends on what those threads are doing. If they are all doing the same work, then yes, you are right. If they are doing different jobs, then you are wrong; even if most of them are doing the same job, and a few of them are doing different jobs, you are still wrong, since job-loading is unlikely to be the same between threads. > For people who are doing thread pools instead of event driven IO > models they will encounter the same overhead with M:N as 1:1. This is actually false. In 1:1, your thread competes with all other processes, in order to be the next at the top of the run queue. Statitically, you are doing more TLB flushes and shootdowns, and more L1 and L2 cache chootdowns, than you would otherwise. Solving this problem without intentional scheduling has been proben to be N-P incomplete: it is not a problem which is solvable in polonomyial time. > I'm not sure what applications are entirely compute and have more threads > than cpus. These are the only ones which really theoretically benefit. I > don't think our threading model should be designed to optimize poorly > thought out applications. By that argument, threads should not be supported at all... 8-) 8-). > This means that the constraints are different from when > this architecture started to come about many (10 or so?) years ago. > Trying to optimize context switches between threads just doesn't make > sense when you do so much work per slice. 5/6 years ago, depending on who you ask. But by your same arguments, CPU clock multipliers have grown to the point that memory bus and I/O bus stalls are so expensive that SMP makes no sense. > Then if you look at the number of system calls and shenanigans a UTS must > do to make proper scheduling decisions it doesn't look like such an > advantage. I agree with this one; my original model avoided the problem entirely by making the POSIX blocking call behaviour a library on to of an sync kernel interface. By doing this, kernel boundary crossings could be minimized automatically. The pthreads code as it has existed so far, also does a lot of unecessary kernel boundary crossings in order to handle signal masking. In fact, you could establish an intermediate handler for all signals at the user threads scheduler level, and never have to worry about most of that crap. I think the kernel boundary crossing overhead, and the fact that, in doing so, you tend to relinquish a significant fraction of remaining quantum (by your own arguments) says that protection domain crossings are to be avoided at all costs. > In short, even if it is marginally faster, it doesn't seem like it is > worth the effort and risk. I don't want to discourage you from trying but > this is why I stopped working on KSE proper and pursued the 1:1 model. I'm glad you pursued it, even though I do not agree with your reasoning on the value of N:M vs. 1:1. I view it as "life insurance" for the KSE code, which some people might be otherwise tempted to rip out over some arbitrary deadline. Thank you for your work here, and thank everyone else for their work, too. -- Terry From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 02:45:21 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 33C1F37B404 for ; Wed, 26 Mar 2003 02:45:21 -0800 (PST) Received: from park.rambler.ru (park.rambler.ru [81.19.64.101]) by mx1.FreeBSD.org (Postfix) with ESMTP id 704EA43F75 for ; Wed, 26 Mar 2003 02:45:19 -0800 (PST) (envelope-from is@rambler-co.ru) Received: from is.park.rambler.ru (is.park.rambler.ru [81.19.64.102]) by park.rambler.ru (8.12.6/8.12.6) with ESMTP id h2QAj2mF058555; Wed, 26 Mar 2003 13:45:02 +0300 (MSK) Date: Wed, 26 Mar 2003 13:45:02 +0300 (MSK) From: Igor Sysoev X-Sender: is@is To: Jeff Roberson In-Reply-To: <20030326031245.O64602-100000@mail.chesapeake.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-25.8 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, QUOTE_TWICE_1,REPLY_WITH_QUOTES,USER_AGENT_PINE autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: kse@elischer.org cc: Julian Elischer Subject: Re: 1:1 Threading implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 10:45:22 -0000 On Wed, 26 Mar 2003, Jeff Roberson wrote: > > > What this means is that for every pthread in an application there is one > > > KSE and thread. There is also only one ksegroup per proc in this model. > > > Since the kernel knows about all threads it handles all scheduling > > > decisions and all signal delivery. I have followed the POSIX spec while > > > implementing the signal code. I would really appreciate review from > > > anyone who is intimately familiar with signals and threads. Included in > > > this is an implementation of sigwait(), sigtimedwait(), and sigwaitinfo(). > > > > Wouldn't it have been easier to have one KSEGRP+KSE+thread per user > > thread? Having one ksegrp and many KSEs requires changing the kernel > > code where doing it the other way you could do it without making any > > changes. > > I don't understand? There are relatively minor changes to the kernel to > support this. Since nice is a property of the process, it makes sense > that there is only one ksegrp per process. I'm starting to think that the > ksegrp was overkill in general. As I understand all KSEs in one KSEGRP have the same priority. If you need several thread priority inside a process you need several KSEGRPs so Julian's suggestion is better. As far as I know KSEGRP has two orthogonal features: 1) it limits number of KSEs to number of CPU; 2) and it set KSE priority. Igor Sysoev http://sysoev.ru/en/ From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 02:58:00 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id EE33637B404 for ; Wed, 26 Mar 2003 02:58:00 -0800 (PST) Received: from mail.chesapeake.net (chesapeake.net [205.130.220.14]) by mx1.FreeBSD.org (Postfix) with ESMTP id E298943F3F for ; Wed, 26 Mar 2003 02:57:59 -0800 (PST) (envelope-from jroberson@chesapeake.net) Received: from localhost (jroberson@localhost) by mail.chesapeake.net (8.11.6/8.11.6) with ESMTP id h2QAvtE36962; Wed, 26 Mar 2003 05:57:55 -0500 (EST) (envelope-from jroberson@chesapeake.net) Date: Wed, 26 Mar 2003 05:57:55 -0500 (EST) From: Jeff Roberson To: Terry Lambert In-Reply-To: <3E817735.A388A41C@mindspring.com> Message-ID: <20030326053115.T64602-100000@mail.chesapeake.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-16.9 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, QUOTE_TWICE_1,REPLY_WITH_QUOTES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: kse@elischer.org cc: Julian Elischer Subject: Re: 1:1 Threading implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 10:58:03 -0000 On Wed, 26 Mar 2003, Terry Lambert wrote: > Jeff Roberson wrote: > > Well, I wasn't doing userland stuff until three days ago. I think mini > > has just been very busy with work. I suspect that you're going to need > > to start doing userland work or find someone to do it if you want to get > > it done soon. > > In theory, the library API will be identical to the pthreads > standard, and will require no changes to programs written to > that standard. Most threaded programs these days are written > to the standard. Some threaded programs make invalid assumptions > about rescheduling following an involuntary context switch, or > ability to make particular blocking calls. I'm not sure what API compatibility has to do with anything? > The first will still be a problem (e.g. Netscape's Java/JavaScript > GIF rendering engine is probably still serializing requests due to > non-thread reentrancy). > > The second should not be an issue, either with your implementation > of the 1:1, or Jon Mini's implemenetation of the N:M model. I'm not sure I know what you're talking about. Blocking calls are either handled by an upcall in M:N or by having independent contexts in 1:1. > > > > > Wouldn't it have been easier to have one KSEGRP+KSE+thread per user > > > thread? Having one ksegrp and many KSEs requires changing the kernel > > > code where doing it the other way you could do it without making any > > > changes. > > > > I don't understand? There are relatively minor changes to the kernel to > > support this. Since nice is a property of the process, it makes sense > > that there is only one ksegrp per process. I'm starting to think that the > > ksegrp was overkill in general. > > The KSEGRP is, effectively, a virtual processor interface, and > was/is intended for use by the scheduler to ensure CPU affinity > for individual threads, and CPU negaffinity for multiple threads > within a process. In other words, according to the published > design documents, it's a scheduler artifact. This is the KSE, not the KSE group. The KSE Group was intended to allow multiple groups of threads with different scheduling algorithms or different base priorities (nice). > Personally, I've never seen the need for virtual processors, but > then I've always advocated "intentional start/intentional migration" > for the scheduler model (one of my arguments for a per CPU migration > queue, and a push-model, rather than a pull-model for redistribution > of an unbalanced load). The push model suffers from as much as a one tick latency in any migration. In many cases probably more than that. The overhead from locking queues is far out weighed by the cases where your cpu is sitting idle due to an unbalanced load. The latency is one tick because each cpu would have to poll the load of other cpus at some interval to discover that it is out of balance. Or it could check if it had more than one process on the run queue, which seems a bit silly. Regardless, you're probably only going to get to make this decisions once a tick which means the other cpu(s) can sit idle for at least that long. Consider a buildworld -j8. You have many processes rapidly stoping and starting. Without a pull a cpu that was very loaded could suddenly end up with no running processes and have to idle until the other gave it work. This imbalance is likely to go back in forth, I have observed this personally when writing ULE. I think you need both push and pull. The pull satisfies the case where you have short lived but rapidly reappearing processes. The push solves more long term load imbalance issues. If you have, for example, many apache processes that are very busy. no cpu will go idle, so pull is ineffective, but they may still be imbalanced. This is still missing from ULE. > In a scheduler model where a sheduler *pulls* work, either from > another CPU ready-to-run queue, or from a single ready-to-run > queue that is global to the system (in either case, requiring > locks in the scheduler path, potentially highly contended locks), > the idea of a KSEGRP/"virtual processor" is necessary for globally > migratable and contendable "bookkeeping" objects. They should only be contended when cpus have nothing to do. A worthwhile tradeoff I'd say. > So in the current scheduler implementations, KSEGRP is necessary; > in the 1:1 model, it's necessary, if only to ensure negaffinity > (4 CPU system, process with 4 threads, ensure each thread gets its > own CPU, and does not migrate away from it). You're talking about the KSE again. I think CPU affinity has little to do with the M:N or 1:1 choice except that it is much more difficult to achieve CPU affinity when you have to make a multitiered scheduling decision. To get real affinity in M:N you need kse to cpu affinity and thread to kse affinity. You also the need userland thread to kernel thread affinity, or at least user land thread to KSE affinity. > It's also minorly useful to distinguish PTHREAD_SCOPE_SYSTEM > priority groups, when running multiple threads on a single CPU > (either in the common single CPU case, or in the less common SMP > case), as a means of avoiding priority inversion deadlocks. I > would like to see this done differently, which would get rid of > KSEGRP, but would add a scheduler architecture dependency, which > I think can't be gotten rid of easily. It's a tradeoff (as usual). > > > > > i.e. on creation of a new process, shced_newproc() is called > > > and a KSE is added in there is the scheduler in question wants to use > > > KSEs. If it doesn't, no KSE would be added, but it's still possible that > > > > Yes, I think we need more sched hooks here as well. Having only > > sched_fork() makes things sort of gross. We'll have to hook this all up > > later. > > > You could also take this idea much further. Specifically, SVR4 > flags system calls as "non-blocking", "blocking", and "potentially > blocking". By doing this, they can lazy-bind context creation for > blocking operations on "blocking" and "potentially blocking" calls, > and avoid it altogether on "non-blocking" and sometimes avoid it on > "potentially blocking" calls. KSE already does better than this by only creating a new context when you actually block. The upcall mechanism specifically addresses that need. This is seperate from what we were discussing above which is allowing the scheduler to have a chance to initialize data when a new context is created. > This can result in a significant overhead savings, if the kernel > implementation evolves, but the user space implementation remains > fixed. > > It's good to decouple these things from each other (IMO). Which things? > > > > This was discussed recently as being the highlight of someone's > > > threading model (I think Linux but I am not sure who's). > > > > Yes, linux was discussing this. It's a pretty common trick. Even NT does > > it but apparently NT allocates kernel resources for user locks. I was > > pretty pleased that I got away without any per lock allocations. > > Everyone does this. Novell did it back in 1993. Sun's turnstiles > are based on the tradeoff between spinning and waiting, and how > many times you have to do that before it's worth crossing the > protection domain, and blocking. I think you mean sun's adaptive mutexes. The turnstile is just the queue that you block on if I'm remembering correctly. The blocking queue I used for umtx is a similar context where the queue migrates among the blocking threads. > When we did this in 1993 (Novell's implementation was primarily > by Dave Hefner, who now works for Microsoft, I believe), we ended Any relation to hugh? > up with 20,000 times the transcation per second performance of > Tuxedo, which was the commercial record holder up to that point. Sounds good. > > > > > The reason we're doing this in parallel with the M:N effort is so that we > > > > can have reasonable threading sooner. As I stated before, this project is > > > > complimentary to KSE and does not prohibit it from working. I also think > > > > that the performance will be better or comparable in the majority of real > > > > applications. > > > > > > My only comment is that since mini is supposed to be doing the > > > M:N library, isn't this a bit of a distraction? > > > > I'll let him comment on this. > > I'll stick my nose in: I think it's a good idea, since TPTB have > recently made noises on a couple of FreeBSD lists about "rapidly > approaching deadlines for the KSE work". > > Consider it insurance on your investment, people. Yes, it isn't necessarily a KSE replacement. > > > > You should be creating a new KSEGRP (subproc) per thread. > > > I think you will find that if you do, things will fall out easier > > > and you won't break the next KSE changes. > > > > I don't understand what I may break? > > See above for KSEGRP reasoning. I think it's representative, > but, if you have time, you may want to read the documentation > for the KSE project. If other people want to comment or correct > my own comments in this regard (I have been largely an observer, > since after the second threads meeting where my async call gate > idea was brutally .. uh, "laid to rest" ;^)), they should feel > free to do so. > > > > > I'm not against having a separate 1:1 thread capability, but > > > all this work could have been well spent getting M:N threads > > > better supported and even getting it to > > > be able to run in 1:1 mode a s a byproduct.. > > > > I don't think M:N is the way to go. After looking things over and > > considering where it is theoretically faster I do not think it is a > > worthwhile pursuit. > > > > First off, it is many months away from being even beta quality. I think > > the UTS is far more complicated than you may realize. There are all sorts > > of synchronization issues that it was able to avoid before since only one > > thread could run at any time and there essentially was no preemption. It > > now also has to deal with effecient scheduling decisions in a M:N model > > that it didn't have to worry about before. > > I would not recommend abandoning the idea, personally. There is a > huge -- and I mean *huge* -- amount of literature that likes the > N:M model. > > There is also the fact that affinity and quantum are very hard to > maintain on a system with a heterogeneous load. In other words, > 1:1 looks good if the only thing you are running is a single > multithreaded proces, but looks a *lot* less good when you start > running real-world code instead of fictitious benchmarks that > try to make your threading look good (e.g. measuring only thread > context switches, with no process context switch stall barriers, > etc.). Yes, I see what you're getting at. M:N allows you to keep running until you've exhausted your whole slice by selecting another thread. You could acomplish this in 1:1 by loaning your slice to the next available thread that was bound to the same cpu and force a switch to that. That's a neat idea. I'll have to look into this for ule. > > > I feel that this is an overwhelming amount of complexity. Because of this > > it will be buggy. Sun claims that they still have open tickets on their > > M:N while their new 1:1 implementation is totally bug free. How long have > > they been doing m:n? I don't think that with our limited resources we're > > going to be able to do better. > > You can't schedule resources. They will work on what they want > to, and let anything they don't like just sit there and "rot". > > The Sun claims are really specious, IMO. They have it working, > but how does it compare to, say multiple processes that are > sharing descriptor tables, and not much else, in a work-to-do > model? > > I can tell you from personal experience with such a model, that > it *VASTLY* outperforms a 1:1 kernel threading model, even if you > end up running multiple state-machine instances on multiple CPUs. > We got more than a 120X increase in NetWare for UNIX, simply by > changing the client dispatch streams MUX to dispatch to worker > processes instead of threads, in LIFO instead of FIFO order, > simply because it ensured that the process pages you cared about > were more likely to be in core. Yeah, the LIFO trick is widely used. I believe apache does something of this sort. It's also discussed on the c10k problem page. I'm not sure why you got better perf out of processes than threads though. This is sort of confusing. > 1:1 threading is useful for one thing, and one thing only: SMP > scalability of single image processes. And it's not the best at > doing that. It's also good at providing extra contexts to block on for IO worker threads. Furthermore, It's really good at being implemented quickly, which is especially important considering that it's 2003 and we don't have kernel supported threads... > > Furthermore, m:n's basic advantage is less overhead from staying out of > > the kernel. > > No, actually, it's the ability to fully utilize a quantum, and > to not have to make a decision between one of your own threads > and some other process, off the run queue, when making a decision > in the scheduler about what to run next. Yeah, I just remembered this bit. See my answer above. I think I'll do this trick in ULE. > If you favor the threads in your own process, then you potentially > starve other processes. > > If you favor neither, and treat them like processes, you get none > of the supposed context switch benefits that were supposedly going > to result from using threads instead of processes in the first place. > > > > First, if your application has more threads than cpus it is written > > incorrectly. > > This depends on what those threads are doing. If they are all doing > the same work, then yes, you are right. If they are doing different > jobs, then you are wrong; even if most of them are doing the same job, > and a few of them are doing different jobs, you are still wrong, since > job-loading is unlikely to be the same between threads. > > > > For people who are doing thread pools instead of event driven IO > > models they will encounter the same overhead with M:N as 1:1. > > This is actually false. In 1:1, your thread competes with all > other processes, in order to be the next at the top of the run > queue. Statitically, you are doing more TLB flushes and shootdowns, > and more L1 and L2 cache chootdowns, than you would otherwise. This is the same argument about using your whole slice eh? > Solving this problem without intentional scheduling has been > proben to be N-P incomplete: it is not a problem which is > solvable in polonomyial time. eh? Which problem is NP? > > > I'm not sure what applications are entirely compute and have more threads > > than cpus. These are the only ones which really theoretically benefit. I > > don't think our threading model should be designed to optimize poorly > > thought out applications. > > By that argument, threads should not be supported at all... 8-) 8-). I meant to say 'entirely compute bound'. If you just want CPU and no IO then you probably only want as many threads as processors. This is the most effecient arrangement. I'm not arguing against threads although I do think they are often abused. > > > This means that the constraints are different from when > > this architecture started to come about many (10 or so?) years ago. > > Trying to optimize context switches between threads just doesn't make > > sense when you do so much work per slice. > > 5/6 years ago, depending on who you ask. > > But by your same arguments, CPU clock multipliers have grown > to the point that memory bus and I/O bus stalls are so > expensive that SMP makes no sense. I migh agree with you there. > > > Then if you look at the number of system calls and shenanigans a UTS must > > do to make proper scheduling decisions it doesn't look like such an > > advantage. > > I agree with this one; my original model avoided the problem > entirely by making the POSIX blocking call behaviour a library > on to of an sync kernel interface. By doing this, kernel > boundary crossings could be minimized automatically. > > The pthreads code as it has existed so far, also does a lot of > unecessary kernel boundary crossings in order to handle signal > masking. In fact, you could establish an intermediate handler > for all signals at the user threads scheduler level, and never > have to worry about most of that crap. > > I think the kernel boundary crossing overhead, and the fact > that, in doing so, you tend to relinquish a significant > fraction of remaining quantum (by your own arguments) says > that protection domain crossings are to be avoided at all costs. Yes, I agree, and without serious tweaking our current M:N significantly increases the number of system calls. > > > In short, even if it is marginally faster, it doesn't seem like it is > > worth the effort and risk. I don't want to discourage you from trying but > > this is why I stopped working on KSE proper and pursued the 1:1 model. > > I'm glad you pursued it, even though I do not agree with your > reasoning on the value of N:M vs. 1:1. I view it as "life > insurance" for the KSE code, which some people might be > otherwise tempted to rip out over some arbitrary deadline. > > Thank you for your work here, and thank everyone else for > their work, too. > > -- Terry > Thanks for the feedback. It has been stimulating. I still need to consider multithreading implications of 1:1 for ULE. This has given me a bit more to work on there. Cheers, Jeff From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 06:31:22 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 40E9437B404 for ; Wed, 26 Mar 2003 06:31:22 -0800 (PST) Received: from fledge.watson.org (fledge.watson.org [204.156.12.50]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0532743F75 for ; Wed, 26 Mar 2003 06:31:21 -0800 (PST) (envelope-from robert@fledge.watson.org) Received: from fledge.watson.org (fledge.pr.watson.org [192.0.2.3]) by fledge.watson.org (8.12.8/8.12.8) with SMTP id h2QEVDjK017662; Wed, 26 Mar 2003 09:31:14 -0500 (EST) (envelope-from robert@fledge.watson.org) Date: Wed, 26 Mar 2003 09:31:13 -0500 (EST) From: Robert Watson X-Sender: robert@fledge.watson.org To: Jeff Roberson In-Reply-To: <20030325214028.K64602-100000@mail.chesapeake.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-25.3 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REPLY_WITH_QUOTES,USER_AGENT_PINE autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org Subject: Re: 1:1 Threading implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 14:31:23 -0000 On Tue, 25 Mar 2003, Jeff Roberson wrote: > Thanks to the foundation provided by Julian, David Xu, Mini, Dan > Eischen, and everyone else who has participated with KSE and libpthread > development Mini and I have developed a 1:1 threading implementation. > This code works in parallel with KSE and does not break it in any way. > It actually helps bring M:N threading closer by testing out shared bits. My feeling is that this is an excellent strategy to get us productionable kernel-supported threads for the upcoming 5.x release while permitting continued R&D (and I think it is R&D) into the M:N threading possibilities. One nice thing about this construction is that the cost was very low given the existing investment in KSE, yet the payoff is very high. And it will provide a nice migration path when KSE is productionable for sites interested in doing that: thread-reliant applications will no longer be explicitly linked against a non-native threading package (linuxthreads), which is the status quo for large threaded applications on FreeBSD right now. So it seems to me that a relatively straight-forward strategy gets things moving: - Get review, testing, and commit this work in short order, and get the native threaded support in use. This will improve support for applications like Apache2, MySQL, Open Office, Mozzila, etc, with an immediate impact on performance, interactiveity, and throughput for these systems, especially for disk I/O intensive activities. Getting it in faster will dramatically increase the chances of fully productionable native threads for FreeBSD 5.1. - We'll also be able to get services like threaded debugging, etc, up more easily with this model in the short term, as well as learn a lot more about their interactions with threads and what the desired semantics are. This work should have a pay-off for M:N threads easily as well. - Allow the libkse work to continue over the longer term, and make it easier to "plug and play" threading since large threaded apps can use either library trivially through library renaming. The exposed API is presumably POSIX, and the ABI to the application should be identical. Any test suites working at the pthreads layer should also immediately carry over. And we've gained some expertise. :-) I think one important thing this will address, and Terry has alluded to it, is the perception that higher performance threading support is stalled, and therefore standing in the way of other work. We have consumers today who desperately need improved threading support: and they will benefit a lot from 1:1 in the short term. They may well benefit more from M:N in the long term, but I agree that I've had similar concerns about the scope of the userspace work remaining to be done, especially from my conversations with Jon Mini. We may have underestimated this task substantially; while it could be it falls out naturally, Terry's notion of "an insurance policy" is far from a bad one. And since this doesn't impede KSE (and builds so nicely off of the substantial KSE investment), the trade-off seems good. Thanks (to you, but also to Julian, David, and everyone else who has invested so much into KSE!) Robert N M Watson FreeBSD Core Team, TrustedBSD Projects robert@fledge.watson.org Network Associates Laboratories From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 08:52:33 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2FE5A37B404 for ; Wed, 26 Mar 2003 08:52:33 -0800 (PST) Received: from smtp-relay.omnis.com (smtp-relay.omnis.com [216.239.128.27]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7936743FBF for ; Wed, 26 Mar 2003 08:52:32 -0800 (PST) (envelope-from wes@softweyr.com) Received: from softweyr.homeunix.net (66-91-236-204.san.rr.com [66.91.236.204]) by smtp-relay.omnis.com (Postfix) with ESMTP id E824F431BF; Wed, 26 Mar 2003 08:52:29 -0800 (PST) From: Wes Peters Organization: Softweyr To: Poul-Henning Kamp , arch@freebsd.org Date: Wed, 26 Mar 2003 08:52:25 -0800 User-Agent: KMail/1.5 References: <5238.1048510775@critter.freebsd.dk> In-Reply-To: <5238.1048510775@critter.freebsd.dk> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200303260852.25978.wes@softweyr.com> X-Spam-Status: No, hits=-26.1 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REFERENCES,REPLY_WITH_QUOTES,USER_AGENT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Subject: Re: moving GEOM around... X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 16:52:34 -0000 On Monday 24 March 2003 04:59, Poul-Henning Kamp wrote: > A number of people have suggested that the directory layout of GEOM > sources should be changed. The main complaint seems to be that > sys/geom contains both subdirectories (bde) and source files. > > I personally don't particularly care about that, and as a matter > of fact wasn't even aware that was a rule, but if a significant > number of people think this is wrong I'm willing to repo-copy things > around and fix it, therefore this strawpoll: > > Option 1: No change I'll take door number 1, Mr. Kamp. -- Where am I, and what am I doing in this handbasket? Wes Peters wes@softweyr.com From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 08:58:40 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3571437B404; Wed, 26 Mar 2003 08:58:40 -0800 (PST) Received: from smtp-relay.omnis.com (smtp-relay.omnis.com [216.239.128.27]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8179C43F93; Wed, 26 Mar 2003 08:58:39 -0800 (PST) (envelope-from wes@softweyr.com) Received: from softweyr.homeunix.net (66-91-236-204.san.rr.com [66.91.236.204]) by smtp-relay.omnis.com (Postfix) with ESMTP id 7ADF6436D4; Wed, 26 Mar 2003 08:58:38 -0800 (PST) From: Wes Peters Organization: Softweyr To: John Baldwin Date: Wed, 26 Mar 2003 08:58:37 -0800 User-Agent: KMail/1.5 References: In-Reply-To: MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200303260858.37039.wes@softweyr.com> X-Spam-Status: No, hits=-26.1 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REFERENCES,REPLY_WITH_QUOTES,USER_AGENT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: Poul-Henning Kamp cc: freebsd-arch@freebsd.org Subject: Re: Patch to protect process from pageout killing X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 16:58:42 -0000 On Tuesday 25 March 2003 08:34, John Baldwin wrote: > On 25-Mar-2003 Wes Peters wrote: > > On Monday 24 March 2003 08:36, Poul-Henning Kamp wrote: > >> Also, doesn't this result in the flag being inerited with fork() and > >> thereby negating the effect you are seeking for squid ? > > > > I looked through all the places in kern_fork.c where p2->p_flag gets > > set and didn't see anything that looked like it would inherit > > P_PROTECTED from p1->p_flag. Did I miss something? I'm obviously a > > bit of a neophyte in this part of the kernel. > > rlimit's are inherited. However, due to a "feature" bug in your patch, > the P_PROTECTED flag doesn't get turned on when the rlimit is inherited > in fork1(). feature bug? If you mean the fact that the setting for P_PROTECTED isn't stored in the rlimit, that was intentional. rlimits are inherited and I specifically didn't want that behavior, similar to p_cpulimit. I still agree resource limits are not an ideal interface to use for this, I'll look further. -- Where am I, and what am I doing in this handbasket? Wes Peters wes@softweyr.com From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 09:13:32 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5632B37B408 for ; Wed, 26 Mar 2003 09:13:32 -0800 (PST) Received: from mail.speakeasy.net (mail14.speakeasy.net [216.254.0.214]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6870643F85 for ; Wed, 26 Mar 2003 09:13:31 -0800 (PST) (envelope-from jhb@FreeBSD.org) Received: (qmail 24290 invoked from network); 26 Mar 2003 17:13:35 -0000 Received: from unknown (HELO server.baldwin.cx) ([216.27.160.63]) (envelope-sender )encrypted SMTP for ; 26 Mar 2003 17:13:35 -0000 Received: from laptop.baldwin.cx (gw1.twc.weather.com [216.133.140.1]) by server.baldwin.cx (8.12.8/8.12.8) with ESMTP id h2QHDPOv099395; Wed, 26 Mar 2003 12:13:26 -0500 (EST) (envelope-from jhb@FreeBSD.org) Message-ID: X-Mailer: XFMail 1.5.4 on FreeBSD X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: <200303260858.37039.wes@softweyr.com> Date: Wed, 26 Mar 2003 12:13:25 -0500 (EST) From: John Baldwin To: Wes Peters X-Spam-Status: No, hits=-19.5 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REPLY_WITH_QUOTES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: Poul-Henning Kamp cc: freebsd-arch@freebsd.org Subject: Re: Patch to protect process from pageout killing X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 17:13:33 -0000 On 26-Mar-2003 Wes Peters wrote: > On Tuesday 25 March 2003 08:34, John Baldwin wrote: >> On 25-Mar-2003 Wes Peters wrote: >> > On Monday 24 March 2003 08:36, Poul-Henning Kamp wrote: >> >> Also, doesn't this result in the flag being inerited with fork() and >> >> thereby negating the effect you are seeking for squid ? >> > >> > I looked through all the places in kern_fork.c where p2->p_flag gets >> > set and didn't see anything that looked like it would inherit >> > P_PROTECTED from p1->p_flag. Did I miss something? I'm obviously a >> > bit of a neophyte in this part of the kernel. >> >> rlimit's are inherited. However, due to a "feature" bug in your patch, >> the P_PROTECTED flag doesn't get turned on when the rlimit is inherited >> in fork1(). > > feature bug? If you mean the fact that the setting for P_PROTECTED isn't > stored in the rlimit, that was intentional. rlimits are inherited and I > specifically didn't want that behavior, similar to p_cpulimit. I still > agree resource limits are not an ideal interface to use for this, I'll > look further. I mean that you should be setting P_PROTECTED in fork() based on the inherited rlimit's since otherwise the value of the rlimit is out of sync with the P_PROTECTED flag. Hence a bug. However, since non- inheritance is the desired behavior, it is also a feature, hence "feature" bug. -- John Baldwin <>< http://www.FreeBSD.org/~jhb/ "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 09:20:12 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 35C0937B404; Wed, 26 Mar 2003 09:20:12 -0800 (PST) Received: from magic.adaptec.com (magic-mail.adaptec.com [208.236.45.100]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3210C43F93; Wed, 26 Mar 2003 09:20:11 -0800 (PST) (envelope-from scott_long@btc.adaptec.com) Received: from redfish.adaptec.com (redfish.adaptec.com [162.62.50.11]) by magic.adaptec.com (8.11.6/8.11.6) with ESMTP id h2QHJIl02754; Wed, 26 Mar 2003 09:19:18 -0800 Received: from btc.btc.adaptec.com (btc.btc.adaptec.com [10.100.0.52]) by redfish.adaptec.com (8.8.8+Sun/8.8.8) with ESMTP id JAA00985; Wed, 26 Mar 2003 09:20:04 -0800 (PST) Received: from btc.adaptec.com (hollin [10.100.253.56]) by btc.btc.adaptec.com (8.8.8+Sun/8.8.8) with ESMTP id KAA08644; Wed, 26 Mar 2003 10:20:00 -0700 (MST) Message-ID: <3E81E142.3040907@btc.adaptec.com> Date: Wed, 26 Mar 2003 10:20:02 -0700 From: Scott Long User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.2.1) Gecko/20030206 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Robert Watson References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, hits=-31.9 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES, REPLY_WITH_QUOTES,USER_AGENT_MOZILLA_UA autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org Subject: Re: 1:1 Threading implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 17:20:13 -0000 Robert Watson wrote: > On Tue, 25 Mar 2003, Jeff Roberson wrote: > > >>Thanks to the foundation provided by Julian, David Xu, Mini, Dan >>Eischen, and everyone else who has participated with KSE and libpthread >>development Mini and I have developed a 1:1 threading implementation. >>This code works in parallel with KSE and does not break it in any way. >>It actually helps bring M:N threading closer by testing out shared bits. > > > My feeling is that this is an excellent strategy to get us productionable > kernel-supported threads for the upcoming 5.x release while permitting > continued R&D (and I think it is R&D) into the M:N threading > possibilities. One nice thing about this construction is that the cost > was very low given the existing investment in KSE, yet the payoff is very > high. And it will provide a nice migration path when KSE is > productionable for sites interested in doing that: thread-reliant > applications will no longer be explicitly linked against a non-native > threading package (linuxthreads), which is the status quo for large > threaded applications on FreeBSD right now. So it seems to me that a > relatively straight-forward strategy gets things moving: > [...] I'd like to add a big 'Me too' here also. 1:1 gives us an excellent milestone towards having KSE work for 5-STABLE. The June 30 deadline for KSE has been quickly approaching, and this work achieves all of the minimum objectives that we were aiming for by that date. I see this as a win-win for everyone; application threading is vastly improved, the existing KSE work gets real-world testing/exposure/validation, and the M:N work can now procede without any of the pressure and stress that it had before. In the spirit that FreeBSD is as much for research as it is for production, it's important to remember that M:N should be kept around and a research project through the RELENG_5 branch and into 6-CURRENT. Once it is stable and proven, we can look at backporting it into 5-STABLE. Overall, I'm incredibly pleased by this work! This is a major milestone for 5-STABLE, and one that will make it a worthwhile branch. Scott From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 10:52:25 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3AC9737B404 for ; Wed, 26 Mar 2003 10:52:25 -0800 (PST) Received: from net1.gendyn.com (gate1.gendyn.com [204.60.171.22]) by mx1.FreeBSD.org (Postfix) with ESMTP id 784D943F93 for ; Wed, 26 Mar 2003 10:52:24 -0800 (PST) (envelope-from eischen@vigrid.com) Received: from [153.11.11.3] (helo=ebnext01) by net1.gendyn.com with esmtp (Exim 2.12 #1) id 18yG0T-0000yx-00 for arch@FreeBSD.org; Wed, 26 Mar 2003 13:52:13 -0500 Received: from clcrtr.gdeb.com ([153.11.109.11]) by ebnext01 with SMTP id h2QIqAt8022990; Wed, 26 Mar 2003 13:52:10 -0500 Received: from vigrid.com (gpz.clc.gdeb.com [192.168.3.12]) by clcrtr.gdeb.com (8.11.4/8.11.4) with ESMTP id h2O32Bq03378; Sun, 23 Mar 2003 22:02:22 -0500 (EST) (envelope-from eischen@vigrid.com) Sender: eghk@clcrtr.gdeb.com Message-ID: <3E81F6BB.BFFE3F33@vigrid.com> Date: Wed, 26 Mar 2003 13:51:39 -0500 From: Daniel Eischen X-Mailer: Mozilla 4.78 [en] (X11; U; SunOS 5.9 sun4u) X-Accept-Language: en MIME-Version: 1.0 To: arch@FreeBSD.org Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Status: No, hits=0.0 required=5.0 tests=none version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: kse@elischer.org Subject: Not providing static libraries (libkse/libpthread) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 18:52:27 -0000 Is there a good reason for providing static libraries for libpthread/libkse? I'd like to not support them to get rid of some hacks to make sure certain symbols are present in the static library case. -- Dan Eischen From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 11:06:22 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8D25F37B404 for ; Wed, 26 Mar 2003 11:06:22 -0800 (PST) Received: from gw.nectar.cc (gw.nectar.cc [208.42.49.153]) by mx1.FreeBSD.org (Postfix) with ESMTP id F19F643F3F for ; Wed, 26 Mar 2003 11:06:21 -0800 (PST) (envelope-from nectar@celabo.org) Received: from madman.celabo.org (madman.celabo.org [10.0.1.111]) by gw.nectar.cc (Postfix) with ESMTP id 7214451; Wed, 26 Mar 2003 13:06:21 -0600 (CST) Received: by madman.celabo.org (Postfix, from userid 1001) id 5A75278C43; Wed, 26 Mar 2003 13:06:21 -0600 (CST) Date: Wed, 26 Mar 2003 13:06:21 -0600 From: "Jacques A. Vidrine" To: Daniel Eischen Message-ID: <20030326190621.GB34946@madman.celabo.org> References: <3E81F6BB.BFFE3F33@vigrid.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3E81F6BB.BFFE3F33@vigrid.com> X-Url: http://www.celabo.org/ User-Agent: Mutt/1.5.3i-ja.1 X-Spam-Status: No, hits=-30.5 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,REFERENCES, REPLY_WITH_QUOTES,USER_AGENT_MUTT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@FreeBSD.org cc: kse@elischer.org Subject: Re: Not providing static libraries (libkse/libpthread) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 19:06:23 -0000 On Wed, Mar 26, 2003 at 01:51:39PM -0500, Daniel Eischen wrote: > Is there a good reason for providing static libraries for > libpthread/libkse? I'd like to not support them to get > rid of some hacks to make sure certain symbols are present > in the static library case. That would make static linking threaded applications impossible, no? While I wouldn't mind seeing the whole system move to being dynamically linked, I sure don't feel well about deprecating static linking completely. (No threads for static binaries is very close to `deprecating completely' to me.) Cheers, -- Jacques A. Vidrine http://www.celabo.org/ NTT/Verio SME . FreeBSD UNIX . Heimdal Kerberos jvidrine@verio.net . nectar@FreeBSD.org . nectar@kth.se From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 11:11:01 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id EBBC137B404; Wed, 26 Mar 2003 11:11:00 -0800 (PST) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id F123F43FBD; Wed, 26 Mar 2003 11:10:59 -0800 (PST) (envelope-from eischen@pcnet1.pcnet.com) Received: from pcnet1.pcnet.com (localhost [127.0.0.1]) by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h2QJAqBg005457; Wed, 26 Mar 2003 14:10:52 -0500 (EST) Received: from localhost (eischen@localhost)h2QJAqmK005454; Wed, 26 Mar 2003 14:10:52 -0500 (EST) Date: Wed, 26 Mar 2003 14:10:52 -0500 (EST) From: Daniel Eischen To: "Jacques A. Vidrine" In-Reply-To: <20030326190621.GB34946@madman.celabo.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-25.3 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REPLY_WITH_QUOTES,USER_AGENT_PINE autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@FreeBSD.org cc: kse@elischer.org Subject: Re: Not providing static libraries (libkse/libpthread) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 19:11:32 -0000 On Wed, 26 Mar 2003, Jacques A. Vidrine wrote: > On Wed, Mar 26, 2003 at 01:51:39PM -0500, Daniel Eischen wrote: > > Is there a good reason for providing static libraries for > > libpthread/libkse? I'd like to not support them to get > > rid of some hacks to make sure certain symbols are present > > in the static library case. > > That would make static linking threaded applications impossible, no? Correct. Solaris does not provide static libthread/libpthread. > While I wouldn't mind seeing the whole system move to being > dynamically linked, I sure don't feel well about deprecating static > linking completely. (No threads for static binaries is very close to > `deprecating completely' to me.) Yup. That's what I'm advocating. -- Dan Eischen From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 11:36:00 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 7C65237B404 for ; Wed, 26 Mar 2003 11:36:00 -0800 (PST) Received: from ns1.xcllnt.net (209-128-86-226.bayarea.net [209.128.86.226]) by mx1.FreeBSD.org (Postfix) with ESMTP id E269243F85 for ; Wed, 26 Mar 2003 11:35:58 -0800 (PST) (envelope-from marcel@xcllnt.net) Received: from dhcp01.pn.xcllnt.net (dhcp01.pn.xcllnt.net [192.168.4.201]) by ns1.xcllnt.net (8.12.8/8.12.8) with ESMTP id h2QJZOKu025262; Wed, 26 Mar 2003 11:35:24 -0800 (PST) (envelope-from marcel@piii.pn.xcllnt.net) Received: from dhcp01.pn.xcllnt.net (localhost [127.0.0.1]) by dhcp01.pn.xcllnt.net (8.12.8/8.12.8) with ESMTP id h2QJZOBm011460; Wed, 26 Mar 2003 11:35:24 -0800 (PST) (envelope-from marcel@dhcp01.pn.xcllnt.net) Received: (from marcel@localhost) by dhcp01.pn.xcllnt.net (8.12.8/8.12.8/Submit) id h2QJZO77011459; Wed, 26 Mar 2003 11:35:24 -0800 (PST) Date: Wed, 26 Mar 2003 11:35:24 -0800 From: Marcel Moolenaar To: Daniel Eischen Message-ID: <20030326193524.GA11320@dhcp01.pn.xcllnt.net> References: <3E81F6BB.BFFE3F33@vigrid.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3E81F6BB.BFFE3F33@vigrid.com> User-Agent: Mutt/1.5.3i X-Spam-Status: No, hits=-30.9 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,REFERENCES, REPLY_WITH_QUOTES,USER_AGENT_MUTT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: kse@elischer.org Subject: Re: Not providing static libraries (libkse/libpthread) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 19:36:03 -0000 On Wed, Mar 26, 2003 at 01:51:39PM -0500, Daniel Eischen wrote: > Is there a good reason for providing static libraries for > libpthread/libkse? I'd like to not support them to get > rid of some hacks to make sure certain symbols are present > in the static library case. I the maintenance cost is low and the hacks are not in the way of progress I think we should keep the static libraries. I think we're throwing something away too carelessly otherwise. For example, the access sequences generated by compilers for variables that have the __thread attribute do really suck for when code is to be generated for dynamic linking. The access sequences in the static case are superior. The performance gain is significant if one can build a complete multi-threaded application. -- Marcel Moolenaar USPA: A-39004 marcel@xcllnt.net From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 11:42:29 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6679E37B404 for ; Wed, 26 Mar 2003 11:42:29 -0800 (PST) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8D05843F3F for ; Wed, 26 Mar 2003 11:42:28 -0800 (PST) (envelope-from eischen@pcnet1.pcnet.com) Received: from pcnet1.pcnet.com (localhost [127.0.0.1]) by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h2QJgNBg009862; Wed, 26 Mar 2003 14:42:23 -0500 (EST) Received: from localhost (eischen@localhost)h2QJgM6o009859; Wed, 26 Mar 2003 14:42:22 -0500 (EST) Date: Wed, 26 Mar 2003 14:42:22 -0500 (EST) From: Daniel Eischen To: Marcel Moolenaar In-Reply-To: <20030326193524.GA11320@dhcp01.pn.xcllnt.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-25.3 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REPLY_WITH_QUOTES,USER_AGENT_PINE autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: kse@elischer.org Subject: Re: Not providing static libraries (libkse/libpthread) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 19:42:31 -0000 On Wed, 26 Mar 2003, Marcel Moolenaar wrote: > On Wed, Mar 26, 2003 at 01:51:39PM -0500, Daniel Eischen wrote: > > Is there a good reason for providing static libraries for > > libpthread/libkse? I'd like to not support them to get > > rid of some hacks to make sure certain symbols are present > > in the static library case. > > I the maintenance cost is low and the hacks are not in the way > of progress I think we should keep the static libraries. I think > we're throwing something away too carelessly otherwise. > > For example, the access sequences generated by compilers for > variables that have the __thread attribute do really suck for > when code is to be generated for dynamic linking. The access > sequences in the static case are superior. The performance > gain is significant if one can build a complete multi-threaded > application. Solaris and IRIX don't seem to provide static thread libraries. Does anyone know if Linux does? -- Dan Eischen From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 11:48:39 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 1ACF337B404 for ; Wed, 26 Mar 2003 11:48:39 -0800 (PST) Received: from fledge.watson.org (fledge.watson.org [204.156.12.50]) by mx1.FreeBSD.org (Postfix) with ESMTP id 45A5A43FA3 for ; Wed, 26 Mar 2003 11:48:36 -0800 (PST) (envelope-from robert@fledge.watson.org) Received: from fledge.watson.org (fledge.pr.watson.org [192.0.2.3]) by fledge.watson.org (8.12.8/8.12.8) with SMTP id h2QJmWjK024258 for ; Wed, 26 Mar 2003 14:48:32 -0500 (EST) (envelope-from robert@fledge.watson.org) Date: Wed, 26 Mar 2003 14:48:31 -0500 (EST) From: Robert Watson X-Sender: robert@fledge.watson.org To: arch@FreeBSD.org Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-17.0 required=5.0 tests=AWL,PATCH_UNIFIED_DIFF,USER_AGENT_PINE autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Subject: M_NOWAIT failure handling -- not so very rosy picture X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 19:48:40 -0000 I'm running a diskless system with the attached patch; the results have not been so very pleasing. I'm collecting a set of panics and traces to mail out to relevant developers, but it does give one pause. A related patch for the mbuf allocator would probably also give some interesting results. The patch is far from perfect, but has been enough to result in some interesting scenarios. # sysctl debug.malloc_failure_rate=10 # Fail one in ten Some things to try that I've bumped into so far: # sysctl -a > /dev/null # mdconfig -a -s 5m -t malloc # dd if=/dev/zero of=/dev/md0 # newfs /dev/md0 Both of these seem to be storage-related and I've e-mailed phk about them, but I suspect there are a lot of others hanging around. Robert N M Watson FreeBSD Core Team, TrustedBSD Projects robert@fledge.watson.org Network Associates Laboratories Index: kern_malloc.c =================================================================== RCS file: /home/ncvs/src/sys/kern/kern_malloc.c,v retrieving revision 1.119 diff -u -r1.119 kern_malloc.c --- kern_malloc.c 10 Mar 2003 20:24:54 -0000 1.119 +++ kern_malloc.c 26 Mar 2003 19:27:28 -0000 @@ -138,6 +138,20 @@ /* time_uptime of last malloc(9) failure */ static time_t t_malloc_fail; +#ifdef MALLOC_MAKE_FAILURES +/* + * Cause malloc failures ever (n) mallocs with M_NOWAIT. If set to 0, + * don't cause failures. + */ +static int malloc_failure_rate; +static int malloc_nowait_count; +static int malloc_failure_count; +SYSCTL_INT(_debug, OID_AUTO, malloc_failure_rate, CTLFLAG_RW, + &malloc_failure_rate, 0, "Every (n) mallocs with M_NOWAIT will fail"); +SYSCTL_INT(_debug, OID_AUTO, malloc_failure_count, CTLFLAG_RD, + &malloc_failure_count, 0, "Number of imposed malloc failures"); +#endif + int malloc_last_fail(void) { @@ -187,6 +201,15 @@ #if 0 if (size == 0) Debugger("zero size malloc"); +#endif +#ifdef MALLOC_MAKE_FAILURES + if ((flags & M_NOWAIT) && (malloc_failure_rate != 0)) { + atomic_add_int(&malloc_nowait_count, 1); + if ((malloc_nowait_count % malloc_failure_rate) == 0) { + atomic_add_int(&malloc_failure_count, 1); + return (NULL); + } + } #endif if (flags & M_WAITOK) KASSERT(curthread->td_intr_nesting_level == 0, From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 11:51:11 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 69BE937B404 for ; Wed, 26 Mar 2003 11:51:11 -0800 (PST) Received: from dan.emsphone.com (dan.emsphone.com [199.67.51.101]) by mx1.FreeBSD.org (Postfix) with ESMTP id B229C43F3F for ; Wed, 26 Mar 2003 11:51:10 -0800 (PST) (envelope-from dan@dan.emsphone.com) Received: (from dan@localhost) by dan.emsphone.com (8.12.7/8.12.7) id h2QJp7ko021588; Wed, 26 Mar 2003 13:51:07 -0600 (CST) (envelope-from dan) Date: Wed, 26 Mar 2003 13:51:07 -0600 From: Dan Nelson To: Daniel Eischen Message-ID: <20030326195107.GB31787@dan.emsphone.com> References: <20030326193524.GA11320@dhcp01.pn.xcllnt.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-OS: FreeBSD 5.0-CURRENT X-message-flag: Outlook Error User-Agent: Mutt/1.5.4i X-Spam-Status: No, hits=-26.8 required=5.0 tests=AWL,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES, REPLY_WITH_QUOTES,USER_AGENT_MUTT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: kse@elischer.org cc: Marcel Moolenaar Subject: Re: Not providing static libraries (libkse/libpthread) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 19:51:12 -0000 In the last episode (Mar 26), Daniel Eischen said: > On Wed, 26 Mar 2003, Marcel Moolenaar wrote: > > For example, the access sequences generated by compilers for > > variables that have the __thread attribute do really suck for when > > code is to be generated for dynamic linking. The access sequences > > in the static case are superior. The performance gain is > > significant if one can build a complete multi-threaded application. > > Solaris and IRIX don't seem to provide static thread libraries. Does > anyone know if Linux does? Debian provides static versions: -rw-r--r-- 1 root root 81959 Feb 25 07:46 /lib/libpthread-0.10.so -rw-r--r-- 1 root root 97286 Feb 25 07:47 /usr/lib/libpthread.a As does Redhat 7.3: -rwxr-xr-x 1 root root 105945 Oct 10 09:51 /lib/libpthread-0.9.so* -rw-r--r-- 1 root root 118146 Oct 10 09:51 /usr/lib/libpthread.a -- Dan Nelson dnelson@allantgroup.com From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 11:51:25 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E115237B409 for ; Wed, 26 Mar 2003 11:51:25 -0800 (PST) Received: from ns1.xcllnt.net (209-128-86-226.bayarea.net [209.128.86.226]) by mx1.FreeBSD.org (Postfix) with ESMTP id 924C943FCB for ; Wed, 26 Mar 2003 11:51:23 -0800 (PST) (envelope-from marcel@xcllnt.net) Received: from dhcp01.pn.xcllnt.net (dhcp01.pn.xcllnt.net [192.168.4.201]) by ns1.xcllnt.net (8.12.8/8.12.8) with ESMTP id h2QJopKu025379; Wed, 26 Mar 2003 11:50:51 -0800 (PST) (envelope-from marcel@piii.pn.xcllnt.net) Received: from dhcp01.pn.xcllnt.net (localhost [127.0.0.1]) by dhcp01.pn.xcllnt.net (8.12.8/8.12.8) with ESMTP id h2QJopBm011499; Wed, 26 Mar 2003 11:50:51 -0800 (PST) (envelope-from marcel@dhcp01.pn.xcllnt.net) Received: (from marcel@localhost) by dhcp01.pn.xcllnt.net (8.12.8/8.12.8/Submit) id h2QJopmH011498; Wed, 26 Mar 2003 11:50:51 -0800 (PST) Date: Wed, 26 Mar 2003 11:50:51 -0800 From: Marcel Moolenaar To: Daniel Eischen Message-ID: <20030326195051.GB11320@dhcp01.pn.xcllnt.net> References: <20030326193524.GA11320@dhcp01.pn.xcllnt.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.3i X-Spam-Status: No, hits=-30.6 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,REFERENCES, REPLY_WITH_QUOTES,USER_AGENT_MUTT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: kse@elischer.org Subject: Re: Not providing static libraries (libkse/libpthread) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 19:51:28 -0000 On Wed, Mar 26, 2003 at 02:42:22PM -0500, Daniel Eischen wrote: > > Solaris and IRIX don't seem to provide static thread > libraries. Does anyone know if Linux does? That's because they have abandoned static libraries completely, if I'm not mistaken. Since we still link against archive libraries is general, our decision to drop an archive threads library can not really be based on that example alone. -- Marcel Moolenaar USPA: A-39004 marcel@xcllnt.net From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 12:06:34 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 01FD337B411 for ; Wed, 26 Mar 2003 12:06:34 -0800 (PST) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id F1B6143F75 for ; Wed, 26 Mar 2003 12:05:18 -0800 (PST) (envelope-from eischen@pcnet1.pcnet.com) Received: from pcnet1.pcnet.com (localhost [127.0.0.1]) by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h2QK5CBg013418; Wed, 26 Mar 2003 15:05:12 -0500 (EST) Received: from localhost (eischen@localhost)h2QK5BMJ013415; Wed, 26 Mar 2003 15:05:11 -0500 (EST) Date: Wed, 26 Mar 2003 15:05:11 -0500 (EST) From: Daniel Eischen To: Marcel Moolenaar In-Reply-To: <20030326195051.GB11320@dhcp01.pn.xcllnt.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-25.3 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REPLY_WITH_QUOTES,USER_AGENT_PINE autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: kse@elischer.org Subject: Re: Not providing static libraries (libkse/libpthread) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 20:06:37 -0000 On Wed, 26 Mar 2003, Marcel Moolenaar wrote: > On Wed, Mar 26, 2003 at 02:42:22PM -0500, Daniel Eischen wrote: > > > > Solaris and IRIX don't seem to provide static thread > > libraries. Does anyone know if Linux does? > > That's because they have abandoned static libraries completely, > if I'm not mistaken. Since we still link against archive libraries > is general, our decision to drop an archive threads library can > not really be based on that example alone. I don't think that's the case with Solaris. As of Solaris 9, there are 40-50 static libraries in /usr/lib: gpz [65] $ uname -a SunOS gpz 5.9 Generic sun4u sparc SUNW,Ultra-80 gpz [64] $ ls /usr/lib/lib*.a /usr/lib/lib300.a /usr/lib/libcurses.a /usr/lib/libnls.a /usr/lib/lib300s.a /usr/lib/libelf.a /usr/lib/libnsl.a /usr/lib/lib4014.a /usr/lib/libform.a /usr/lib/libpanel.a /usr/lib/lib450.a /usr/lib/libgen.a /usr/lib/libpkg.a /usr/lib/libTL.a /usr/lib/libgenIO.a /usr/lib/libplot.a /usr/lib/libadm.a /usr/lib/libintl.a /usr/lib/librac.a /usr/lib/libadt_jni.a /usr/lib/libl.a /usr/lib/librpcsvc.a /usr/lib/libbsdmalloc.a /usr/lib/libldfeature.a /usr/lib/libsec.a /usr/lib/libbsm.a /usr/lib/libm.a /usr/lib/libsocket.a /usr/lib/libc.a /usr/lib/libmail.a /usr/lib/libtermcap.a /usr/lib/libc2.a /usr/lib/libmalloc.a /usr/lib/libtermlib.a /usr/lib/libc2stubs.a /usr/lib/libmapmalloc.a /usr/lib/libvolmgt.a /usr/lib/libcmd.a /usr/lib/libmenu.a /usr/lib/libvt0.a /usr/lib/libcrypt.a /usr/lib/libmp.a /usr/lib/libw.a /usr/lib/libcrypt_i.a /usr/lib/libnisdb.a /usr/lib/liby.a gpz [68] $ ls /usr/lib/lib*thread* /usr/lib/libpthread.so /usr/lib/libthread.so /usr/lib/libthread_db.so /usr/lib/libpthread.so.1 /usr/lib/libthread.so.1 /usr/lib/libthread_db.so.1 IRIX also doesn't seem to provide static thread libraries. Just because Solaris and IRIX doesn't mean we shouldn't; I'm just using those as examples. -- Dan Eischen From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 12:30:13 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id CF1DC37B404 for ; Wed, 26 Mar 2003 12:30:13 -0800 (PST) Received: from harmony.village.org (rover.bsdimp.com [204.144.255.66]) by mx1.FreeBSD.org (Postfix) with ESMTP id 15C8543F85 for ; Wed, 26 Mar 2003 12:30:13 -0800 (PST) (envelope-from imp@harmony.village.org) Received: from harmony.village.org (localhost [127.0.0.1]) by harmony.village.org (8.12.8/8.12.3) with ESMTP id h2QKU6A7089578; Wed, 26 Mar 2003 13:30:06 -0700 (MST) (envelope-from imp@harmony.village.org) Message-Id: <200303262030.h2QKU6A7089578@harmony.village.org> To: Daniel Eischen In-reply-to: Your message of "Wed, 26 Mar 2003 13:51:39 EST." <3E81F6BB.BFFE3F33@vigrid.com> References: <3E81F6BB.BFFE3F33@vigrid.com> Date: Wed, 26 Mar 2003 13:30:06 -0700 From: Warner Losh X-Spam-Status: No, hits=-9.9 required=5.0 tests=IN_REP_TO,REFERENCES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: kse@elischer.org Subject: Re: Not providing static libraries (libkse/libpthread) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 20:30:14 -0000 In message <3E81F6BB.BFFE3F33@vigrid.com> Daniel Eischen writes: : Is there a good reason for providing static libraries for : libpthread/libkse? I'd like to not support them to get : rid of some hacks to make sure certain symbols are present : in the static library case. That would be a big hassle for the company I work for. We have many static binaries that are threaded and providing a dynamic one has a performance impact of a few percent. While we have done dynamic linking in the past, and have the infrastructure to do so in the future in our build process, this may cause us problems in the future if we need to deploy a static binary (which tends to be safer to do once a long period of time has passed between the generation of the system and the deployment of the updated binary). How gross are the hacks? Warner From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 12:30:57 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 658C637B404; Wed, 26 Mar 2003 12:30:57 -0800 (PST) Received: from sccrmhc02.attbi.com (sccrmhc02.attbi.com [204.127.202.62]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6B84543F93; Wed, 26 Mar 2003 12:30:56 -0800 (PST) (envelope-from julian@elischer.org) Received: from interjet.elischer.org (12-232-168-4.client.attbi.com[12.232.168.4]) by sccrmhc02.attbi.com (sccrmhc02) with ESMTP id <2003032620305400200jbp34e>; Wed, 26 Mar 2003 20:30:55 +0000 Received: from localhost (localhost.elischer.org [127.0.0.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id MAA52593; Wed, 26 Mar 2003 12:30:54 -0800 (PST) Date: Wed, 26 Mar 2003 12:30:52 -0800 (PST) From: Julian Elischer To: Jeff Roberson In-Reply-To: <20030326031245.O64602-100000@mail.chesapeake.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-24.9 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, QUOTE_TWICE_1,RCVD_IN_UNCONFIRMED_DSBL,REPLY_WITH_QUOTES, USER_AGENT_PINE autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: kse@elischer.org Subject: Re: 1:1 Threading implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 20:30:59 -0000 On Wed, 26 Mar 2003, Jeff Roberson wrote: > I don't understand? There are relatively minor changes to the kernel to > support this. Since nice is a property of the process, it makes sense > that there is only one ksegrp per process. I'm starting to think that the > ksegrp was overkill in general. > Instead of making a new KSE for each thread (and thereby blowing out the code that expects that NKSE <= NCPU per KSEGRP, allocate each KSE in a new KSEGRP. The overhead is not that much and you will keep NKSE/KSEGRP <= NCPU. You will also be able to support system scope threads with differnet priorities, which is a Posix requirement. It is the equivalent of adding teh "NEWKSEGRP" flag to each thread creation call, and should have no other ramifications. It will also mean that you can se this system if people add a scheduler that doesn't have KSEs (as discussed previously) (by durectly scheduling threads). > > Specifically since My plan is to make the "KSE' structure go away.. > > (by which I mean it is only going to be visible within the particular > > thread_scheduler that uses it and that externally > > the only structures visible would be : > > proc, ksegrp(subproc?) thread and upcall. > > For M:N I really think this should be proc, thread, and upcall. > For 1:1 I only need proc and thread. For your definition, define "1 thread" as: "A thread with an attached KSE and KSEGRP" instead of: "A thread with an attached KSE" The logic will be very similar but you will get better functionality by being able to give different threads different priorities etc. The KSEGRP structure is small. You will not lose much by doing this.. This is how a 1:1 scheme was envisioned and the items in the different substructures were distributed to work best in this way.. > > > The KSE would be allocated only by a call into the scheduler and is part > > of the "scheduler specific private data". > > > > i.e. on creation of a new process, shced_newproc() is called > > and a KSE is added in there is the scheduler in question wants to use > > KSEs. If it doesn't, no KSE would be added, but it's still possible that > > Yes, I think we need more sched hooks here as well. Having only > sched_fork() makes things sort of gross. We'll have to hook this all up > later. I'll try get it hooked up "sooner rather than later". I think you can make 1:1 threads in the current system by doing: kse_create(mbox, NEWGROUP); where the mbox points to the function you want to run and a new stack. The function just runs as normal, not knowing that it is atually a UTS thread. Since it never yields to another thread (by KSE terms) it never does any upcalls an voila.. 1:1 threads. (I am sugesting that we don't need a new syscall to do this, or, at most a new entrypoint which ends up calling much of the same code.) Ok but htis breaks things for M:N threads as in M:N threads, teh mask would be stored "per process" (or at most per group) and the mask is the "logical OR" of all the masks for the threads in the group/process. Having a mask per thread and not having one for the bigger unit means that the masks for the threads must be updated regularly (maybe at every kernel entry) to be the OR of the masks for ALL THE USER THREADS, which means that the UTS must do this explicitly. I'm not thrilled by all the extra work this is going to make for M:N threads. (Well at least this is my preliminary reading of it.) > > > - if (p1->p_flag & P_THREADED) { > > + if (p1->p_flag & P_THREADED || p1->p_numthreads > 1) { > > > > If you are running threads, please set the P_THREADED flag. > > if you wnat do differentiate between upcalling threads and 1:1 > > threads, please use some auxhilliary flag. > > I'd rather not have a flag. The > 1 check is used only in places where we > have to suspend multiple threads or go to single threading etc. Processes > in the 1:1 threading model aren't so special as they are with KSE. They > don't need to be treated specially except when we're trying to funnel them > down etc. Ok, well we'll see with time how it works out and if it is ok, that;s fine.. If it needs work we can do it then.. this will do for now. > > > You should be creating a new KSEGRP (subproc) per thread. > > I think you will find that if you do, things will fall out easier > > and you won't break the next KSE changes. > > I don't understand what I may break? You are allocating a thread and a KSE.. KSEs may go away (from being visible to you). If you are referencing them then things will break. > > I don't think M:N is the way to go. After looking things over and > considering where it is theoretically faster I do not think it is a > worthwhile pursuit. I agree with yo an may ways, and I think that given teh choice I'd run KSEs in "P:Q" mode.. (where we don't multiplex any sleeping threads and have effectively one kernel thread per sleeping thread". However M:N threads has one advantage. That is where peopel use the programming model that makes a thread for every object in a program. This scheme can lead to tens of thousands of small threads in userland. effectively you do NOT want those to all be kernel threads. Tere are languages and libraries that promote such programming models. effectively each object in teh program is an independent intelligent entity with its own stack and such.. I would like to be able to support this. > > First off, it is many months away from being even beta quality. I think > the UTS is far more complicated than you may realize. There are all sorts > of synchronization issues that it was able to avoid before since only one > thread could run at any time and there essentially was no preemption. It > now also has to deal with effecient scheduling decisions in a M:N model > that it didn't have to worry about before. I'm not sure that teh issues there are as bad as you think. > > Aside from that, there are numerous problems with the kernel not being > able to identify individual threads of execution. Debugging, scheduling, > profiling, ktrace are all more difficult in a m:n environment. I think it > is going to contribute to less effecient scheduling decisions over all. I > have already wrestled with this in ULE. You are right about some parts of this, but it IS possible to do these things. > > I feel that this is an overwhelming amount of complexity. Because of this > it will be buggy. Sun claims that they still have open tickets on their > M:N while their new 1:1 implementation is totally bug free. How long have > they been doing m:n? I don't think that with our limited resources we're > going to be able to do better. I think that the complexity of the KSE M:N model is a lot less that what sun did. > > Furthermore, m:n's basic advantage is less overhead from staying out of > the kernel. Also, less per thread resources. I think this is bogus for a > couple of reasons. It is bogus for a particular class of threaded applications and true for a particular class of threaded apps. > > First, if your application has more threads than cpus it is written > incorrectly. Not neccesarily. that's just one way of looking at threads. Active component threaded programs use threads as a programming model (see above) and it is a perfectly valid way of writing a program. Remember.. "Ours is not to specify how a programmer writes, but to allow the programmer to have a s wide a choice as possible about what he wants to do." > For people who are doing thread pools instead of event > driven IO models they will encounter the same overhead with M:N as 1:1. True. That is one model of threading..B In an IO bound app all waiting threads will have kernel contexts so in effect it approaches 1:1 > I'm not sure what applications are entirely compute and have more threads > than cpus. These are the only ones which really theoretically benefit. I > don't think our threading model should be designed to optimize poorly > thought out applications. As I said. there are people who like this method of programming. I don't want to have to say "we only support model A of thread programming, and if you want model B, it'll really suck." (This is not saying we shouldn;t have a 1:1 library available.. it's a good idea). > Furthermore, the amount of work done per slice has been growing with > processor speeds. Slice time is adjusted for user experience and so it > remains constant. This means that the constraints are different from when > this architecture started to come about many (10 or so?) years ago. > Trying to optimize context switches between threads just doesn't make > sense when you do so much work per slice. This is a good argumant, but it still doesn;t make sence to have 10,000 threads all in the kernel. > > Then if you look at the number of system calls and shenanigans a UTS must > do to make proper scheduling decisions it doesn't look like such an > advantage. I feel that the overhead of all the layers comes close to the > savings from doing some of it without entering the kernel. So far it's not doing that much.. From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 12:31:27 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D3CD737B404 for ; Wed, 26 Mar 2003 12:31:27 -0800 (PST) Received: from ns1.xcllnt.net (209-128-86-226.bayarea.net [209.128.86.226]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0E4C543F93 for ; Wed, 26 Mar 2003 12:31:27 -0800 (PST) (envelope-from marcel@xcllnt.net) Received: from dhcp01.pn.xcllnt.net (dhcp01.pn.xcllnt.net [192.168.4.201]) by ns1.xcllnt.net (8.12.8/8.12.8) with ESMTP id h2QKUsKu025573; Wed, 26 Mar 2003 12:30:54 -0800 (PST) (envelope-from marcel@piii.pn.xcllnt.net) Received: from dhcp01.pn.xcllnt.net (localhost [127.0.0.1]) by dhcp01.pn.xcllnt.net (8.12.8/8.12.8) with ESMTP id h2QKUsBm011609; Wed, 26 Mar 2003 12:30:54 -0800 (PST) (envelope-from marcel@dhcp01.pn.xcllnt.net) Received: (from marcel@localhost) by dhcp01.pn.xcllnt.net (8.12.8/8.12.8/Submit) id h2QKUsbL011608; Wed, 26 Mar 2003 12:30:54 -0800 (PST) Date: Wed, 26 Mar 2003 12:30:54 -0800 From: Marcel Moolenaar To: Daniel Eischen Message-ID: <20030326203054.GC11320@dhcp01.pn.xcllnt.net> References: <20030326195051.GB11320@dhcp01.pn.xcllnt.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.3i X-Spam-Status: No, hits=-30.4 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,REFERENCES, REPLY_WITH_QUOTES,USER_AGENT_MUTT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: kse@elischer.org Subject: Re: Not providing static libraries (libkse/libpthread) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 20:31:28 -0000 On Wed, Mar 26, 2003 at 03:05:11PM -0500, Daniel Eischen wrote: > > Just because Solaris and IRIX doesn't mean we shouldn't; > I'm just using those as examples. My point really is that if you have good reasons (good reasons for us) to drop the archive threads library then you should go for it. Precedence is a good way to make your case, but what applies in those cases may not apply to us, so what may have been good reasons for them may not be good reasons for us. Thus, you have to know (roughly) why they have dropped the archive library if you want to use them as examples. Just stating that it isn't there may just as well mean that it hasn't been installed (or bought), not that they don't have it. I know HP doesn't have it, but they dropped archive libraries completely. And as far as I know they followed Sun's example (as they so often do). Old archive libraries may still be provided for backward compatibility though... -- Marcel Moolenaar USPA: A-39004 marcel@xcllnt.net From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 12:55:42 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 98E4C37B404 for ; Wed, 26 Mar 2003 12:55:42 -0800 (PST) Received: from bluejay.mail.pas.earthlink.net (bluejay.mail.pas.earthlink.net [207.217.120.218]) by mx1.FreeBSD.org (Postfix) with ESMTP id 656E143F75 for ; Wed, 26 Mar 2003 12:55:41 -0800 (PST) (envelope-from tlambert2@mindspring.com) Received: from pool0166.cvx22-bradley.dialup.earthlink.net ([209.179.198.166] helo=mindspring.com) by bluejay.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128) (Exim 3.33 #1) id 18yHvf-00054D-00; Wed, 26 Mar 2003 12:55:24 -0800 Message-ID: <3E821365.6B036B0D@mindspring.com> Date: Wed, 26 Mar 2003 12:53:57 -0800 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Jeff Roberson References: <20030326053115.T64602-100000@mail.chesapeake.net> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a4cde4d4298c38105b3802a607afd9c067a2d4e88014a4647c350badd9bab72f9c350badd9bab72f9c X-Spam-Status: No, hits=-21.4 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,QUOTED_EMAIL_TEXT,QUOTE_TWICE_1, RCVD_IN_OSIRUSOFT_COM,REFERENCES,REPLY_WITH_QUOTES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: kse@elischer.org cc: Julian Elischer Subject: Re: 1:1 Threading implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 20:55:46 -0000 Jeff Roberson wrote: > On Wed, 26 Mar 2003, Terry Lambert wrote: > > Jeff Roberson wrote: > > > Well, I wasn't doing userland stuff until three days ago. I think mini > > > has just been very busy with work. I suspect that you're going to need > > > to start doing userland work or find someone to do it if you want to get > > > it done soon. > > > > In theory, the library API will be identical to the pthreads > > standard, and will require no changes to programs written to > > that standard. Most threaded programs these days are written > > to the standard. Some threaded programs make invalid assumptions > > about rescheduling following an involuntary context switch, or > > ability to make particular blocking calls. > > I'm not sure what API compatibility has to do with anything? The API is the only thing that matters about userland work, apart from side effects of assumptions which end up visible to users of the API, due to it being non-reflexive, of course. In other words, your suspicion about the userland work is incorrect. > > The first will still be a problem (e.g. Netscape's Java/JavaScript > > GIF rendering engine is probably still serializing requests due to > > non-thread reentrancy). > > > > The second should not be an issue, either with your implementation > > of the 1:1, or Jon Mini's implemenetation of the N:M model. > > I'm not sure I know what you're talking about. Blocking calls are either > handled by an upcall in M:N or by having independent contexts in 1:1. The current FreeBSD user space pthreads implementation causes versions of Netscape that do not intentionally serialize GIF renderings in Java/JavaScript UI's, and which contain multiple images, to fail catastrophically, if the mouse was moved over the GIF during loading. This was true for some implementations of Slashdot, and it was also true for the InterJet UI, as of version 3.x of WhistleWare (based on FreeBSD 3.x), until specific code was included in the UI to delay GIF request processing so as to serialize requests. The specific problem, which I suspected from symptoms, and later confirmed by decompiling the Netscape code in question was that the rendering engine was non thread-reentrant, and was making an assumption about scheduler behaviour that was not warranted by anything other than a kernel threading implementation that would guarantee resumption of the previously preempted thread. This assumption was true on Windows, true on Linux, true on Solaris, but false on FreeBSD and false on MacOS 9. And, in fact, we saw UI crashes on FreeBSD and Mac OS 9, which did not occur on other platforms. The problem is in the POSIX interface not enforcing against people making unwarranted assumptions in their code which uses the POSIX interface. Therefore, it is possible that some bugs in vendor code will be revealed by differences in implementation. The Netscape 4.x GIF renderer is one such piece of code. For the record, blocking calls can be handled in the 1:1 case through upcalls, as well. It's an implementation detail that's irrelevent to the API that gets exposed. > > > > Wouldn't it have been easier to have one KSEGRP+KSE+thread per user > > > > thread? Having one ksegrp and many KSEs requires changing the kernel > > > > code where doing it the other way you could do it without making any > > > > changes. > > > > > > I don't understand? There are relatively minor changes to the kernel to > > > support this. Since nice is a property of the process, it makes sense > > > that there is only one ksegrp per process. I'm starting to think that the > > > ksegrp was overkill in general. > > > > The KSEGRP is, effectively, a virtual processor interface, and > > was/is intended for use by the scheduler to ensure CPU affinity > > for individual threads, and CPU negaffinity for multiple threads > > within a process. In other words, according to the published > > design documents, it's a scheduler artifact. > > This is the KSE, not the KSE group. The KSE Group was intended to allow > multiple groups of threads with different scheduling algorithms or > different base priorities (nice). This is *one* of a number of *important* effects of the KSEGRP; the other effects are subtle, but are no less important. > > Personally, I've never seen the need for virtual processors, but > > then I've always advocated "intentional start/intentional migration" > > for the scheduler model (one of my arguments for a per CPU migration > > queue, and a push-model, rather than a pull-model for redistribution > > of an unbalanced load). > > The push model suffers from as much as a one tick latency in any > migration. In many cases probably more than that. You've made this statement before, and then never defended it. I think you mean that it has a latency of .5 of the average of *quantum used*, +/- .5, in the worst case scenario. That happens when you push a process to another CPU, and it doesn't notice until the next context switch (which is *rarely* a full quantum, and is *usually* a fraction of a quantum). This is because you are incorrectly pushing the process at the head of the run queue. This will, indeed, introduce the latency you suggest (a smaller latency than you imply, BTW, and probably ignorable, in fact). HOWEVER. The *correct* implementation will push the *second from the head*, and then schedule the head to run. This will ensure that the latency is no more than 1 full quantum *for a thread that would not run for a maximum of 1 full quantum. In addition, since it is being pushed to the least loaded CPU, and load is a measurement of number of items pending on the ready-to-run queue, I would argue that, in fact, the push model results in *significantly reduced* latency, on average, compared to the pull model. In addition, it eliminates all scheduler lock contention in the common case, which is the non-migration case. BTW, with a *pull* model, which requires an average of 1.5 lock contentions in order to accomplish a context switch for an individual CPU in a 2 CPU system, on a 3GHz processor with a 433MHz front size bus, this equates to ~7 clock cycles worth of stall barrier per lock contention, or 10.5 clocks per scheduler lock acquisition *by its own CPU*, even if you decide not to migrate *anything*. That *assumes* that all locks are allocated so as to fall on cache line boundaries, which FreeBSD *FAILS* to do. > The overhead from locking queues is far out weighed by the > cases where your cpu is sitting idle due to an unbalanced load. I suggest your numbers are in error. Please examine the papers on this topic in the book: Scheduling and Load Balancing in Parallel and Distributed Systems Behrooz A. Shirazi IEEE Computer Society ISBN: 0818665874 It is a compilation of IEEE papers on the topic, and contains dozens of papers with statistics that refute your claims for shared memory multiprocessors. And most of these papers assume the clock multiplier is very small, if it exists at all. On modern systems, the amount of overhead in their statistics for locking should be multiplied considerably, since the clock multiplier is much higher than it was. > The latency is one tick because each cpu would have to poll the load of > other cpus at some interval to discover that it is out of balance. Or it > could check if it had more than one process on the run queue, which seems > a bit silly. Regardless, you're probably only going to get to make this > decisions once a tick which means the other cpu(s) can sit idle for at > least that long. I have one question: how did you get into this unbalanced load situation, where you have 100 processes ready to run on one CPU, and 0 processes ready to run on the remaining 3 CPUs? I argue that this situation might in fact be an initial state, but the steady state over time would be to evenly distribute work, over time. In other words, your example refers to what's called a "flash crowd" case -- roughly equivalent to a fork-bomb. And I have already stated, at least for the thread creation case, that I support "intentional start", where the CPU you pick to put an initial new thread on, is based on the load. So I do not understand how this situation could arise, other than in laboratory conditions. > Consider a buildworld -j8. You have many processes rapidly stoping and > starting. Without a pull a cpu that was very loaded could suddenly end up > with no running processes and have to idle until the other gave it work. > This imbalance is likely to go back in forth, I have observed this > personally when writing ULE. The rapidity of the start/stop is irrelevent. The instantaneous load at the time of the next start *is* relevent. I would argue that steady-state performance is more important; further, I would suggest that, in using "make -j#", that you select "#" to be a factor of 3 or more larger than the number of available CPUs, to ensure correct hysteresis. If this still doesn't fix your problem, I would suggest that the duration of your quantum (lbolt value) is too large, compared to how long processing actually occurs, before it hits a blocking sleep call. > I think you need both push and pull. The pull satisfies the case where > you have short lived but rapidly reappearing processes. The push > solves more long term load imbalance issues. If you have, for example, > many apache processes that are very busy. no cpu will go idle, so pull is > ineffective, but they may still be imbalanced. This is still missing from > ULE. I don't think you can implement pull without locking your own queue in order to access it. I *know* you can implement push without locking your own queue, *ever*, and then only deal with the locking of a per CPU auxillary queue when you decide you have to migrate a process. I argue that this should occur only *rarely*: it is not the common case. Further, I don't think you can implement both push an pull in the same implementation, reasonably. The problem comes down to whether or not you engage in the examination of another CPUs scheduling queue. If you do this, you have to lock, and you end up stalling both CPUs in order to do this. This is a factor of 2 multiplation, minimally, on the stall, and is probably a heck of a lot more, in FreeBSD, since all the other CPUs are doing the same thing, and you have L1 and L2 cache flushes and TLB shootdowns, etc., as a result. > > > In a scheduler model where a sheduler *pulls* work, either from > > another CPU ready-to-run queue, or from a single ready-to-run > > queue that is global to the system (in either case, requiring > > locks in the scheduler path, potentially highly contended locks), > > the idea of a KSEGRP/"virtual processor" is necessary for globally > > migratable and contendable "bookkeeping" objects. > > They should only be contended when cpus have nothing to do. A worthwhile > tradeoff I'd say. Define "nothing to do". The cached lock structure gets zapped in all other processes which have it read-caches, as soon as it's written by any CPU to acquire the lock. Minimally, it's reread, as necessary, from the L2 cache (the last operation is a write to release the lock). Worst case, it's main memory, and your stall goes up by a factor of 4. It's clear to me that shared memoy SMP systems with large clock multipliers *must* pretend that they are distinct NUMA CPUs, as much as possible, in order to avoid stall barriers. BTW: the pull model does not work for NUMA systems, since the memory you are attempting to examine is non-local, and a distributed cache coherency and messaging protocol must be used to get the data -- if it's available at all. So FreeBSD SMP is screwed from ever running on 64 processor SPARC boxes (for example), if it uses the pull model. The push model, on the other hand, can message the process into the queue of the target CPU, using the built-in hardware messaging mechanism (a cooperative transfer of the image has to happen as a result of the message, but that particular overhead is largely avoidable, using state synchronization via swap, and latency can be further reduced through checkpointing). > > So in the current scheduler implementations, KSEGRP is necessary; > > in the 1:1 model, it's necessary, if only to ensure negaffinity > > (4 CPU system, process with 4 threads, ensure each thread gets its > > own CPU, and does not migrate away from it). > > You're talking about the KSE again. I think CPU affinity has little to do > with the M:N or 1:1 choice except that it is much more difficult to > achieve CPU affinity when you have to make a multitiered scheduling > decision. To get real affinity in M:N you need kse to cpu affinity and > thread to kse affinity. You also the need userland thread to kernel > thread affinity, or at least user land thread to KSE affinity. What do you think is on the scheduler queue or the wait queue, if it's not a KSE? There's no such thing as a thread, distinct from the context in which it exists. > > You could also take this idea much further. Specifically, SVR4 > > flags system calls as "non-blocking", "blocking", and "potentially > > blocking". By doing this, they can lazy-bind context creation for > > blocking operations on "blocking" and "potentially blocking" calls, > > and avoid it altogether on "non-blocking" and sometimes avoid it on > > "potentially blocking" calls. > > KSE already does better than this by only creating a new context when you > actually block. The upcall mechanism specifically addresses that need. > This is seperate from what we were discussing above which is allowing the > scheduler to have a chance to initialize data when a new context is > created. The point is that there is "low hanging fruit". By knowing up front that there is no chance of blocking, you can play "fast and loose". It seems to me from watching the -CURRENT code, that people can't decide if they are grabbing locks to protect data objects, or locks to protect code paths. This resolves a lot of the redundant locking that happens by giving only a single rule of thumb, and a place where it can be ignored. > > This can result in a significant overhead savings, if the kernel > > implementation evolves, but the user space implementation remains > > fixed. > > > > It's good to decouple these things from each other (IMO). > > Which things? The idea of kernel entrancy, and the continued need for a context which can be put on a sleep queue vs. put on a scheduler queue. That's not distinct in the current implementation. In fact, the same list element pointer in the same structure is used to link both lists. > > Everyone does this. Novell did it back in 1993. Sun's turnstiles > > are based on the tradeoff between spinning and waiting, and how > > many times you have to do that before it's worth crossing the > > protection domain, and blocking. > > I think you mean sun's adaptive mutexes. The turnstile is just the > queue that you block on if I'm remembering correctly. The blocking queue > I used for umtx is a similar context where the queue migrates among the > blocking threads. Yes, adaptive mutexes, sorry. > > When we did this in 1993 (Novell's implementation was primarily > > by Dave Hefner, who now works for Microsoft, I believe), we ended > Any relation to hugh? He hates that. 8-). > > > > My only comment is that since mini is supposed to be doing the > > > > M:N library, isn't this a bit of a distraction? > > > > > > I'll let him comment on this. > > > > I'll stick my nose in: I think it's a good idea, since TPTB have > > recently made noises on a couple of FreeBSD lists about "rapidly > > approaching deadlines for the KSE work". > > > > Consider it insurance on your investment, people. > > Yes, it isn't necessarily a KSE replacement. But maybe it is, and will be for 6 months, or a year, if it uses the same kernel mechanisms for its implementation. That's why Julian's comments about the kernel changes are important. Note: I'm not saying they aren't actually necessary, only that they merit discussion. So far, the justifications you've offered all revolve around your percieved irrelevancy of KSEGRP seperate from process, as a container object. This is true in ULE, as you've implemented it so far, but it's probably not true, overall. > > There is also the fact that affinity and quantum are very hard to > > maintain on a system with a heterogeneous load. In other words, > > 1:1 looks good if the only thing you are running is a single > > multithreaded proces, but looks a *lot* less good when you start > > running real-world code instead of fictitious benchmarks that > > try to make your threading look good (e.g. measuring only thread > > context switches, with no process context switch stall barriers, > > etc.). > > Yes, I see what you're getting at. M:N allows you to keep running until > you've exhausted your whole slice by selecting another thread. You could > acomplish this in 1:1 by loaning your slice to the next available thread > that was bound to the same cpu and force a switch to that. That's a neat > idea. I'll have to look into this for ule. It's hard to do correctly in the kernel, because the scheduler that's making the decision has to either support a variable quantum granularity (I've seen it implemented that way before, but it's ugly), or it has to try and make "fairness" decisions that it's not in a position to make. For example, a thread calls and gives up it's quantum, and then other threads in the same process run, because you're not out of quantum, and then the first threads wait condition is satisfied: who do you schedule first? The answer has to be a PTHREAD_SCOPE_PROCESS prioritization policy. 8-(. > > I can tell you from personal experience with such a model, that > > it *VASTLY* outperforms a 1:1 kernel threading model, even if you > > end up running multiple state-machine instances on multiple CPUs. > > We got more than a 120X increase in NetWare for UNIX, simply by > > changing the client dispatch streams MUX to dispatch to worker > > processes instead of threads, in LIFO instead of FIFO order, > > simply because it ensured that the process pages you cared about > > were more likely to be in core. > > Yeah, the LIFO trick is widely used. I believe apache does something of > this sort. It's also discussed on the c10k problem page. I'm not sure > why you got better perf out of processes than threads though. This is > sort of confusing. I could avoid competing with other processes in the system for scheduler quantum, and overall scheduler usage, and system time, as a result, were reduced. This was partially a result of the "quantum lending" I spoke of; it was actually called "It's my damn quantum!" in the presentation we made. 8-). The idea is that if the system gives me a quantum to use... it's my damn quantum! And I should not have to sacrifice it, merely because I have a single context out of many that wants to make a call that would block. By using this approach, if you are running heterogeneous processes, using 1/16th of your quantum doesn't result in you paying a complete context switch overhead for having all your threads compete with, say, "cron", running once a second -- if you lose, you pay a full context switch overhead. The kernel boundary crossing is also very expensive in SVR4; FreeBSD has reduced this somewhat, but it's still pretty far behind Linux, in this regard, so it's not as cheap to switch threads in kernel space as in user space. It's not under Linux, either, but they only every benchmark homogenous threads in a single application on a relatively quiescent system. There are lies, damn lies, and statistics... then, there's benchmarks. > > 1:1 threading is useful for one thing, and one thing only: SMP > > scalability of single image processes. And it's not the best at > > doing that. > > It's also good at providing extra contexts to block on for IO > worker threads. So's AIO, and it works more efficiently. So does kqueue, for that matter. > Furthermore, It's really good at being implemented quickly, > which is especially important considering that it's 2003 and we > don't have kernel supported threads... OK, can't aregue with that one. It's one of the reasons I liked that you did your implementation in the first place. 8-). > > > Furthermore, m:n's basic advantage is less overhead from staying out of > > > the kernel. > > > > No, actually, it's the ability to fully utilize a quantum, and > > to not have to make a decision between one of your own threads > > and some other process, off the run queue, when making a decision > > in the scheduler about what to run next. > > Yeah, I just remembered this bit. See my answer above. I think I'll do > this trick in ULE. Good luck... it's very hard to do in a kernel scheduler, without overly complicating things, I'm afraid. > > > For people who are doing thread pools instead of event driven IO > > > models they will encounter the same overhead with M:N as 1:1. > > > > This is actually false. In 1:1, your thread competes with all > > other processes, in order to be the next at the top of the run > > queue. Statitically, you are doing more TLB flushes and shootdowns, > > and more L1 and L2 cache chootdowns, than you would otherwise. > > This is the same argument about using your whole slice eh? It's the inverse. It's what gives the lie to most "benchmarks", and why, if you are running a web server with CGIs, you get much more terrible performance than your threads people said youw would get. 8-). > > Solving this problem without intentional scheduling has been > > proben to be N-P incomplete: it is not a problem which is > > solvable in polonomyial time. > > eh? Which problem is NP? Solving the "Who do I run next to balance saving context switches vs. fairness?", if you treat each voluntary context switch as a restart of the timer until the next involuntary context switch. Even lending is hard, once you get into the timer code and see the evil things it does to get the lbolt clock, and the timer optimizations on system call exit. 8-(. But at least it's not NP incomplete. 8-). > > > I'm not sure what applications are entirely compute and have more threads > > > than cpus. These are the only ones which really theoretically benefit. I > > > don't think our threading model should be designed to optimize poorly > > > thought out applications. > > > > By that argument, threads should not be supported at all... 8-) 8-). > > I meant to say 'entirely compute bound'. If you just want CPU and no IO > then you probably only want as many threads as processors. This is the > most effecient arrangement. I'm not arguing against threads although I do > think they are often abused. If the intent is optimization, the answer is never threads; that was my point. We would be teaching people to build finite state automata, instead, and managing their own contexts. I would even argue that the code you get was better, since it would ensure all your per session state never ended up in globals. 8-) 8-). > > But by your same arguments, CPU clock multipliers have grown > > to the point that memory bus and I/O bus stalls are so > > expensive that SMP makes no sense. > > I migh agree with you there. Yeah, they've pissed me off, ever since my 486DX-50 (*not* DX/2-50!). 8-). > > > Then if you look at the number of system calls and shenanigans a UTS must > > > do to make proper scheduling decisions it doesn't look like such an > > > advantage. [ ... ] > > I think the kernel boundary crossing overhead, and the fact > > that, in doing so, you tend to relinquish a significant > > fraction of remaining quantum (by your own arguments) says > > that protection domain crossings are to be avoided at all costs. > > Yes, I agree, and without serious tweaking our current M:N significantly > increases the number of system calls. Yes. The signal masking is particular heinous. I don't know what to do about it. 8-(. My gut reaction is "BSD signals"; before all this POSIX crap turned BSD into SVR3, interrupted system calls restarted by default. There's a nice threads package from ~1988 that used this fact, called "sigsched"; it's in the comp.sources.unix archives. Doesn't work any more, unless you call siginterrupt() and then avoid POSIX signal interfaces. 8-(. > > I'm glad you pursued it, even though I do not agree with your > > reasoning on the value of N:M vs. 1:1. I view it as "life > > insurance" for the KSE code, which some people might be > > otherwise tempted to rip out over some arbitrary deadline. > > > > Thank you for your work here, and thank everyone else for > > their work, too. > > Thanks for the feedback. It has been stimulating. I still need to > consider multithreading implications of 1:1 for ULE. This has given me a > bit more to work on there. I wish you had been at the original SMP meetings with Jason Evans, Matt Dillon, and the 50+ other folks who showed up each time; it would be a lot easier if everyone had the same context. 8-(. In the quantum lending, be sure that you look carefully at the involuntary context switch timer, and when it gets reset. It's scary in there. 8-). -- Terry From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 12:58:13 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8E2B537B404 for ; Wed, 26 Mar 2003 12:58:13 -0800 (PST) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id B0EFF43F75 for ; Wed, 26 Mar 2003 12:58:12 -0800 (PST) (envelope-from eischen@pcnet1.pcnet.com) Received: from pcnet1.pcnet.com (localhost [127.0.0.1]) by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h2QKw4Bg023185; Wed, 26 Mar 2003 15:58:04 -0500 (EST) Received: from localhost (eischen@localhost)h2QKw3Ck023180; Wed, 26 Mar 2003 15:58:03 -0500 (EST) Date: Wed, 26 Mar 2003 15:58:03 -0500 (EST) From: Daniel Eischen To: Julian Elischer In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-25.6 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, QUOTE_TWICE_1,REPLY_WITH_QUOTES,USER_AGENT_PINE autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: kse@elischer.org cc: arch@freebsd.org Subject: Re: 1:1 Threading implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 20:58:15 -0000 On Wed, 26 Mar 2003, Julian Elischer wrote: > > On Wed, 26 Mar 2003, Jeff Roberson wrote: > > > > > > i.e. on creation of a new process, shced_newproc() is called > > > and a KSE is added in there is the scheduler in question wants to use > > > KSEs. If it doesn't, no KSE would be added, but it's still possible that > > > > Yes, I think we need more sched hooks here as well. Having only > > sched_fork() makes things sort of gross. We'll have to hook this all up > > later. > > I'll try get it hooked up "sooner rather than later". > I think you can make 1:1 threads in the current system by doing: > > kse_create(mbox, NEWGROUP); where the mbox points to the function you > want to run and a new stack. The function just runs as normal, not > knowing that it is atually a UTS thread. Since it never yields to > another thread (by KSE terms) it never does any upcalls an voila.. 1:1 > threads. (I am sugesting that we don't need a new syscall to do this, > or, at most a new entrypoint which ends up calling much of the same > code.) Right. And if you translate this into the M:N library, you just create your threads with PTHREAD_SCOPE_SYSTEM. One of my unvoiced thoughts was that we could add a flag or two to the KSE mailbox so that a scope system thread doesn't need a separate stack. Once one of these KSEs (thread actually) blocks in the kernel, it stays there, BUT, it can still awake from kse_thr_interrupt, kse_release, etc, just that instead of an upcall it just returns normally from those calls. In this way, scope system threads can be very low overhead and not need to enter the UTS scheduler, yet they can still coexist with scope process threads. > Ok but htis breaks things for M:N threads as in M:N threads, teh mask > would be stored "per process" (or at most per group) and the mask is the > "logical OR" of all the masks for the threads in the group/process. > Having a mask per thread and not having one for the bigger unit > means that the masks for the threads must be updated regularly > (maybe at every kernel entry) to be the OR of the masks for ALL THE USER > THREADS, which means that the UTS must do this explicitly. > I'm not thrilled by all the extra work this is going to make for M:N > threads. (Well at least this is my preliminary reading of it.) No, please don't make the UTS deal with this, if that's the case. > > > > First off, it is many months away from being even beta quality. I think > > the UTS is far more complicated than you may realize. There are all sorts > > of synchronization issues that it was able to avoid before since only one > > thread could run at any time and there essentially was no preemption. It > > now also has to deal with effecient scheduling decisions in a M:N model > > that it didn't have to worry about before. > > I'm not sure that teh issues there are as bad as you think. I don't think it is as bad as that either. The complexity is on par with that of libc_r. > > > > Then if you look at the number of system calls and shenanigans a UTS must > > do to make proper scheduling decisions it doesn't look like such an > > advantage. I feel that the overhead of all the layers comes close to the > > savings from doing some of it without entering the kernel. > > So far it's not doing that much.. Yeah, I don't understand the above statement either. It's lower overhead than libc_r. The only system calls it should be making is to kse_release() when it has no more work to do (no runnable threads) or possibly to kse_thr_wakeup() if it has to dispatch signals to threads blocked in the kernel. Time comes from the mailbox so we don't even need to get the time of day. The interfaces were designed so that we _wouldn't_ have much syscall overhead. -- Dan Eischen From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 13:04:50 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B2E1037B404 for ; Wed, 26 Mar 2003 13:04:50 -0800 (PST) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 083A143F75 for ; Wed, 26 Mar 2003 13:04:50 -0800 (PST) (envelope-from eischen@pcnet1.pcnet.com) Received: from pcnet1.pcnet.com (localhost [127.0.0.1]) by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h2QL4hBg024111; Wed, 26 Mar 2003 16:04:43 -0500 (EST) Received: from localhost (eischen@localhost)h2QL4gpQ024108; Wed, 26 Mar 2003 16:04:42 -0500 (EST) Date: Wed, 26 Mar 2003 16:04:42 -0500 (EST) From: Daniel Eischen To: Warner Losh In-Reply-To: <200303262030.h2QKU6A7089578@harmony.village.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-25.4 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REPLY_WITH_QUOTES,USER_AGENT_PINE autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: kse@elischer.org Subject: Re: Not providing static libraries (libkse/libpthread) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 21:04:51 -0000 On Wed, 26 Mar 2003, Warner Losh wrote: > In message <3E81F6BB.BFFE3F33@vigrid.com> Daniel Eischen writes: > : Is there a good reason for providing static libraries for > : libpthread/libkse? I'd like to not support them to get > : rid of some hacks to make sure certain symbols are present > : in the static library case. > > That would be a big hassle for the company I work for. We have many > static binaries that are threaded and providing a dynamic one has a > performance impact of a few percent. While we have done dynamic > linking in the past, and have the infrastructure to do so in the > future in our build process, this may cause us problems in the future > if we need to deploy a static binary (which tends to be safer to do > once a long period of time has passed between the generation of the > system and the deployment of the updated binary). > > How gross are the hacks? See libc_r/uthread/uthread_init.c (references[] and libgcc_references[]). Also, in a lot of functions, there are: if (_thread_initial == NULL) _thread_init(); I'd like to be able to get rid of these eventually and perhaps have some magical way of getting it called automatically when the library is loaded. If it was possible, I'm not sure that it would work in both static and shared. -- Dan Eischen From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 13:44:54 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D370237B404 for ; Wed, 26 Mar 2003 13:44:54 -0800 (PST) Received: from ns1.xcllnt.net (209-128-86-226.bayarea.net [209.128.86.226]) by mx1.FreeBSD.org (Postfix) with ESMTP id E026A43F85 for ; Wed, 26 Mar 2003 13:44:53 -0800 (PST) (envelope-from marcel@xcllnt.net) Received: from dhcp01.pn.xcllnt.net (dhcp01.pn.xcllnt.net [192.168.4.201]) by ns1.xcllnt.net (8.12.8/8.12.8) with ESMTP id h2QLiLKu025908; Wed, 26 Mar 2003 13:44:22 -0800 (PST) (envelope-from marcel@piii.pn.xcllnt.net) Received: from dhcp01.pn.xcllnt.net (localhost [127.0.0.1]) by dhcp01.pn.xcllnt.net (8.12.8/8.12.8) with ESMTP id h2QLiLBm011898; Wed, 26 Mar 2003 13:44:21 -0800 (PST) (envelope-from marcel@dhcp01.pn.xcllnt.net) Received: (from marcel@localhost) by dhcp01.pn.xcllnt.net (8.12.8/8.12.8/Submit) id h2QLiLxM011897; Wed, 26 Mar 2003 13:44:21 -0800 (PST) Date: Wed, 26 Mar 2003 13:44:21 -0800 From: Marcel Moolenaar To: Daniel Eischen Message-ID: <20030326214421.GF11320@dhcp01.pn.xcllnt.net> References: <200303262030.h2QKU6A7089578@harmony.village.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.3i X-Spam-Status: No, hits=-31.9 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REFERENCES,REPLY_WITH_QUOTES,USER_AGENT_MUTT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: kse@elischer.org cc: Warner Losh Subject: Re: Not providing static libraries (libkse/libpthread) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 21:44:56 -0000 On Wed, Mar 26, 2003 at 04:04:42PM -0500, Daniel Eischen wrote: > > Also, in a lot of functions, there are: > > if (_thread_initial == NULL) > _thread_init(); > > I'd like to be able to get rid of these eventually and perhaps have > some magical way of getting it called automatically when the library > is loaded. You may be able to piggyback on the C++ static object initialization by utilizing _init() and _fini(). I don't think archive is different from shared in that respect for C (ie they both don't have what _init() provides and have _fini() in terms for atexit()). But it works in both cases if you add some C++ related magic (See also the .init and .fini ELF sections). -- Marcel Moolenaar USPA: A-39004 marcel@xcllnt.net From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 13:48:23 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 940E537B404 for ; Wed, 26 Mar 2003 13:48:23 -0800 (PST) Received: from mail1.qc.uunet.ca (mail1.qc.uunet.ca [198.168.54.16]) by mx1.FreeBSD.org (Postfix) with ESMTP id B771943F3F for ; Wed, 26 Mar 2003 13:48:22 -0800 (PST) (envelope-from anarcat@espresso-com.com) Received: from xtanbul.studio.espresso-com.com ([216.94.147.57]) by mail1.qc.uunet.ca (8.12.8/8.12.8) with ESMTP id h2QLm9GS019177; Wed, 26 Mar 2003 16:48:10 -0500 Received: from anarcat by xtanbul.studio.espresso-com.com with local (Exim 3.36 #1 (Debian)) id 18yIkj-0001cu-00; Wed, 26 Mar 2003 16:48:09 -0500 Date: Wed, 26 Mar 2003 16:48:09 -0500 From: The Anarcat To: Dan Nelson Message-ID: <20030326214809.GE488@xtanbul> References: <20030326193524.GA11320@dhcp01.pn.xcllnt.net> <20030326195107.GB31787@dan.emsphone.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20030326195107.GB31787@dan.emsphone.com> User-Agent: Mutt/1.5.3i Sender: The Anarcat X-Spam-Status: No, hits=-32.5 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES, REPLY_WITH_QUOTES,USER_AGENT_MUTT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: kse@elischer.org cc: Daniel Eischen cc: Marcel Moolenaar Subject: Re: Not providing static libraries (libkse/libpthread) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 21:48:24 -0000 On mer mar 26, 2003 at 01:51:07 -0600, Dan Nelson wrote: > In the last episode (Mar 26), Daniel Eischen said: > > On Wed, 26 Mar 2003, Marcel Moolenaar wrote: > > > For example, the access sequences generated by compilers for > > > variables that have the __thread attribute do really suck for when > > > code is to be generated for dynamic linking. The access sequences > > > in the static case are superior. The performance gain is > > > significant if one can build a complete multi-threaded application. > > > > Solaris and IRIX don't seem to provide static thread libraries. Does > > anyone know if Linux does? > > Debian provides static versions: > -rw-r--r-- 1 root root 81959 Feb 25 07:46 /lib/libpthread-0.10.so > -rw-r--r-- 1 root root 97286 Feb 25 07:47 /usr/lib/libpthread.a Note that libpthread.a is provided by the libc6-dev package and does not need to be installed by default, IIRC. anarcat@xtanbul[/usr/lib]% dpkg-query -S libpthread.a libc6-dev: /usr/lib/libpthread.a anarcat@xtanbul[/usr/lib]% Also, this package is not required by most applications. Only when you install build tools, does the static lib gets installed. I like the idea of splitting a port's library between static and shared packages. Most end-users that don't need to compile anything don't need static libraries. -dev packages also contain the header files. I'd like to see the same in our ports system. A. From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 14:02:18 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0AF5037B404 for ; Wed, 26 Mar 2003 14:02:18 -0800 (PST) Received: from harmony.village.org (rover.bsdimp.com [204.144.255.66]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4224B43F93 for ; Wed, 26 Mar 2003 14:02:17 -0800 (PST) (envelope-from imp@bsdimp.com) Received: from localhost (warner@rover2.village.org [10.0.0.1]) by harmony.village.org (8.12.8/8.12.3) with ESMTP id h2QM2BA7090076; Wed, 26 Mar 2003 15:02:11 -0700 (MST) (envelope-from imp@bsdimp.com) Date: Wed, 26 Mar 2003 15:01:52 -0700 (MST) Message-Id: <20030326.150152.125002089.imp@bsdimp.com> To: arch@freebsd.org, kse@elischer.org From: "M. Warner Losh" In-Reply-To: <20030326214421.GF11320@dhcp01.pn.xcllnt.net> References: <200303262030.h2QKU6A7089578@harmony.village.org> <20030326214421.GF11320@dhcp01.pn.xcllnt.net> X-Mailer: Mew version 2.1 on Emacs 21.2 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Spam-Status: No, hits=-9.9 required=5.0 tests=AWL,IN_REP_TO,REFERENCES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Subject: Re: Not providing static libraries (libkse/libpthread) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 22:02:20 -0000 In message: <20030326214421.GF11320@dhcp01.pn.xcllnt.net> Marcel Moolenaar writes: : On Wed, Mar 26, 2003 at 04:04:42PM -0500, Daniel Eischen wrote: : > : > Also, in a lot of functions, there are: : > : > if (_thread_initial == NULL) : > _thread_init(); : > : > I'd like to be able to get rid of these eventually and perhaps have : > some magical way of getting it called automatically when the library : > is loaded. : : You may be able to piggyback on the C++ static object initialization : by utilizing _init() and _fini(). I don't think archive is different : from shared in that respect for C (ie they both don't have what _init() : provides and have _fini() in terms for atexit()). But it works in both : cases if you add some C++ related magic (See also the .init and .fini : ELF sections). Yes. I was going to make that same point. C++ static object init always happens, static or dynamic. And has since FreeBSD has supported ELF... Warner From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 14:22:01 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id CAD4937B404 for ; Wed, 26 Mar 2003 14:22:01 -0800 (PST) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 29C0143FB1 for ; Wed, 26 Mar 2003 14:22:01 -0800 (PST) (envelope-from eischen@pcnet1.pcnet.com) Received: from pcnet1.pcnet.com (localhost [127.0.0.1]) by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h2QMLoBg004177; Wed, 26 Mar 2003 17:21:50 -0500 (EST) Received: from localhost (eischen@localhost)h2QMLo1n004174; Wed, 26 Mar 2003 17:21:50 -0500 (EST) Date: Wed, 26 Mar 2003 17:21:50 -0500 (EST) From: Daniel Eischen To: "M. Warner Losh" In-Reply-To: <20030326.150152.125002089.imp@bsdimp.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-25.4 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REPLY_WITH_QUOTES,USER_AGENT_PINE autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: kse@elischer.org Subject: Re: Not providing static libraries (libkse/libpthread) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 26 Mar 2003 22:22:02 -0000 On Wed, 26 Mar 2003, M. Warner Losh wrote: > In message: <20030326214421.GF11320@dhcp01.pn.xcllnt.net> > Marcel Moolenaar writes: > : On Wed, Mar 26, 2003 at 04:04:42PM -0500, Daniel Eischen wrote: > : > > : > Also, in a lot of functions, there are: > : > > : > if (_thread_initial == NULL) > : > _thread_init(); > : > > : > I'd like to be able to get rid of these eventually and perhaps have > : > some magical way of getting it called automatically when the library > : > is loaded. > : > : You may be able to piggyback on the C++ static object initialization > : by utilizing _init() and _fini(). I don't think archive is different > : from shared in that respect for C (ie they both don't have what _init() > : provides and have _fini() in terms for atexit()). But it works in both > : cases if you add some C++ related magic (See also the .init and .fini > : ELF sections). > > Yes. I was going to make that same point. C++ static object init > always happens, static or dynamic. And has since FreeBSD has > supported ELF... OK, since there seems to be some objections, I'll withdraw the proposition. Other reasons may develop later on, but I'll shelve the idea for now. Thanks for everyone's input :) -- Dan Eischen From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 17:04:18 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id BEA1237B404; Wed, 26 Mar 2003 17:04:18 -0800 (PST) Received: from mail01.stbernard.com (mail01.stbernard.com [64.154.93.162]) by mx1.FreeBSD.org (Postfix) with ESMTP id D173F43F93; Wed, 26 Mar 2003 17:04:17 -0800 (PST) (envelope-from wes@softweyr.com) Received: from salty.rapid.stbernard.com ([192.168.4.61]) by mail01.stbernard.com with Microsoft SMTPSVC(5.0.2195.5329); Wed, 26 Mar 2003 17:04:17 -0800 From: Wes Peters Organization: Softweyr.com To: John Baldwin Date: Wed, 26 Mar 2003 17:04:17 -0800 User-Agent: KMail/1.5 References: In-Reply-To: X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200303261704.17095.wes@softweyr.com> X-OriginalArrivalTime: 27 Mar 2003 01:04:17.0489 (UTC) FILETIME=[CA861C10:01C2F3FC] X-Spam-Status: No, hits=-25.4 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, RCVD_IN_UNCONFIRMED_DSBL,REFERENCES,REPLY_WITH_QUOTES, USER_AGENT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: Poul-Henning Kamp cc: freebsd-arch@freebsd.org Subject: Re: Patch to protect process from pageout killing X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Mar 2003 01:04:19 -0000 On Wednesday 26 March 2003 09:13, John Baldwin wrote: > On 26-Mar-2003 Wes Peters wrote: > > On Tuesday 25 March 2003 08:34, John Baldwin wrote: > >> On 25-Mar-2003 Wes Peters wrote: > >> > On Monday 24 March 2003 08:36, Poul-Henning Kamp wrote: > >> >> Also, doesn't this result in the flag being inerited with > >> >> fork() and thereby negating the effect you are seeking for > >> >> squid ? > >> > > >> > I looked through all the places in kern_fork.c where p2->p_flag > >> > gets set and didn't see anything that looked like it would > >> > inherit P_PROTECTED from p1->p_flag. Did I miss something? I'm > >> > obviously a bit of a neophyte in this part of the kernel. > >> > >> rlimit's are inherited. However, due to a "feature" bug in your > >> patch, the P_PROTECTED flag doesn't get turned on when the rlimit > >> is inherited in fork1(). > > > > feature bug? If you mean the fact that the setting for P_PROTECTED > > isn't stored in the rlimit, that was intentional. rlimits are > > inherited and I specifically didn't want that behavior, similar to > > p_cpulimit. I still agree resource limits are not an ideal > > interface to use for this, I'll look further. > > I mean that you should be setting P_PROTECTED in fork() based on the > inherited rlimit's since otherwise the value of the rlimit is out of > sync with the P_PROTECTED flag. Hence a bug. However, since non- > inheritance is the desired behavior, it is also a feature, hence > "feature" bug. Ah, actually it would be best to explicitly clear the RLIMIT_PROTECT in the rlimit, except the RLIMIT_PROTECT isn't stored in the rlimit. Eww, that was not good. Problem is, there isn't a generic syscall for munging proc items. As I said, it was a less-than-optimal syscall to abuse, I'll go back to pondering madvise(2) or mprotect(2) which almost sort of make sense. -- "Where am I, and what am I doing in this handbasket?" Wes Peters wes@softweyr.com From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 17:25:17 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9DB6E37B404 for ; Wed, 26 Mar 2003 17:25:17 -0800 (PST) Received: from exchhz01.viatech.com.cn (ip-167-164-97-218.anlai.com [218.97.164.167]) by mx1.FreeBSD.org (Postfix) with ESMTP id C07ED43F75 for ; Wed, 26 Mar 2003 17:25:14 -0800 (PST) (envelope-from davidxu@freebsd.org) Received: from davidw2k (ip-240-1-168-192.rev.dyxnet.com [192.168.1.240]) by exchhz01.viatech.com.cn with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2650.21) id HLDQN88H; Thu, 27 Mar 2003 09:11:31 +0800 Message-ID: <006a01c2f3ff$e57cb300$f001a8c0@davidw2k> From: "David Xu" To: "Jeff Roberson" Date: Thu, 27 Mar 2003 09:26:31 +0800 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.50.4807.1700 X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4910.0300 X-Spam-Status: No, hits=-6.6 required=5.0 tests=AWL,QUOTED_EMAIL_TEXT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org Subject: Re: 1:1 threading. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Mar 2003 01:25:20 -0000 After reading your 1:1 threading code, I think you needn't hack current KSE code to build your own 1:1 threading code. Our code allow you to do this, actully, it's my earlier idea to let 1:1 be implemented in our M:N code base, but never had told this to julian or others. if you want to create a thread, you can always call kse_create syscall(the name should be changed to another, for example upcall_create), the newly created upcall will be scheduled a kernel thread on it and return to userland thread stack and thread function, if you want to implement 1:1, you can always set kse_mailbox.km_curthread to NULL, this ensures that userland stack always has fixed association with kernel thread stack, for me, 1:1 only means userland stack has fixed association with kernel thread stack, no more. however, code in kse_create should be adjusted to allow NUPCALLS > NCPUS, this allows 1:1 mode to be implemented. thr_exit can be implemented to use kse_exit, and maybe a wrapper to for kse_exit. For thr_kill, I think we may add an API to allow a kernel thread be identified by using kse_mailbox pointer or something similar. By implementing 1:1 code in current M:N code base, benifit is very clear, a ksegrp protects time quantum and threads priority, and if you want to implement a system scope thread(I know pthread has this requirement), just call kse_create with newgroup parameters is 1, you will immediately get a system scope thread. Yes, you may think that KSE progress is slow, but I'd like to think harder before pushing some not well thinked code into kernel. At least, when I am thinking about M:N, I am also thinking about 1:1, I guess some people like 1:1, and others like M:N, so more choise = is good. David Xu ----- Original Message -----=20 From: "Jeff Roberson" To: Sent: Wednesday, March 26, 2003 11:00 AM Subject: 1:1 threading. > I just sent a mail to arch@ about a parallel effort that you all may = be > interested in. Please follow up there. >=20 > Thanks, > Jeff From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 18:00:30 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D5BA237B404 for ; Wed, 26 Mar 2003 18:00:30 -0800 (PST) Received: from exchhz01.viatech.com.cn (ip-167-164-97-218.anlai.com [218.97.164.167]) by mx1.FreeBSD.org (Postfix) with ESMTP id A86D643F3F for ; Wed, 26 Mar 2003 18:00:29 -0800 (PST) (envelope-from davidxu@freebsd.org) Received: from davidw2k (ip-240-1-168-192.rev.dyxnet.com [192.168.1.240]) by exchhz01.viatech.com.cn with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2650.21) id HLDQN9BF; Thu, 27 Mar 2003 09:46:42 +0800 Message-ID: <001d01c2f404$cffb3ab0$f001a8c0@davidw2k> From: "David Xu" To: "Daniel Eischen" , References: <3E81F6BB.BFFE3F33@vigrid.com> Date: Thu, 27 Mar 2003 10:01:42 +0800 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.50.4807.1700 X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4910.0300 X-Spam-Status: No, hits=-8.8 required=5.0 tests=AWL,QUOTED_EMAIL_TEXT,REFERENCES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: kse@elischer.org Subject: Re: Not providing static libraries (libkse/libpthread) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Mar 2003 02:00:31 -0000 I'd like to see everything is dynamically linked and threaded :-). David Xu ----- Original Message -----=20 From: "Daniel Eischen" To: Cc: Sent: Thursday, March 27, 2003 2:51 AM Subject: Not providing static libraries (libkse/libpthread) > Is there a good reason for providing static libraries for > libpthread/libkse? I'd like to not support them to get > rid of some hacks to make sure certain symbols are present > in the static library case. >=20 > --=20 > Dan Eischen > _______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to = "freebsd-arch-unsubscribe@freebsd.org" From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 23:17:27 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id A76E037B404; Wed, 26 Mar 2003 23:17:27 -0800 (PST) Received: from mail.chesapeake.net (chesapeake.net [205.130.220.14]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9193B43FA3; Wed, 26 Mar 2003 23:17:26 -0800 (PST) (envelope-from jroberson@chesapeake.net) Received: from localhost (jroberson@localhost) by mail.chesapeake.net (8.11.6/8.11.6) with ESMTP id h2R7HPl97392; Thu, 27 Mar 2003 02:17:25 -0500 (EST) (envelope-from jroberson@chesapeake.net) Date: Thu, 27 Mar 2003 02:17:25 -0500 (EST) From: Jeff Roberson To: David Xu In-Reply-To: <006a01c2f3ff$e57cb300$f001a8c0@davidw2k> Message-ID: <20030327020402.T64602-100000@mail.chesapeake.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-17.0 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REPLY_WITH_QUOTES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org Subject: Re: 1:1 threading. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Mar 2003 07:17:29 -0000 On Thu, 27 Mar 2003, David Xu wrote: > After reading your 1:1 threading code, I think you needn't > hack current KSE code to build your own 1:1 threading code. > Our code allow you to do this, actully, it's my earlier > idea to let 1:1 be implemented in our M:N code base, but never > had told this to julian or others. It was actually done outside of KSE on purpose. It keeps the API simpler and cleaner. It keeps the implementation cleaner. It keeps it out of the majority of the KSE code paths aside from thread_suspend and related code. I wanted something small and stable that built on top of KSE provided primitives but did not actually use the KSE apis. This makes it easier for KSE to continue growing and changing while the 1:1 code remains simple. It also removes some of the cost associated with doing KSE. > if you want to create a thread, you can always call kse_create > syscall(the name should be changed to another, for example > upcall_create), the newly created upcall will be scheduled a > kernel thread on it and return to userland thread stack and > thread function, if you want to implement 1:1, you can always > set kse_mailbox.km_curthread to NULL, this ensures that userland > stack always has fixed association with kernel thread stack, > for me, 1:1 only means userland stack has fixed association with > kernel thread stack, no more. however, code in kse_create should be > adjusted to allow NUPCALLS > NCPUS, this allows 1:1 mode to be > implemented. thr_exit can be implemented to use kse_exit, and maybe > a wrapper to for kse_exit. > For thr_kill, I think we may add an API to allow a kernel > thread be identified by using kse_mailbox pointer or something > similar. I intend to keep thr_kill as is since it is the most simple and direct way to acomplish the POSIX semantics. > By implementing 1:1 code in current M:N code base, benifit is very > clear, a ksegrp protects time quantum and threads priority, and if > you want to implement a system scope thread(I know pthread has this > requirement), just call kse_create with newgroup parameters is 1, > you will immediately get a system scope thread. I have considered using the ksegrp for this purpose but I view that as an advanced feature not required to get threading off the ground. I'm trying to take reasonable steps that provide functionality all along the way. > Yes, you may think that KSE progress is slow, but I'd like to think KSE progress has been far too slow. Many people are migrating away from freebsd for other platforms due to our lack of threading. This project has been underway for a significant amount of time. I did the 1:1 threading because I view this as the most reasonable way to get good threading in a short period of time. > harder before pushing some not well thinked code into kernel. It is quite well thought out. Given the track record of the KSE project I actually take some offense to the suggestion that my code is not well thought out in comparison. > At least, when I am thinking about M:N, I am also thinking about > 1:1, I guess some people like 1:1, and others like M:N, so more choise is good. > > David Xu Yes, I think they both have their place. I think we can support both in the tree as well. Hopefully the 1:1 code will address most users needs until M:N is production ready. Cheers, Jeff > ----- Original Message ----- > From: "Jeff Roberson" > To: > Sent: Wednesday, March 26, 2003 11:00 AM > Subject: 1:1 threading. > > > > I just sent a mail to arch@ about a parallel effort that you all may be > > interested in. Please follow up there. > > > > Thanks, > > Jeff > From owner-freebsd-arch@FreeBSD.ORG Wed Mar 26 23:43:55 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 7756937B404 for ; Wed, 26 Mar 2003 23:43:55 -0800 (PST) Received: from cirb503493.alcatel.com.au (c18609.belrs1.nsw.optusnet.com.au [210.49.80.204]) by mx1.FreeBSD.org (Postfix) with ESMTP id 37BBA43F85 for ; Wed, 26 Mar 2003 23:43:54 -0800 (PST) (envelope-from peterjeremy@optushome.com.au) Received: from cirb503493.alcatel.com.au (localhost.alcatel.com.au [127.0.0.1])h2R7hEM2019016; Thu, 27 Mar 2003 18:43:15 +1100 (EST) (envelope-from jeremyp@cirb503493.alcatel.com.au) Received: (from jeremyp@localhost) by cirb503493.alcatel.com.au (8.12.8/8.12.8/Submit) id h2R7hBXb019015; Thu, 27 Mar 2003 18:43:11 +1100 (EST) Date: Thu, 27 Mar 2003 18:43:11 +1100 From: Peter Jeremy To: Jeff Roberson Message-ID: <20030327074311.GB18940@cirb503493.alcatel.com.au> References: <20030326031245.O64602-100000@mail.chesapeake.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.1i X-Spam-Status: No, hits=-30.9 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REFERENCES,REPLY_WITH_QUOTES,USER_AGENT_MUTT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: kse@elischer.org cc: Julian Elischer Subject: Re: 1:1 Threading implementation. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Mar 2003 07:43:58 -0000 On Wed, Mar 26, 2003 at 12:30:52PM -0800, Julian Elischer wrote: >On Wed, 26 Mar 2003, Jeff Roberson wrote: >> First, if your application has more threads than cpus it is written >> incorrectly. > >Not neccesarily. that's just one way of looking at threads. Active >component threaded programs use threads as a programming model >(see above) and it is a perfectly valid way of writing a program. I'd go so far as to say that the only case where relating real CPUs and threads matters is for compute-bound processes where the only purpose of threading is to get >100% CPU. If you consider an arbitrary server/daemon process, there are a limited number of basic mechanisms you can use to handle more than one client: 1) One (single-threaded) process per client (eg telnetd, sshd) 2) One process with one thread per client (possibly per direction) 3) One process explicitly using select()[*] to support multiple clients. Each approach has its own advantages and disadvantages and each approach requires different support code to handle new clients and switching between clients. Obviously, you can combine the approaches but this means you have the support infrastructure for both basic mechanisms as well as additional code to decide which mechanism to use. Apache is a combination of 1 and 3 - but needs a process dedicated to distributing incoming requests. In general, if you're going to go the effort of threading your server, why go to the additional effort of adding a select() handler in each thread? The big advantage of 1 and 2 is that the core is very simple: while (!eof(input)) { read input do some processing write output } whereas the core of 3 requires building and testing FD sets and making sure that you only block in the select(). This generally makes the code far less clear. You can also potentially reduce the overall throughput because there are multiple scheduling layers. [*] For "select()", read "select() or poll() or kqueue()" Peter From owner-freebsd-arch@FreeBSD.ORG Thu Mar 27 00:35:49 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E695C37B401; Thu, 27 Mar 2003 00:35:49 -0800 (PST) Received: from heron.mail.pas.earthlink.net (heron.mail.pas.earthlink.net [207.217.120.189]) by mx1.FreeBSD.org (Postfix) with ESMTP id 42B4C43FBF; Thu, 27 Mar 2003 00:35:49 -0800 (PST) (envelope-from tlambert2@mindspring.com) Received: from pool0033.cvx22-bradley.dialup.earthlink.net ([209.179.198.33] helo=mindspring.com) by heron.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128) (Exim 3.33 #1) id 18ySrR-0006tb-00; Thu, 27 Mar 2003 00:35:46 -0800 Message-ID: <3E82B795.DDB0C6A4@mindspring.com> Date: Thu, 27 Mar 2003 00:34:29 -0800 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Jeff Roberson References: <20030327020402.T64602-100000@mail.chesapeake.net> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a48c4399f109637f76819f1b3d16d0a6c2a8438e0f32a48e08350badd9bab72f9c350badd9bab72f9c X-Spam-Status: No, hits=-21.4 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,QUOTED_EMAIL_TEXT,QUOTE_TWICE_1, RCVD_IN_OSIRUSOFT_COM,REFERENCES,REPLY_WITH_QUOTES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org Subject: Re: 1:1 threading. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Mar 2003 08:36:01 -0000 Jeff Roberson wrote: > On Thu, 27 Mar 2003, David Xu wrote: > > After reading your 1:1 threading code, I think you needn't > > hack current KSE code to build your own 1:1 threading code. > > Our code allow you to do this, actully, it's my earlier > > idea to let 1:1 be implemented in our M:N code base, but never > > had told this to julian or others. > > It was actually done outside of KSE on purpose. It keeps the API simpler > and cleaner. It keeps the implementation cleaner. It keeps it out of the > majority of the KSE code paths aside from thread_suspend and related > code. > > I wanted something small and stable that built on top of KSE provided > primitives but did not actually use the KSE apis. This makes it easier > for KSE to continue growing and changing while the 1:1 code remains > simple. It also removes some of the cost associated with doing KSE. This isn't really a legitimate argument. Specifically, if the primitives are incapable of supporting your model, then the primitives need to be changed. The main problem that needs to be overcome, in most cases, is that historical designs preclude future work. In this case, your code represents "future work" unanticipated by the previous design. Note that I do not necessarily take this position myself; I think that your work could have proceedded with the KSEGRP per KSE approach, as Julian and David have suggested, rather than the single KSEGRP per process approach you chose. > I intend to keep thr_kill as is since it is the most simple and direct way > to acomplish the POSIX semantics. "POSIX semantics" should always be considered a secondary consideration, since the real intent is to allow *all* semantics as a construction of the available semantics. This is incredibly difficult, I know, but it's why smart people win over stupid people, or average people, when it comes to design issues. We need to take the "best of breed" forward, and exclude the rest (a genetic algorithm, but the best we can approximate at this time). > I have considered using the ksegrp for this purpose but I view that as an > advanced feature not required to get threading off the ground. I'm trying > to take reasonable steps that provide functionality all along the way. This is expediency. Expediency really has no place in Open Source design, since it doesn't really consider the consumers at all, it considers (or is supposed to consider) only the problem space we are talking about itself. That really changes at the whim of public opinion. > > Yes, you may think that KSE progress is slow, but I'd like to think > > KSE progress has been far too slow. Yes. This has to do with inefficiencies of mapping volunteerism to what is considered (by some people) as "the right way". There's really no approach to resolving this (right now) other than "let the best implementation win". If I'm only willing to work on what I consider "more ideal" to mapping the problem space, then your volunteers are not willing to be "managed" into specific implementation details. -- Hopefully, in the future, this will change: we all want to live in an ideal world. At that point, it comes down to games theory, in terms of communicating goals, and information theory, in terms of communicating with other people about the desirability of specific goals. It's easy to cast this in terms of "war games", "mutual security games", or "present politics of public opinon". All you need is a meta-perspective on the problems. Personally, I'm happy to see forward progress without conflict; in the present tense, this means that the KSE work gets used, even if the use is not in line with the eventual design goals. If this takes a 1:1 implementation to keep it from being diked out, I don't care: it's a "drunkards walk" toward the final goal, and so people should shut up about it, since, no matter how you look at it, there's a net positive value to the work. As such, I would like the KSEGRP per thread, instead of the KSEGRP per process code to go forward. -- Terry From owner-freebsd-arch@FreeBSD.ORG Thu Mar 27 01:17:09 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8C94637B438 for ; Thu, 27 Mar 2003 01:17:06 -0800 (PST) Received: from mail.nsu.ru (mx.nsu.ru [193.124.215.71]) by mx1.FreeBSD.org (Postfix) with ESMTP id EB90A440E7 for ; Thu, 27 Mar 2003 01:05:20 -0800 (PST) (envelope-from fjoe@iclub.nsu.ru) Received: from drweb by mail.nsu.ru with drweb-scanned (Exim 3.20 #1) id 18yTJY-0008Up-00; Thu, 27 Mar 2003 15:04:48 +0600 Received: from iclub.nsu.ru ([193.124.215.97] ident=root) by mail.nsu.ru with esmtp (Exim 3.20 #1) id 18yTJW-0008SZ-00; Thu, 27 Mar 2003 15:04:46 +0600 Received: from iclub.nsu.ru (fjoe@localhost [127.0.0.1]) by iclub.nsu.ru (8.12.8/8.12.8) with ESMTP id h2R93Hj1009140; Thu, 27 Mar 2003 15:03:17 +0600 (NS) (envelope-from fjoe@iclub.nsu.ru) Received: (from fjoe@localhost) by iclub.nsu.ru (8.12.8/8.12.8/Submit) id h2R93Eeo009136; Thu, 27 Mar 2003 15:03:15 +0600 (NS) Date: Thu, 27 Mar 2003 15:03:14 +0600 From: Max Khon To: Terry Lambert Message-ID: <20030327150313.A8897@iclub.nsu.ru> References: <20030327020402.T64602-100000@mail.chesapeake.net> <3E82B795.DDB0C6A4@mindspring.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <3E82B795.DDB0C6A4@mindspring.com>; from tlambert2@mindspring.com on Thu, Mar 27, 2003 at 12:34:29AM -0800 X-Envelope-To: tlambert2@mindspring.com, jroberson@chesapeake.net, arch@freebsd.org X-Spam-Status: No, hits=-33.1 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, QUOTE_TWICE_1,REFERENCES,REPLY_WITH_QUOTES,USER_AGENT_MUTT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org Subject: Re: 1:1 threading. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Mar 2003 09:17:22 -0000 hi, there! On Thu, Mar 27, 2003 at 12:34:29AM -0800, Terry Lambert wrote: > > > After reading your 1:1 threading code, I think you needn't > > > hack current KSE code to build your own 1:1 threading code. > > > Our code allow you to do this, actully, it's my earlier > > > idea to let 1:1 be implemented in our M:N code base, but never > > > had told this to julian or others. > > > > It was actually done outside of KSE on purpose. It keeps the API simpler > > and cleaner. It keeps the implementation cleaner. It keeps it out of the > > majority of the KSE code paths aside from thread_suspend and related > > code. > > > > I wanted something small and stable that built on top of KSE provided > > primitives but did not actually use the KSE apis. This makes it easier > > for KSE to continue growing and changing while the 1:1 code remains > > simple. It also removes some of the cost associated with doing KSE. > > This isn't really a legitimate argument. Seconded. do you have numbers that clearly show that using Julian's approach leads to serious performance penalty? Using KSE APIs is not that difficult as far as I understand, so why we need to introduce more hacks? /fjoe From owner-freebsd-arch@FreeBSD.ORG Thu Mar 27 02:24:17 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id BD5CB37B404 for ; Thu, 27 Mar 2003 02:24:17 -0800 (PST) Received: from mail.chesapeake.net (chesapeake.net [205.130.220.14]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0586B43FB1 for ; Thu, 27 Mar 2003 02:24:17 -0800 (PST) (envelope-from jroberson@chesapeake.net) Received: from localhost (jroberson@localhost) by mail.chesapeake.net (8.11.6/8.11.6) with ESMTP id h2RAOGZ78047 for ; Thu, 27 Mar 2003 05:24:16 -0500 (EST) (envelope-from jroberson@chesapeake.net) Date: Thu, 27 Mar 2003 05:24:16 -0500 (EST) From: Jeff Roberson To: arch@freebsd.org Message-ID: <20030327052055.X64602-100000@mail.chesapeake.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-7.8 required=5.0 tests=AWL version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Subject: Threading code review please. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Mar 2003 10:24:18 -0000 I'm going to reply to the threads on 1:1 vs M:N tomorrow. I'd like to request that people actually read the patch and give me feedback on the code and not the approach. I have no outstanding behavior problems with mozilla. It actually runs much faster now with libthr in place of libc_r. On pages with LOTS of images it scrolls much smoother. I suspect its the amount of io waits. Anyway, since this is coming together so well I'd like to get it commited soon so people can start giving me bug reports. I have another full days worth of work to clear up the issues that I know of. I'll probably post the library source tomorrow. Cheers, Jeff From owner-freebsd-arch@FreeBSD.ORG Thu Mar 27 05:09:13 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 027A537B401 for ; Thu, 27 Mar 2003 05:09:13 -0800 (PST) Received: from mail.tcoip.com.br (erato.tco.net.br [200.220.254.10]) by mx1.FreeBSD.org (Postfix) with ESMTP id 5540943FCB for ; Thu, 27 Mar 2003 05:09:10 -0800 (PST) (envelope-from dcs@tcoip.com.br) Received: from tcoip.com.br ([10.0.2.6]) by mail.tcoip.com.br (8.11.6/8.11.6) with ESMTP id h2RD93911387; Thu, 27 Mar 2003 10:09:03 -0300 Message-ID: <3E82F7EE.6080802@tcoip.com.br> Date: Thu, 27 Mar 2003 10:09:02 -0300 From: "Daniel C. Sobral" User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3b) Gecko/20030326 X-Accept-Language: en-us, en, pt-br, ja MIME-Version: 1.0 To: Jeff Roberson References: <20030327052055.X64602-100000@mail.chesapeake.net> In-Reply-To: <20030327052055.X64602-100000@mail.chesapeake.net> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, hits=-31.9 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES, REPLY_WITH_QUOTES,USER_AGENT_MOZILLA_UA autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org Subject: Re: Threading code review please. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Mar 2003 13:09:14 -0000 Jeff Roberson wrote: > I'm going to reply to the threads on 1:1 vs M:N tomorrow. I'd like to > request that people actually read the patch and give me feedback on the > code and not the approach. > > I have no outstanding behavior problems with mozilla. It actually runs > much faster now with libthr in place of libc_r. On pages with LOTS of > images it scrolls much smoother. I suspect its the amount of io waits. This is an SMP system you are talking about? -- Daniel C. Sobral (8-DCS) Gerencia de Operacoes Divisao de Comunicacao de Dados Coordenacao de Seguranca TCO Fones: 55-61-313-7654/Cel: 55-61-9618-0904 E-mail: Daniel.Capo@tco.net.br Daniel.Sobral@tcoip.com.br dcs@tcoip.com.br Outros: dcs@newsguy.com dcs@freebsd.org capo@notorious.bsdconspiracy.net "I'd love to go out with you, but the last time I went out, I never came back." From owner-freebsd-arch@FreeBSD.ORG Thu Mar 27 05:13:42 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 02FF737B401 for ; Thu, 27 Mar 2003 05:13:42 -0800 (PST) Received: from mail.tcoip.com.br (erato.tco.net.br [200.220.254.10]) by mx1.FreeBSD.org (Postfix) with ESMTP id DC0F843FA3 for ; Thu, 27 Mar 2003 05:13:38 -0800 (PST) (envelope-from dcs@tcoip.com.br) Received: from tcoip.com.br ([10.0.2.6]) by mail.tcoip.com.br (8.11.6/8.11.6) with ESMTP id h2RD7i911343; Thu, 27 Mar 2003 10:07:44 -0300 Message-ID: <3E82F7A0.2020604@tcoip.com.br> Date: Thu, 27 Mar 2003 10:07:44 -0300 From: "Daniel C. Sobral" User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3b) Gecko/20030326 X-Accept-Language: en-us, en, pt-br, ja MIME-Version: 1.0 To: Max Khon References: <20030327020402.T64602-100000@mail.chesapeake.net> <3E82B795.DDB0C6A4@mindspring.com> <20030327150313.A8897@iclub.nsu.ru> In-Reply-To: <20030327150313.A8897@iclub.nsu.ru> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, hits=-31.9 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES, REPLY_WITH_QUOTES,USER_AGENT_MOZILLA_UA autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org Subject: Re: 1:1 threading. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Mar 2003 13:13:43 -0000 Max Khon wrote: > hi, there! > > On Thu, Mar 27, 2003 at 12:34:29AM -0800, Terry Lambert wrote: > > >>>>After reading your 1:1 threading code, I think you needn't >>>>hack current KSE code to build your own 1:1 threading code. >>>>Our code allow you to do this, actully, it's my earlier >>>>idea to let 1:1 be implemented in our M:N code base, but never >>>>had told this to julian or others. >>> >>>It was actually done outside of KSE on purpose. It keeps the API simpler >>>and cleaner. It keeps the implementation cleaner. It keeps it out of the >>>majority of the KSE code paths aside from thread_suspend and related >>>code. >>> >>>I wanted something small and stable that built on top of KSE provided >>>primitives but did not actually use the KSE apis. This makes it easier >>>for KSE to continue growing and changing while the 1:1 code remains >>>simple. It also removes some of the cost associated with doing KSE. >> >>This isn't really a legitimate argument. > > > Seconded. do you have numbers that clearly show that using Julian's approach > leads to serious performance penalty? Using KSE APIs is not that difficult > as far as I understand, so why we need to introduce more hacks? As much as I'd prefer the 1:1 threading to use as much of the KSE code as possible, Jeff's decision wasn't related to performance issues. What Jeff wanted to do is to _avoid_ using as much of the KSE API as possible so his code wouldn't get in the way of that API, with two obvious benefits: 1) Changes to that API (and there have been some in the past) won't affect his 1:1 threading code and, thus, won't upset real applications using that threading. 2) His 1:1 threading code won't slow down further KSE development nor influence any changes to the KSE API. The reason I personally prefer otherwise is so that (1) above won't be true. Ie, any bugs or performance issues introduced in the KSE code *will* affect real applications, so that they can be detected and fixed. -- Daniel C. Sobral (8-DCS) Gerencia de Operacoes Divisao de Comunicacao de Dados Coordenacao de Seguranca TCO Fones: 55-61-313-7654/Cel: 55-61-9618-0904 E-mail: Daniel.Capo@tco.net.br Daniel.Sobral@tcoip.com.br dcs@tcoip.com.br Outros: dcs@newsguy.com dcs@freebsd.org capo@notorious.bsdconspiracy.net A lady stockholder quite hetera Decided her fortune to bettera: On the floor, quite unclad, She successively had Merrill Lynch, Pierce, Fenner, et cetera... From owner-freebsd-arch@FreeBSD.ORG Thu Mar 27 06:05:34 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id E08A137B401 for ; Thu, 27 Mar 2003 06:05:34 -0800 (PST) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3595443FBD for ; Thu, 27 Mar 2003 06:05:34 -0800 (PST) (envelope-from eischen@pcnet1.pcnet.com) Received: from pcnet1.pcnet.com (localhost [127.0.0.1]) by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h2RE5VBg004917; Thu, 27 Mar 2003 09:05:31 -0500 (EST) Received: from localhost (eischen@localhost)h2RE5UFB004914; Thu, 27 Mar 2003 09:05:30 -0500 (EST) Date: Thu, 27 Mar 2003 09:05:30 -0500 (EST) From: Daniel Eischen To: Jeff Roberson In-Reply-To: <20030327052055.X64602-100000@mail.chesapeake.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-25.4 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REPLY_WITH_QUOTES,USER_AGENT_PINE autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org Subject: Re: Threading code review please. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Mar 2003 14:05:35 -0000 On Thu, 27 Mar 2003, Jeff Roberson wrote: > I'm going to reply to the threads on 1:1 vs M:N tomorrow. I'd like to > request that people actually read the patch and give me feedback on the > code and not the approach. As was said by others, I think you can do what you want with the existing APIs. I don't see a need for adding more. -- Dan Eischen From owner-freebsd-arch@FreeBSD.ORG Thu Mar 27 07:04:00 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 36D6137B401 for ; Thu, 27 Mar 2003 07:04:00 -0800 (PST) Received: from fledge.watson.org (fledge.watson.org [204.156.12.50]) by mx1.FreeBSD.org (Postfix) with ESMTP id 68F1F43FAF for ; Thu, 27 Mar 2003 07:03:59 -0800 (PST) (envelope-from robert@fledge.watson.org) Received: from fledge.watson.org (fledge.pr.watson.org [192.0.2.3]) by fledge.watson.org (8.12.8/8.12.8) with SMTP id h2RF3ojK066871; Thu, 27 Mar 2003 10:03:50 -0500 (EST) (envelope-from robert@fledge.watson.org) Date: Thu, 27 Mar 2003 10:03:50 -0500 (EST) From: Robert Watson X-Sender: robert@fledge.watson.org To: "Daniel C. Sobral" In-Reply-To: <3E82F7EE.6080802@tcoip.com.br> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-23.5 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REPLY_WITH_QUOTES,USER_AGENT_PINE autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org Subject: Re: Threading code review please. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Mar 2003 15:04:01 -0000 On Thu, 27 Mar 2003, Daniel C. Sobral wrote: > Jeff Roberson wrote: > > I'm going to reply to the threads on 1:1 vs M:N tomorrow. I'd like to > > request that people actually read the patch and give me feedback on the > > code and not the approach. > > > > I have no outstanding behavior problems with mozilla. It actually runs > > much faster now with libthr in place of libc_r. On pages with LOTS of > > images it scrolls much smoother. I suspect its the amount of io waits. > > This is an SMP system you are talking about? Both 1:1 and M:N threading will improve performance of interactive applications if they spend any moderate amount of time I/O bound. I've noticed substantial performance differences between instances of openoffice linked for libc_r and openoffice linked for linuxthreads -- serializing I/O operations substantially impacts throughput and interactivty due to latency. Try running the Linux-linked mozilla, the FreeBSD libc_r mozilla, and the FreeBSD linuxthreads mozilla and see how they compare. Robert N M Watson FreeBSD Core Team, TrustedBSD Projects robert@fledge.watson.org Network Associates Laboratories From owner-freebsd-arch@FreeBSD.ORG Thu Mar 27 07:30:09 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id F258437B401 for ; Thu, 27 Mar 2003 07:30:08 -0800 (PST) Received: from magic.adaptec.com (magic-mail.adaptec.com [208.236.45.100]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4B4CC43FDD for ; Thu, 27 Mar 2003 07:30:08 -0800 (PST) (envelope-from scott_long@btc.adaptec.com) Received: from redfish.adaptec.com (redfish.adaptec.com [162.62.50.11]) by magic.adaptec.com (8.11.6/8.11.6) with ESMTP id h2RFSKl28011; Thu, 27 Mar 2003 07:28:20 -0800 Received: from btc.btc.adaptec.com (btc.btc.adaptec.com [10.100.0.52]) by redfish.adaptec.com (8.8.8+Sun/8.8.8) with ESMTP id HAA01187; Thu, 27 Mar 2003 07:28:59 -0800 (PST) Received: from btc.adaptec.com (hollin [10.100.253.56]) by btc.btc.adaptec.com (8.8.8+Sun/8.8.8) with ESMTP id IAA09159; Thu, 27 Mar 2003 08:28:51 -0700 (MST) Message-ID: <3E8318B3.2020801@btc.adaptec.com> Date: Thu, 27 Mar 2003 08:28:51 -0700 From: Scott Long User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.2.1) Gecko/20030206 X-Accept-Language: en-us, en MIME-Version: 1.0 To: "Daniel C. Sobral" References: <20030327020402.T64602-100000@mail.chesapeake.net> <3E82B795.DDB0C6A4@mindspring.com> <20030327150313.A8897@iclub.nsu.ru> <3E82F7A0.2020604@tcoip.com.br> In-Reply-To: <3E82F7A0.2020604@tcoip.com.br> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, hits=-31.9 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES, REPLY_WITH_QUOTES,USER_AGENT_MOZILLA_UA autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org Subject: Re: 1:1 threading. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Mar 2003 15:30:10 -0000 Daniel C. Sobral wrote: > Max Khon wrote: > >> hi, there! >> >> On Thu, Mar 27, 2003 at 12:34:29AM -0800, Terry Lambert wrote: >> >> >>>>> After reading your 1:1 threading code, I think you needn't >>>>> hack current KSE code to build your own 1:1 threading code. >>>>> Our code allow you to do this, actully, it's my earlier >>>>> idea to let 1:1 be implemented in our M:N code base, but never >>>>> had told this to julian or others. >>>> >>>> >>>> It was actually done outside of KSE on purpose. It keeps the API >>>> simpler >>>> and cleaner. It keeps the implementation cleaner. It keeps it out >>>> of the >>>> majority of the KSE code paths aside from thread_suspend and related >>>> code. >>>> >>>> I wanted something small and stable that built on top of KSE provided >>>> primitives but did not actually use the KSE apis. This makes it easier >>>> for KSE to continue growing and changing while the 1:1 code remains >>>> simple. It also removes some of the cost associated with doing KSE. >>> >>> >>> This isn't really a legitimate argument. >> >> >> >> Seconded. do you have numbers that clearly show that using Julian's >> approach >> leads to serious performance penalty? Using KSE APIs is not that >> difficult >> as far as I understand, so why we need to introduce more hacks? > > > As much as I'd prefer the 1:1 threading to use as much of the KSE code > as possible, Jeff's decision wasn't related to performance issues. > > What Jeff wanted to do is to _avoid_ using as much of the KSE API as > possible so his code wouldn't get in the way of that API, with two > obvious benefits: > > 1) Changes to that API (and there have been some in the past) won't > affect his 1:1 threading code and, thus, won't upset real applications > using that threading. > > 2) His 1:1 threading code won't slow down further KSE development nor > influence any changes to the KSE API. > > The reason I personally prefer otherwise is so that (1) above won't be > true. Ie, any bugs or performance issues introduced in the KSE code > *will* affect real applications, so that they can be detected and fixed. > Once 5-STABLE happens, users of 5.x can no longer be guinea pigs for KSE development. By keeping the 1:1 and M:N API's separate, KSE can progress in 6-CURRENT until it is proven while still allowing MFC's to 5-STABLE to happen without too much pain. Later on down the road when KSE matures, or when we decide that 1:1 should really just be a special case of M:N, we can look at addressing the above concerns and possibly MFC'ing the results back to 5-STABLE. But for now we need to allow for 5-STABLE to actually be usable and maintainable. Scott From owner-freebsd-arch@FreeBSD.ORG Thu Mar 27 08:12:15 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 390C137B404 for ; Thu, 27 Mar 2003 08:12:15 -0800 (PST) Received: from h132-197-179-27.gte.com (h132-197-179-27.gte.com [132.197.179.27]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0F2AF43F93 for ; Thu, 27 Mar 2003 08:12:14 -0800 (PST) (envelope-from ak03@gte.com) Received: from kanpc.gte.com (ak03@localhost [127.0.0.1]) h2RGCCAi036101; Thu, 27 Mar 2003 11:12:12 -0500 (EST) (envelope-from ak03@kanpc.gte.com) Received: (from ak03@localhost) by kanpc.gte.com (8.12.8/8.12.8/Submit) id h2RGCCXq036100; Thu, 27 Mar 2003 11:12:12 -0500 (EST) Date: Thu, 27 Mar 2003 11:12:12 -0500 From: Alexander Kabaev To: Max Khon Message-Id: <20030327111212.13029dbf.ak03@gte.com> In-Reply-To: <20030327150313.A8897@iclub.nsu.ru> References: <20030327020402.T64602-100000@mail.chesapeake.net> <3E82B795.DDB0C6A4@mindspring.com> <20030327150313.A8897@iclub.nsu.ru> Organization: Verizon Data Services X-Mailer: Sylpheed version 0.8.11claws42 (GTK+ 1.2.10; i386-portbld-freebsd5.0) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Spam-Status: No, hits=-25.4 required=5.0 tests=EMAIL_ATTRIBUTION,FROM_ENDS_IN_NUMS,IN_REP_TO, QUOTED_EMAIL_TEXT,REFERENCES,REPLY_WITH_QUOTES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org Subject: Re: 1:1 threading. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Mar 2003 16:12:16 -0000 On Thu, 27 Mar 2003 15:03:14 +0600 Max Khon wrote: > Seconded. do you have numbers that clearly show that using Julian's > approach leads to serious performance penalty? Using KSE APIs is not > that difficult as far as I understand, so why we need to introduce > more hacks? > Disagreed. Using KSE APIs _is_ difficult. I think one of the ideas behind 1:1 libth is to keep the code as simple as practical and entangling it too strongly with KSE contradicts with that goal. I certainly hope to see M:N threading project to come to completion in the future, but keep in mind that the architecture this complex will certainly take quite some time to mature and having a reliable fallback option is good. If anything it will provide KSE people with something to compare their implementation with. -- Alexander Kabaev From owner-freebsd-arch@FreeBSD.ORG Thu Mar 27 08:46:43 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id D2A7537B401 for ; Thu, 27 Mar 2003 08:46:43 -0800 (PST) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 32CA543F75 for ; Thu, 27 Mar 2003 08:46:43 -0800 (PST) (envelope-from eischen@pcnet1.pcnet.com) Received: from pcnet1.pcnet.com (localhost [127.0.0.1]) by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h2RGkcBg027581; Thu, 27 Mar 2003 11:46:38 -0500 (EST) Received: from localhost (eischen@localhost)h2RGkcDZ027578; Thu, 27 Mar 2003 11:46:38 -0500 (EST) Date: Thu, 27 Mar 2003 11:46:38 -0500 (EST) From: Daniel Eischen To: Scott Long In-Reply-To: <3E8318B3.2020801@btc.adaptec.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-25.3 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REPLY_WITH_QUOTES,USER_AGENT_PINE autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org Subject: Re: 1:1 threading. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Mar 2003 16:46:44 -0000 On Thu, 27 Mar 2003, Scott Long wrote: > Once 5-STABLE happens, users of 5.x can no longer be guinea pigs for KSE > development. By keeping the 1:1 and M:N API's separate, KSE can > progress in 6-CURRENT until it is proven while still allowing MFC's to > 5-STABLE to happen without too much pain. That's kind of silly; we have other ways to keep API/ABI compatability and have used this for all other syscalls. The KSE and thread mailboxes even have version numbers in them. > Later on down the road when > KSE matures, or when we decide that 1:1 should really just be a special > case of M:N, we can look at addressing the above concerns and possibly > MFC'ing the results back to 5-STABLE. But for now we need to allow for > 5-STABLE to actually be usable and maintainable. The libthr implementation of 1:1 is not what most consider 1:1 -- you don't get a separate quantum and priority for each thread. As such, this library is really no different than libkse. The only real difference is that the UTS chooses the next thread to run instead of the kernel. If you're going to add a bunch of code to both userland (in libthr) and the kernel just to get a working threading library, it seems much easier to just fix libkse so that it works for the single KSE/KSEG case. -- Dan Eischen From owner-freebsd-arch@FreeBSD.ORG Thu Mar 27 08:59:10 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 646AA37B401 for ; Thu, 27 Mar 2003 08:59:10 -0800 (PST) Received: from boromir.vpop.net (dns1.vpop.net [207.178.248.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id E057843F93 for ; Thu, 27 Mar 2003 08:59:09 -0800 (PST) (envelope-from mreimer@vpop.net) Received: from vpop.net (bilbo.vpop.net [65.103.33.41]) by boromir.vpop.net (Postfix) with ESMTP id 164083A6394 for ; Thu, 27 Mar 2003 08:59:08 -0800 (PST) Message-ID: <3E832E39.7040306@vpop.net> Date: Thu, 27 Mar 2003 11:00:41 -0600 From: Matthew Reimer User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3b) Gecko/20030220 X-Accept-Language: en-us, en MIME-Version: 1.0 To: arch@freebsd.org References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, hits=-31.9 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES, REPLY_WITH_QUOTES,USER_AGENT_MOZILLA_UA autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Subject: Re: Threading code review please. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Mar 2003 16:59:15 -0000 Robert Watson wrote: > > Both 1:1 and M:N threading will improve performance of interactive > applications if they spend any moderate amount of time I/O bound. I've > noticed substantial performance differences between instances of > openoffice linked for libc_r and openoffice linked for linuxthreads -- > serializing I/O operations substantially impacts throughput and > interactivty due to latency. Try running the Linux-linked mozilla, the > FreeBSD libc_r mozilla, and the FreeBSD linuxthreads mozilla and see how > they compare. Where can one find a FreeBSD linuxthreads mozilla? Matt From owner-freebsd-arch@FreeBSD.ORG Thu Mar 27 10:05:40 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3545837B401 for ; Thu, 27 Mar 2003 10:05:40 -0800 (PST) Received: from mail.tcoip.com.br (erato.tco.net.br [200.220.254.10]) by mx1.FreeBSD.org (Postfix) with ESMTP id B3EF643FBD for ; Thu, 27 Mar 2003 10:05:29 -0800 (PST) (envelope-from dcs@tcoip.com.br) Received: from tcoip.com.br ([10.0.2.6]) by mail.tcoip.com.br (8.11.6/8.11.6) with ESMTP id h2RI5H919221 for ; Thu, 27 Mar 2003 15:05:17 -0300 Message-ID: <3E833D5D.10200@tcoip.com.br> Date: Thu, 27 Mar 2003 15:05:17 -0300 From: "Daniel C. Sobral" User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3b) Gecko/20030326 X-Accept-Language: en-us, en, pt-br, ja MIME-Version: 1.0 To: arch@freebsd.org Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, hits=-18.9 required=5.0 tests=AWL,USER_AGENT_MOZILLA_UA autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Subject: 1-1 threading -- it seems to me... X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Mar 2003 18:05:42 -0000 Well, the RE seems to be firmly behind a production-level 1:1 threading implementation for 5.x, and leaving M:N development to 6.x, to be merged when and if things look promising. Jeff has developed a 1:1 threading which avoids using much of existing KSE API. KSE people would prefer to see a solution with much more integration. Gentlemen, it seems to me this is a classic case of coding speaking louder than words. -- Daniel C. Sobral (8-DCS) Gerencia de Operacoes Divisao de Comunicacao de Dados Coordenacao de Seguranca TCO Fones: 55-61-313-7654/Cel: 55-61-9618-0904 E-mail: Daniel.Capo@tco.net.br Daniel.Sobral@tcoip.com.br dcs@tcoip.com.br Outros: dcs@newsguy.com dcs@freebsd.org capo@notorious.bsdconspiracy.net ARMADILLO: To provide weapons to a Spanish pickle. From owner-freebsd-arch@FreeBSD.ORG Thu Mar 27 11:42:11 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id EAB7E37B401 for ; Thu, 27 Mar 2003 11:42:11 -0800 (PST) Received: from rootlabs.com (root.org [67.118.192.226]) by mx1.FreeBSD.org (Postfix) with SMTP id 1AFF743FBF for ; Thu, 27 Mar 2003 11:42:11 -0800 (PST) (envelope-from nate@rootlabs.com) Received: (qmail 30059 invoked by uid 1000); 27 Mar 2003 19:42:11 -0000 Date: Thu, 27 Mar 2003 11:42:11 -0800 (PST) From: Nate Lawson To: arch@freebsd.org, current@freebsd.org Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-13.5 required=5.0 tests=AWL,USER_AGENT_PINE autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Subject: 5.x locking plan X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Mar 2003 19:42:13 -0000 My curiousity has overcome my fear of the bikeshed so I'll ask the question that has been bugging me for a while. Why haven't we gone through the tree and created a lock for each spl and then converted every spl call into the appropriate mtx_lock call? At that point, we can mark large sections of the tree giant-free and then make the locking data-based (instead of code-based) one section at a time. This is the approach Solaris took. -Nate From owner-freebsd-arch@FreeBSD.ORG Thu Mar 27 11:51:08 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 07EF137B401 for ; Thu, 27 Mar 2003 11:51:08 -0800 (PST) Received: from mail.chesapeake.net (chesapeake.net [205.130.220.14]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2BF6043F75 for ; Thu, 27 Mar 2003 11:51:07 -0800 (PST) (envelope-from jroberson@chesapeake.net) Received: from localhost (jroberson@localhost) by mail.chesapeake.net (8.11.6/8.11.6) with ESMTP id h2RJovA89523; Thu, 27 Mar 2003 14:50:57 -0500 (EST) (envelope-from jroberson@chesapeake.net) Date: Thu, 27 Mar 2003 14:50:56 -0500 (EST) From: Jeff Roberson To: Daniel Eischen In-Reply-To: Message-ID: <20030327143259.I64602-100000@mail.chesapeake.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-17.2 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, QUOTE_TWICE_1,REPLY_WITH_QUOTES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: Scott Long Subject: Re: 1:1 threading. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Mar 2003 19:51:09 -0000 On Thu, 27 Mar 2003, Daniel Eischen wrote: > On Thu, 27 Mar 2003, Scott Long wrote: > > Once 5-STABLE happens, users of 5.x can no longer be guinea pigs for KSE > > development. By keeping the 1:1 and M:N API's separate, KSE can > > progress in 6-CURRENT until it is proven while still allowing MFC's to > > 5-STABLE to happen without too much pain. > > That's kind of silly; we have other ways to keep API/ABI > compatability and have used this for all other syscalls. > The KSE and thread mailboxes even have version numbers > in them. Which means they are likely to change. I do not want to develop on unstable APIs and unstable kernel code. kern_thr.c is 254 lines. I think we can handle a little duplication. I'm not sure why the objection is so strong. > > > Later on down the road when > > KSE matures, or when we decide that 1:1 should really just be a special > > case of M:N, we can look at addressing the above concerns and possibly > > MFC'ing the results back to 5-STABLE. But for now we need to allow for > > 5-STABLE to actually be usable and maintainable. > > The libthr implementation of 1:1 is not what most consider > 1:1 -- you don't get a separate quantum and priority for > each thread. As such, this library is really no different > than libkse. The only real difference is that the UTS > chooses the next thread to run instead of the kernel. > If you're going to add a bunch of code to both userland > (in libthr) and the kernel just to get a working threading > library, it seems much easier to just fix libkse so that > it works for the single KSE/KSEG case. It didn't seem much easier to me. This whole argument about kseg/kse/thread vs kse/thread can be solved very easily by allocating a ksegrp in kern_thr.c I estimate that would add another 10 lines of code. The ksegrp argument is questionable anyway. In both ULE and 4bds each KSE gets its own quantum. The KSEGRP holds the static priority and the dynamic user priority which is calculated based on the behavior of the whole process. This causes all threads in the process to be penalized for using cpu at the same rate as a single threaded process using an equivalent amount of cpu would be. The effects are less because each thread/kse is given as big of a quantum as each full process would. I'm not sure if this is a bug or a feature. In my opnion the ksegrp is not totally hashed out. I think you may forget that I have done a fair amount of work on schedulers in freebsd and I do understand the ramification of the decision that I made. I do not think this at all important to have correct prior to having real users using real threads. Cheers, Jeff From owner-freebsd-arch@FreeBSD.ORG Thu Mar 27 12:09:53 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4AD1537B401 for ; Thu, 27 Mar 2003 12:09:53 -0800 (PST) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 98B8843FA3 for ; Thu, 27 Mar 2003 12:09:52 -0800 (PST) (envelope-from eischen@pcnet1.pcnet.com) Received: from pcnet1.pcnet.com (localhost [127.0.0.1]) by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h2RK9nBg026712; Thu, 27 Mar 2003 15:09:49 -0500 (EST) Received: from localhost (eischen@localhost)h2RK9m7h026709; Thu, 27 Mar 2003 15:09:48 -0500 (EST) Date: Thu, 27 Mar 2003 15:09:48 -0500 (EST) From: Daniel Eischen To: Jeff Roberson In-Reply-To: <20030327143259.I64602-100000@mail.chesapeake.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-25.6 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, QUOTE_TWICE_1,REPLY_WITH_QUOTES,USER_AGENT_PINE autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: Scott Long Subject: Re: 1:1 threading. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Mar 2003 20:09:54 -0000 On Thu, 27 Mar 2003, Jeff Roberson wrote: > On Thu, 27 Mar 2003, Daniel Eischen wrote: > > > On Thu, 27 Mar 2003, Scott Long wrote: > > > Once 5-STABLE happens, users of 5.x can no longer be guinea pigs for KSE > > > development. By keeping the 1:1 and M:N API's separate, KSE can > > > progress in 6-CURRENT until it is proven while still allowing MFC's to > > > 5-STABLE to happen without too much pain. > > > > That's kind of silly; we have other ways to keep API/ABI > > compatability and have used this for all other syscalls. > > The KSE and thread mailboxes even have version numbers > > in them. > > Which means they are likely to change. I do not want to develop on > unstable APIs and unstable kernel code. kern_thr.c is 254 lines. I think > we can handle a little duplication. I'm not sure why the objection is so > strong. I don't see kse_create() changing since it takes a mailbox pointer as an argument and you can theoretically hang anything off the [versioned] mailbox. > > > Later on down the road when > > > KSE matures, or when we decide that 1:1 should really just be a special > > > case of M:N, we can look at addressing the above concerns and possibly > > > MFC'ing the results back to 5-STABLE. But for now we need to allow for > > > 5-STABLE to actually be usable and maintainable. > > > > The libthr implementation of 1:1 is not what most consider > > 1:1 -- you don't get a separate quantum and priority for > > each thread. As such, this library is really no different > > than libkse. The only real difference is that the UTS > > chooses the next thread to run instead of the kernel. > > If you're going to add a bunch of code to both userland > > (in libthr) and the kernel just to get a working threading > > library, it seems much easier to just fix libkse so that > > it works for the single KSE/KSEG case. > > It didn't seem much easier to me. For the single KSE/KSEG case it's almost there. There are just a couple of issues involving signals and some bugs. It's basically libc_r with the UTS swapped out for a KSE-one. I haven't spent any time on it because I wanted to come at it from a different angle; rewriting it with KSE/KSEGs in mind instead of just porting it. > This whole argument about kseg/kse/thread vs kse/thread can be solved very > easily by allocating a ksegrp in kern_thr.c I estimate that would add > another 10 lines of code. > > The ksegrp argument is questionable anyway. In both ULE and 4bds each KSE > gets its own quantum. The KSEGRP holds the static priority and the > dynamic user priority which is calculated based on the behavior of the > whole process. This causes all threads in the process to be penalized for > using cpu at the same rate as a single threaded process using an > equivalent amount of cpu would be. That wasn't my understanding of how KSE's were suppose to work. The orignal idea was that the quantum and priorities were suppose to be in the KSE Group. Yes, two KSEs could get scheduled simulataneously on different CPUs and consume 2 quantums, but the KSE Group would get charged for both causing them to run less often. Or something like that. In effect, over time 2 KSEs in a group would get no more processor time than a non-KSEd process (all other things being equal). I originally argued that it didn't make sense to have both a KSE group and a KSE; that they could be one and the same. I lost the argument :-) -- Dan Eischen From owner-freebsd-arch@FreeBSD.ORG Thu Mar 27 12:21:35 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4CC6337B401; Thu, 27 Mar 2003 12:21:35 -0800 (PST) Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id B522E43F93; Thu, 27 Mar 2003 12:21:34 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: from apollo.backplane.com (localhost [127.0.0.1]) by apollo.backplane.com (8.12.8/8.12.6) with ESMTP id h2RKLY31049841; Thu, 27 Mar 2003 12:21:34 -0800 (PST) (envelope-from dillon@apollo.backplane.com) Received: (from dillon@localhost) by apollo.backplane.com (8.12.8/8.12.6/Submit) id h2RKLYo8049840; Thu, 27 Mar 2003 12:21:34 -0800 (PST) Date: Thu, 27 Mar 2003 12:21:34 -0800 (PST) From: Matthew Dillon Message-Id: <200303272021.h2RKLYo8049840@apollo.backplane.com> To: Nate Lawson References: X-Spam-Status: No, hits=-7.1 required=5.0 tests=AWL,REFERENCES version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: current@freebsd.org Subject: Re: 5.x locking plan X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Mar 2003 20:21:37 -0000 :My curiousity has overcome my fear of the bikeshed so I'll ask the :question that has been bugging me for a while. Why haven't we gone :through the tree and created a lock for each spl and then converted every :spl call into the appropriate mtx_lock call? At that point, we can mark :large sections of the tree giant-free and then make the locking data-based :(instead of code-based) one section at a time. This is the approach :Solaris took. : :-Nate The problem is that SPLs are per-thread masks, and different sets of bits can be added or removed from the master mask in any order and at any time. There is no direct translation to a mutex (which cannot be obtained in random order, is not per-thread, and may result in preemption or a context switch). Most of the code locked under Giant assumes the single-threading of kernel threads regardless of the SPL. This 'inherent' single threading is one the reasons why the original code was so efficient. Since preemption can occur now under many new circumstances, including when 'normal' (non-spin) mutexes are used to replace prior uses of SPLs (which could not cause thread level preemption)... well, it basically means there is no easy way to remove Giant short of going through every bit of code and fixing it one subsystem at a time. Giant itself is a special case. It is not a normal mutex. Instead, the kernel very carefully saves and restores the state of Giant on a per-thread basis so programs don't 'need to know' whether Giant is being held or not and so Giant can be held in combination with another mutex without violating the basic 'only one mutex can be held when going to sleep' rule. -Matt Matthew Dillon From owner-freebsd-arch@FreeBSD.ORG Thu Mar 27 14:49:56 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8A00237B401 for ; Thu, 27 Mar 2003 14:49:56 -0800 (PST) Received: from duke.cs.duke.edu (duke.cs.duke.edu [152.3.140.1]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4FD5843F75 for ; Thu, 27 Mar 2003 14:49:55 -0800 (PST) (envelope-from gallatin@cs.duke.edu) Received: from grasshopper.cs.duke.edu (grasshopper.cs.duke.edu [152.3.145.30]) by duke.cs.duke.edu (8.12.8/8.12.8) with ESMTP id h2RMnsRv011567 (version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO); Thu, 27 Mar 2003 17:49:54 -0500 (EST) Received: (from gallatin@localhost) by grasshopper.cs.duke.edu (8.11.6/8.9.1) id h2RMnnY18612; Thu, 27 Mar 2003 17:49:49 -0500 (EST) (envelope-from gallatin@cs.duke.edu) From: Andrew Gallatin MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Message-ID: <16003.32780.950519.931661@grasshopper.cs.duke.edu> Date: Thu, 27 Mar 2003 17:49:48 -0500 (EST) To: Daniel Eischen In-Reply-To: References: <20030327143259.I64602-100000@mail.chesapeake.net> X-Mailer: VM 6.75 under 21.1 (patch 12) "Channel Islands" XEmacs Lucid X-Spam-Status: No, hits=-22.9 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,REFERENCES,REPLY_WITH_QUOTES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org Subject: Re: 1:1 threading. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list Reply-To: arch@freebsd.org List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Mar 2003 22:49:57 -0000 Daniel Eischen writes: > On Thu, 27 Mar 2003, Jeff Roberson wrote: > > > On Thu, 27 Mar 2003, Daniel Eischen wrote: > > > > Which means they are likely to change. I do not want to develop on > > unstable APIs and unstable kernel code. kern_thr.c is 254 lines. I think > > we can handle a little duplication. I'm not sure why the objection is so > > strong. > > I don't see kse_create() changing since it takes a > mailbox pointer as an argument and you can theoretically > hang anything off the [versioned] mailbox. According to the 5-stable roadmap at http://www.freebsd.org/doc/en/articles/5-roadmap/major-issues.html KSE kernel and userland components must be functionality complete by June 2003 in order to be included in the RELENG_5 branch. For security and stability reasons, if KSE cannot be finished in time then, by default, all KSE-specific syscalls should be modified to return ENOSYS and all other KSE-specific interfaces disabled. By not depending on KSE infastructure, the 1:1 can still be available in 5.1 in exactly the same fore regardless of whether or not KSE makes the June deadline or not. Drew From owner-freebsd-arch@FreeBSD.ORG Thu Mar 27 15:19:46 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3B92E37B401 for ; Thu, 27 Mar 2003 15:19:46 -0800 (PST) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7D97E4400E for ; Thu, 27 Mar 2003 15:19:45 -0800 (PST) (envelope-from eischen@pcnet1.pcnet.com) Received: from pcnet1.pcnet.com (localhost [127.0.0.1]) by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h2RNJiBg024884 for ; Thu, 27 Mar 2003 18:19:44 -0500 (EST) Received: from localhost (eischen@localhost)h2RNJiha024881 for ; Thu, 27 Mar 2003 18:19:44 -0500 (EST) Date: Thu, 27 Mar 2003 18:19:44 -0500 (EST) From: Daniel Eischen To: arch@freebsd.org In-Reply-To: <16003.32780.950519.931661@grasshopper.cs.duke.edu> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-25.7 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, QUOTE_TWICE_1,REPLY_WITH_QUOTES,USER_AGENT_PINE autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Subject: Re: 1:1 threading. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Mar 2003 23:19:47 -0000 On Thu, 27 Mar 2003, Andrew Gallatin wrote: > > Daniel Eischen writes: > > On Thu, 27 Mar 2003, Jeff Roberson wrote: > > > > > On Thu, 27 Mar 2003, Daniel Eischen wrote: > > > > > > Which means they are likely to change. I do not want to develop on > > > unstable APIs and unstable kernel code. kern_thr.c is 254 lines. I think > > > we can handle a little duplication. I'm not sure why the objection is so > > > strong. > > > > I don't see kse_create() changing since it takes a > > mailbox pointer as an argument and you can theoretically > > hang anything off the [versioned] mailbox. > > According to the 5-stable roadmap at > http://www.freebsd.org/doc/en/articles/5-roadmap/major-issues.html > > KSE kernel and userland components must be functionality complete > by June 2003 in order to be included in the RELENG_5 branch. For > security and stability reasons, if KSE cannot be finished in time > then, by default, all KSE-specific syscalls should be modified to > return ENOSYS and all other KSE-specific interfaces disabled. This sounds like an argument to use the KSE syscalls :-) If libthr is based on KSE and it works, then you've accomplished the above. -- Dan Eischen From owner-freebsd-arch@FreeBSD.ORG Thu Mar 27 15:49:05 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3E8A237B401 for ; Thu, 27 Mar 2003 15:49:05 -0800 (PST) Received: from magic.adaptec.com (magic-mail.adaptec.com [208.236.45.100]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9ACCD43FBD for ; Thu, 27 Mar 2003 15:49:04 -0800 (PST) (envelope-from scott_long@btc.adaptec.com) Received: from redfish.adaptec.com (redfish.adaptec.com [162.62.50.11]) by magic.adaptec.com (8.11.6/8.11.6) with ESMTP id h2RNm6l00449; Thu, 27 Mar 2003 15:48:06 -0800 Received: from btc.btc.adaptec.com (btc.btc.adaptec.com [10.100.0.52]) by redfish.adaptec.com (8.8.8+Sun/8.8.8) with ESMTP id PAA25978; Thu, 27 Mar 2003 15:48:57 -0800 (PST) Received: from btc.adaptec.com (hollin [10.100.253.56]) by btc.btc.adaptec.com (8.8.8+Sun/8.8.8) with ESMTP id QAA09390; Thu, 27 Mar 2003 16:48:54 -0700 (MST) Message-ID: <3E838D57.4050305@btc.adaptec.com> Date: Thu, 27 Mar 2003 16:46:31 -0700 From: Scott Long User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.2b) Gecko/20021216 X-Accept-Language: en-us, en MIME-Version: 1.0 To: Daniel Eischen References: In-Reply-To: Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, hits=-31.9 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REFERENCES,REPLY_WITH_QUOTES,USER_AGENT_MOZILLA_UA autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org Subject: Re: 1:1 threading. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Mar 2003 23:49:06 -0000 Daniel Eischen wrote: > On Thu, 27 Mar 2003, Andrew Gallatin wrote: > > > >Daniel Eischen writes: > > > On Thu, 27 Mar 2003, Jeff Roberson wrote: > > > > > > > On Thu, 27 Mar 2003, Daniel Eischen wrote: > > > > > > > > Which means they are likely to change. I do not want to develop on > > > > unstable APIs and unstable kernel code. kern_thr.c is 254 > lines. I think > > > > we can handle a little duplication. I'm not sure why the > objection is so > > > > strong. > > > > > > I don't see kse_create() changing since it takes a > > > mailbox pointer as an argument and you can theoretically > > > hang anything off the [versioned] mailbox. > > > >According to the 5-stable roadmap at > > http://www.freebsd.org/doc/en/articles/5-roadmap/major-issues.html > > > > KSE kernel and userland components must be functionality complete > > by June 2003 in order to be included in the RELENG_5 branch. For > > security and stability reasons, if KSE cannot be finished in time > > then, by default, all KSE-specific syscalls should be modified to > > return ENOSYS and all other KSE-specific interfaces disabled. > > > This sounds like an argument to use the KSE syscalls :-) > If libthr is based on KSE and it works, then you've accomplished > the above. > The 5-stable roadmap document was written before, and without any knowledge of, Jeff's work. The purpose of the above paragraph was to define a deadline for the threading work to be done. With the advent of libthr, there is no longer pressure for the KSE kernel and userland components to be complete for RELENG_5, so the quoted paragraph can be relaxed a little bit. I'm not sure if I would feel comfortable shipping a release with the KSE syscalls turned on but no libkse to interact with them, but that can be discussed further. The bigger picture is that libthr is at the point now that I wanted libkse to be at in 3 months. Some may be grumpy and feel that libthr has subverted libkse, however I'd like to remind everyone that Jon and Jeff were under no contractual obligation to work on libkse. Their work does, however, solve the pressing need for a working threading library for 5-STABLE. If someone wants to pick up the torch for libkse and finish it by the June deadline, I would be thrilled. Otherwise, I'm happy with libthr for 5-STABLE, and I encourage people to finish libkse and M:N for 6.0. As for the arguments of libthr creating new syscalls, I'll point out that KSE and libkse have not yet run any real-world applications, and therefore are hard to even consider as 'alpha' quality. It's foolish to assume that the KSE interfaces will not change as KSE matures. If libthr is tied to the current interface, it creates a maintenance nightmare as the interface changes, especially for the RELENG_5 branch. I realize that it also creates baggage once libkse is the default, but that can be solved by deprecating the libthr interfaces for 6.x and removing them for 7.x. It's a small price to pay for such a huge benefit. Scott From owner-freebsd-arch@FreeBSD.ORG Thu Mar 27 17:19:57 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id A2F8737B407 for ; Thu, 27 Mar 2003 17:19:57 -0800 (PST) Received: from exchhz01.viatech.com.cn (ip-167-164-97-218.anlai.com [218.97.164.167]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9CC3343FA3 for ; Thu, 27 Mar 2003 17:19:55 -0800 (PST) (envelope-from davidxu@freebsd.org) Received: from davidw2k (ip-240-1-168-192.rev.dyxnet.com [192.168.1.240]) by exchhz01.viatech.com.cn with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2650.21) id HLDQ3B6M; Fri, 28 Mar 2003 09:06:22 +0800 Message-ID: <005201c2f4c8$517da320$f001a8c0@davidw2k> From: "David Xu" To: "Jeff Roberson" , "Daniel Eischen" References: <20030327143259.I64602-100000@mail.chesapeake.net> Date: Fri, 28 Mar 2003 09:21:11 +0800 MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable X-Priority: 3 X-MSMail-Priority: Normal X-Mailer: Microsoft Outlook Express 5.50.4807.1700 X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4910.0300 X-Spam-Status: No, hits=-9.0 required=5.0 tests=AWL,QUOTED_EMAIL_TEXT,REFERENCES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: Scott Long Subject: Re: 1:1 threading. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Mar 2003 01:19:59 -0000 ----- Original Message -----=20 From: "Jeff Roberson" To: "Daniel Eischen" Cc: ; "Scott Long" Sent: Friday, March 28, 2003 3:50 AM Subject: Re: 1:1 threading. > On Thu, 27 Mar 2003, Daniel Eischen wrote: >=20 > > On Thu, 27 Mar 2003, Scott Long wrote: > > > Once 5-STABLE happens, users of 5.x can no longer be guinea pigs = for KSE > > > development. By keeping the 1:1 and M:N API's separate, KSE can > > > progress in 6-CURRENT until it is proven while still allowing = MFC's to > > > 5-STABLE to happen without too much pain. > > > > That's kind of silly; we have other ways to keep API/ABI > > compatability and have used this for all other syscalls. > > The KSE and thread mailboxes even have version numbers > > in them. >=20 > Which means they are likely to change. I do not want to develop on > unstable APIs and unstable kernel code. kern_thr.c is 254 lines. I = think > we can handle a little duplication. I'm not sure why the objection is = so > strong. >=20 > > > > > Later on down the road when > > > KSE matures, or when we decide that 1:1 should really just be a = special > > > case of M:N, we can look at addressing the above concerns and = possibly > > > MFC'ing the results back to 5-STABLE. But for now we need to = allow for > > > 5-STABLE to actually be usable and maintainable. > > > > The libthr implementation of 1:1 is not what most consider > > 1:1 -- you don't get a separate quantum and priority for > > each thread. As such, this library is really no different > > than libkse. The only real difference is that the UTS > > chooses the next thread to run instead of the kernel. > > If you're going to add a bunch of code to both userland > > (in libthr) and the kernel just to get a working threading > > library, it seems much easier to just fix libkse so that > > it works for the single KSE/KSEG case. >=20 > It didn't seem much easier to me. >=20 > This whole argument about kseg/kse/thread vs kse/thread can be solved = very > easily by allocating a ksegrp in kern_thr.c I estimate that would add > another 10 lines of code. >=20 > The ksegrp argument is questionable anyway. In both ULE and 4bds each = KSE > gets its own quantum. The KSEGRP holds the static priority and the > dynamic user priority which is calculated based on the behavior of the > whole process. This causes all threads in the process to be penalized = for > using cpu at the same rate as a single threaded process using an > equivalent amount of cpu would be. >=20 > The effects are less because each thread/kse is given as big of a = quantum > as each full process would. I'm not sure if this is a bug or a = feature. >=20 > In my opnion the ksegrp is not totally hashed out. I think you may = forget > that I have done a fair amount of work on schedulers in freebsd and I = do > understand the ramification of the decision that I made. I do not = think > this at all important to have correct prior to having real users using > real threads. >=20 do you think that a multithreaded process should use more CPU time then a single thread process, so threaded process should have higher priority and block other single thread processes out? AFAIK, threading is not=20 designed for this, you may misunderstand what threading is designed for. > Cheers, > Jeff >=20 > _______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to = "freebsd-arch-unsubscribe@freebsd.org" From owner-freebsd-arch@FreeBSD.ORG Thu Mar 27 18:12:17 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 393B137B404; Thu, 27 Mar 2003 18:12:17 -0800 (PST) Received: from mail01.stbernard.com (mail01.stbernard.com [64.154.93.162]) by mx1.FreeBSD.org (Postfix) with ESMTP id 20C5643FBF; Thu, 27 Mar 2003 18:12:15 -0800 (PST) (envelope-from wes@softweyr.com) Received: from salty.rapid.stbernard.com ([192.168.4.61]) by mail01.stbernard.com with Microsoft SMTPSVC(5.0.2195.5329); Thu, 27 Mar 2003 18:12:14 -0800 From: Wes Peters Organization: Softweyr.com To: "Poul-Henning Kamp" , Marcel Moolenaar Date: Thu, 27 Mar 2003 18:12:13 -0800 User-Agent: KMail/1.5 References: <14594.1048582113@critter.freebsd.dk> In-Reply-To: <14594.1048582113@critter.freebsd.dk> X-Habeas-SWE-1: winter into spring X-Habeas-SWE-2: brightly anticipated X-Habeas-SWE-3: like Habeas SWE (tm) X-Habeas-SWE-4: Copyright 2002 Habeas (tm) X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this X-Habeas-SWE-6: email in exchange for a license for this Habeas X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this X-Habeas-SWE-9: mark in spam to . MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200303271812.13745.wes@softweyr.com> X-OriginalArrivalTime: 28 Mar 2003 02:12:14.0027 (UTC) FILETIME=[72BE31B0:01C2F4CF] X-Spam-Status: No, hits=-25.4 required=5.0 tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, RCVD_IN_UNCONFIRMED_DSBL,REFERENCES,REPLY_WITH_QUOTES, USER_AGENT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: David Schultz cc: freebsd-arch@FreeBSD.ORG Subject: Re: Patch to protect process from pageout killing X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Mar 2003 02:12:20 -0000 On Tuesday 25 March 2003 00:48, Poul-Henning Kamp wrote: > In message <20030325084247.GA17195@dhcp01.pn.xcllnt.net>, Marcel > Moolenaar writes: > >> To tackle them from behind: > >> > >> Wes has a proposal for #3 which is a per-process flag which says > >> "I'm sacred". I think that is a sound principle since that is > >> usually exactly what people want: Do Not Kill This Process. > >> > >> Certain processes already enjoy special protection, pid==1 most > >> notably, this would just be a way to make the same protection > >> available to other processes. I'm not happy about using the > >> resourcelimit code for booleans, and I don't think the flag > >> should be inherited, but otherwise I'm for the idea. > > > >JFYI: On ia64 there are 12 bits in the ELF header reserved for OS > >specific flags. A very natural way to flag a process as being sacred > >is by flagging the ELF executable. You could use brandelf for that. > > Many years ago, we had a local hack so you could specify the nice(2) > that a given program would be executed at (relative to the parent > process) in the a.out file. This allowed us to keep games open > during the day because we could argue that running at -20 they used > only resources not otherwise claimed. > > Other operating systems have much more expressive facilities for > putting attributes on a program. In some cases this is being held > stronly against them. You could easily implement this with an ELF executable by adding "note" section(s) containing the attributes in a format understood by your loader or linker. A hackup of brandelf could modify the binaries in well-specified ways. You could also do this with extended attributes on the executable/ library files. > I think, but am not sure, that we can now introduce practically any > policy we might like with MAC. (NB: deliberate rwatson-trigger) > > How the flags/attributes gets to be set on the wanted subset of > processes is by no means uninteresting, but until something pays > attention to the flag... Working on it. -- "Where am I, and what am I doing in this handbasket?" Wes Peters wes@softweyr.com From owner-freebsd-arch@FreeBSD.ORG Thu Mar 27 22:24:57 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B1FC237B408 for ; Thu, 27 Mar 2003 22:24:57 -0800 (PST) Received: from sccrmhc02.attbi.com (sccrmhc02.attbi.com [204.127.202.62]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0181543F75 for ; Thu, 27 Mar 2003 22:24:57 -0800 (PST) (envelope-from julian@elischer.org) Received: from interjet.elischer.org (12-232-168-4.client.attbi.com[12.232.168.4]) by sccrmhc02.attbi.com (sccrmhc02) with ESMTP id <2003032806245500200jc503e>; Fri, 28 Mar 2003 06:24:56 +0000 Received: from localhost (localhost.elischer.org [127.0.0.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id WAA65920; Thu, 27 Mar 2003 22:24:54 -0800 (PST) Date: Thu, 27 Mar 2003 22:24:51 -0800 (PST) From: Julian Elischer To: Jeff Roberson In-Reply-To: <20030327143259.I64602-100000@mail.chesapeake.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-25.0 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, QUOTE_TWICE_1,RCVD_IN_UNCONFIRMED_DSBL,REPLY_WITH_QUOTES, USER_AGENT_PINE autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: Scott Long cc: Daniel Eischen Subject: Re: 1:1 threading. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Mar 2003 06:25:00 -0000 On Thu, 27 Mar 2003, Jeff Roberson wrote: > > The effects are less because each thread/kse is given as big of a quantum > as each full process would. I'm not sure if this is a bug or a feature. > It's neither.. it's not what happens. More accuratly, it's only part of the story. Firstly it's 'standin' code.. but it exhibis some of the desired bahaviour. Yes, each KSE gets a quantum, but the next thread to run in that KSE is forced to go to the end of the queue. Effecively, this forces the process to allow other processes with enough priority to get CPU. This is a 'quick' solution to stopping a process with a lot of threads from swamping the system. The plan is to put in place a more comprehensive solution when time allows. Only one thread for each KSE can actually be on the run queue at a time.. yes it could run for the entire quantum, but the system will not let it put 10000 of these back to back unless there are no other competing threads. There is room there for a graduate student to do a project adding code to allow the KSEGRP to allow the rest of it's quantum to be passed on to other equivalent priority threads in the same group. This would be, as dicussed several times, in the form of code in choosethread() that would first check for internal threads if there was quantum left, before resorting to external threads to run. (The trick is to get the right balance) From owner-freebsd-arch@FreeBSD.ORG Thu Mar 27 22:29:02 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B72B737B401 for ; Thu, 27 Mar 2003 22:29:02 -0800 (PST) Received: from park.rambler.ru (park.rambler.ru [81.19.64.101]) by mx1.FreeBSD.org (Postfix) with ESMTP id E16B543F85 for ; Thu, 27 Mar 2003 22:29:00 -0800 (PST) (envelope-from is@rambler-co.ru) Received: from is.park.rambler.ru (is.park.rambler.ru [81.19.64.102]) by park.rambler.ru (8.12.6/8.12.6) with ESMTP id h2S6SsmF022454; Fri, 28 Mar 2003 09:28:54 +0300 (MSK) Date: Fri, 28 Mar 2003 09:28:54 +0300 (MSK) From: Igor Sysoev X-Sender: is@is To: freebsd-arch@freebsd.org In-Reply-To: <20030327143259.I64602-100000@mail.chesapeake.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-20.6 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,USER_AGENT_PINE autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) Subject: Re: 1:1 threading. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Mar 2003 06:29:03 -0000 On Thu, 27 Mar 2003, Jeff Roberson wrote: >The ksegrp argument is questionable anyway. In both ULE and 4bds each KSE >gets its own quantum. The KSEGRP holds the static priority and the >dynamic user priority which is calculated based on the behavior of the >whole process. This causes all threads in the process to be penalized for >using cpu at the same rate as a single threaded process using an >equivalent amount of cpu would be. Why should multi-threaded process get more CPU time then single threaded if they both have the same base priority ? CPU time should be given based on a process priority not a number of its threads. >The effects are less because each thread/kse is given as big of a quantum >as each full process would. I'm not sure if this is a bug or a feature. It's not a bug or a feature. It's the right thing. >In my opnion the ksegrp is not totally hashed out. I think you may forget >that I have done a fair amount of work on schedulers in freebsd and I do >understand the ramification of the decision that I made. I do not think >this at all important to have correct prior to having real users using >real threads. As I understand KSEGRP was designed with M:N model in mind. If you have M threads mapped to N KSEs then all these KSEs should have the same priority. The second KSEGRP capability is to limit a number of KSEs to a number of CPUs. It's usefull for M:N model because KSE is almost never (I believe) blocked and always ready to run (if not parked). For 1:1 model KSEGRP is not theoreticaly needed because you can set priority (theoreticaly) directly in KSE and you do not need to limit a number of KSEs to a number of CPUs. If the thread blocks then its KSE blocks too. But I think for design completeness you should use KSEGRP to store KSE's priority in 1:1 model. Igor Syseov http://sysoev.ru/en/ From owner-freebsd-arch@FreeBSD.ORG Fri Mar 28 03:20:48 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 24F1537B401; Fri, 28 Mar 2003 03:20:47 -0800 (PST) Received: from mail.tcoip.com.br (erato.tco.net.br [200.220.254.10]) by mx1.FreeBSD.org (Postfix) with ESMTP id 5E96743F3F; Fri, 28 Mar 2003 03:20:44 -0800 (PST) (envelope-from dcs@tcoip.com.br) Received: from tcoip.com.br ([10.0.2.6]) by mail.tcoip.com.br (8.11.6/8.11.6) with ESMTP id h2SBKf906269; Fri, 28 Mar 2003 08:20:41 -0300 Message-ID: <3E843009.2060104@tcoip.com.br> Date: Fri, 28 Mar 2003 08:20:41 -0300 From: "Daniel C. Sobral" User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3b) Gecko/20030326 X-Accept-Language: en-us, en, pt-br, ja MIME-Version: 1.0 To: David Xu References: <20030327143259.I64602-100000@mail.chesapeake.net> <005201c2f4c8$517da320$f001a8c0@davidw2k> In-Reply-To: <005201c2f4c8$517da320$f001a8c0@davidw2k> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-Spam-Status: No, hits=-28.6 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REFERENCES,REPLY_WITH_QUOTES,USER_AGENT_MOZILLA_UA autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: Scott Long Subject: Re: 1:1 threading. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Mar 2003 11:20:49 -0000 David Xu wrote: > > do you think that a multithreaded process should use more CPU time then > a single thread process, so threaded process should have higher priority > and block other single thread processes out? AFAIK, threading is not > designed for this, you may misunderstand what threading is designed for. Threading might not have been originally designed for this, but a lot of people use it this way, a lot of people *want* it this way, and POSIX specifically mandates that this way be available. So let's drop that issue, please. -- Daniel C. Sobral (8-DCS) Gerencia de Operacoes Divisao de Comunicacao de Dados Coordenacao de Seguranca TCO Fones: 55-61-313-7654/Cel: 55-61-9618-0904 E-mail: Daniel.Capo@tco.net.br Daniel.Sobral@tcoip.com.br dcs@tcoip.com.br Outros: dcs@newsguy.com dcs@freebsd.org capo@notorious.bsdconspiracy.net After an instrument has been assembled, extra components will be found on the bench. From owner-freebsd-arch@FreeBSD.ORG Fri Mar 28 09:10:38 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9C99937B401; Fri, 28 Mar 2003 09:10:38 -0800 (PST) Received: from smtp-relay.omnis.com (smtp-relay.omnis.com [216.239.128.27]) by mx1.FreeBSD.org (Postfix) with ESMTP id D15DE43F93; Fri, 28 Mar 2003 09:10:37 -0800 (PST) (envelope-from wes@softweyr.com) Received: from softweyr.homeunix.net (66-91-236-204.san.rr.com [66.91.236.204]) by smtp-relay.omnis.com (Postfix) with ESMTP id 0C6A843B68; Fri, 28 Mar 2003 09:10:36 -0800 (PST) From: Wes Peters Organization: Softweyr To: "Poul-Henning Kamp" , David Schultz Date: Fri, 28 Mar 2003 09:10:32 -0800 User-Agent: KMail/1.5 References: <14382.1048580753@critter.freebsd.dk> In-Reply-To: <14382.1048580753@critter.freebsd.dk> MIME-Version: 1.0 Content-Type: Multipart/Mixed; boundary="Boundary-00=_IIIh+WfRJfXpokV" Message-Id: <200303280910.32307.wes@softweyr.com> X-Spam-Status: No, hits=-29.1 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,PATCH_UNIFIED_DIFF, QUOTED_EMAIL_TEXT,REFERENCES,REPLY_WITH_QUOTES,USER_AGENT autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) X-Content-Filtered-By: Mailman/MimeDel 2.1.1 cc: freebsd-arch@FreeBSD.ORG Subject: Re: Patch to protect process from pageout killing X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Mar 2003 17:11:09 -0000 --Boundary-00=_IIIh+WfRJfXpokV Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 7bit Content-Disposition: inline On Tuesday 25 March 2003 00:25, Poul-Henning Kamp wrote: > > As I see it, there is a need for several mechanisms: > > 1. A mechanism to export to userland enough information about the > current RAM availability, so that phkmalloc and application > specific code can make intelligent choices before things go bad. > > 2. A mechanism to alert userland to the fact that things _have_ gone > bad. > > 3. A mechanism to influence the "Who do we kill ?" decision once > things have gone from bad to worse. > > To tackle them from behind: > > Wes has a proposal for #3 which is a per-process flag which says > "I'm sacred". I think that is a sound principle since that is > usually exactly what people want: Do Not Kill This Process. > > Certain processes already enjoy special protection, pid==1 most > notably, this would just be a way to make the same protection > available to other processes. I'm not happy about using the > resourcelimit code for booleans, and I don't think the flag > should be inherited, but otherwise I'm for the idea. I've reworked my patch to use the madvise(2) syscall, like the original 4.x patch did. I've even documented it, in a man page of all places. Please see attached patch. If nobody objects, I'll commit sometime this weekend. > We have the SIGDANGER proposal for #2, but I think we need to have > two severities: "Out of RAM" and "Out of VM". A program like > fsck would start to recycle cached sectors once we're out of RAM. I'll work with Garance to create a proposal, some pseudocode, something like a design. Then we can bikeshed that. Mike Murphy is helping silently at work, letting me bounce ideas off him and look at the man pages on his AIX machine. > But I have not seen anybody come up with a good proposal for > #1, and that is where the main benefit would be derived: It would > allow processes to be good citizens and adjust to the present > situation. Added to think-about queue... -- Where am I, and what am I doing in this handbasket? Wes Peters wes@softweyr.com --Boundary-00=_IIIh+WfRJfXpokV-- From owner-freebsd-arch@FreeBSD.ORG Fri Mar 28 11:28:23 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 400C437B408; Fri, 28 Mar 2003 11:28:23 -0800 (PST) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 731C243F85; Fri, 28 Mar 2003 11:28:22 -0800 (PST) (envelope-from eischen@pcnet1.pcnet.com) Received: from pcnet1.pcnet.com (localhost [127.0.0.1]) by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h2SJSIBg006924; Fri, 28 Mar 2003 14:28:18 -0500 (EST) Received: from localhost (eischen@localhost)h2SJSHk2006921; Fri, 28 Mar 2003 14:28:17 -0500 (EST) Date: Fri, 28 Mar 2003 14:28:17 -0500 (EST) From: Daniel Eischen To: "Daniel C. Sobral" In-Reply-To: <3E843009.2060104@tcoip.com.br> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-25.4 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REPLY_WITH_QUOTES,USER_AGENT_PINE autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: Scott Long Subject: Re: 1:1 threading. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Mar 2003 19:28:26 -0000 On Fri, 28 Mar 2003, Daniel C. Sobral wrote: > David Xu wrote: > > > > do you think that a multithreaded process should use more CPU time then > > a single thread process, so threaded process should have higher priority > > and block other single thread processes out? AFAIK, threading is not > > designed for this, you may misunderstand what threading is designed for. > > Threading might not have been originally designed for this, but a lot of > people use it this way, a lot of people *want* it this way, and POSIX > specifically mandates that this way be available. It is available through pthread_attr_setscope(). There's some confusion over this and the way libthr is implemented. KSE's within the same KSE Group were not designed to give more CPU time than a normal unthreaded/single KSE'd process. Unless this has been changed in the kernel somehow, the use of multiple KSEs by libthr or libkse (in a single KSEG) will not get any more CPU time than a non-threaded program. There was some debate over this, but multiple KSEs within a KSEG were _not_ suppose to allow this. You are suppose to create a new KSEG in order to get this behavior. -- Dan Eischen From owner-freebsd-arch@FreeBSD.ORG Fri Mar 28 12:22:51 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 5525D37B401 for ; Fri, 28 Mar 2003 12:22:51 -0800 (PST) Received: from mail.chesapeake.net (chesapeake.net [205.130.220.14]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6170B43F93 for ; Fri, 28 Mar 2003 12:22:50 -0800 (PST) (envelope-from jroberson@chesapeake.net) Received: from localhost (jroberson@localhost) by mail.chesapeake.net (8.11.6/8.11.6) with ESMTP id h2SKMh235468; Fri, 28 Mar 2003 15:22:43 -0500 (EST) (envelope-from jroberson@chesapeake.net) Date: Fri, 28 Mar 2003 15:22:43 -0500 (EST) From: Jeff Roberson To: Daniel Eischen In-Reply-To: Message-ID: <20030328151526.S64602-100000@mail.chesapeake.net> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-17.1 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REPLY_WITH_QUOTES autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: Scott Long Subject: Re: 1:1 threading. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Mar 2003 20:22:52 -0000 On Fri, 28 Mar 2003, Daniel Eischen wrote: > On Fri, 28 Mar 2003, Daniel C. Sobral wrote: > > > David Xu wrote: > > > > > > do you think that a multithreaded process should use more CPU time then > > > a single thread process, so threaded process should have higher priority > > > and block other single thread processes out? AFAIK, threading is not > > > designed for this, you may misunderstand what threading is designed for. > > > > Threading might not have been originally designed for this, but a lot of > > people use it this way, a lot of people *want* it this way, and POSIX > > specifically mandates that this way be available. > > It is available through pthread_attr_setscope(). > > There's some confusion over this and the way libthr is implemented. > KSE's within the same KSE Group were not designed to give more CPU > time than a normal unthreaded/single KSE'd process. Unless this > has been changed in the kernel somehow, the use of multiple KSEs > by libthr or libkse (in a single KSEG) will not get any more CPU > time than a non-threaded program. There was some debate over > this, but multiple KSEs within a KSEG were _not_ suppose to allow > this. You are suppose to create a new KSEG in order to get > this behavior. > This is not how it is implemented in either scheduler that we currently have. I'm not saying which way is more or less correct because I think you could argue either way. We can not entirely correctly implement SCOPE_PROCESSES threads right now anyway. This being said.. It is a property of the thr system calls and not libthr. I have a flags field in thr_create() that could be used to indicate which scope the thread should contend in. Cheers, Jeff From owner-freebsd-arch@FreeBSD.ORG Fri Mar 28 12:34:49 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id CED4837B404 for ; Fri, 28 Mar 2003 12:34:49 -0800 (PST) Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1A81B43FCB for ; Fri, 28 Mar 2003 12:34:49 -0800 (PST) (envelope-from eischen@pcnet1.pcnet.com) Received: from pcnet1.pcnet.com (localhost [127.0.0.1]) by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h2SKYiBg018345; Fri, 28 Mar 2003 15:34:44 -0500 (EST) Received: from localhost (eischen@localhost)h2SKYiKC018341; Fri, 28 Mar 2003 15:34:44 -0500 (EST) Date: Fri, 28 Mar 2003 15:34:44 -0500 (EST) From: Daniel Eischen To: Jeff Roberson In-Reply-To: <20030328151526.S64602-100000@mail.chesapeake.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-25.4 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, REPLY_WITH_QUOTES,USER_AGENT_PINE autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: arch@freebsd.org cc: Scott Long Subject: Re: 1:1 threading. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Mar 2003 20:34:51 -0000 On Fri, 28 Mar 2003, Jeff Roberson wrote: > On Fri, 28 Mar 2003, Daniel Eischen wrote: > > > On Fri, 28 Mar 2003, Daniel C. Sobral wrote: > > > > > David Xu wrote: > > > > > > > > do you think that a multithreaded process should use more CPU time then > > > > a single thread process, so threaded process should have higher priority > > > > and block other single thread processes out? AFAIK, threading is not > > > > designed for this, you may misunderstand what threading is designed for. > > > > > > Threading might not have been originally designed for this, but a lot of > > > people use it this way, a lot of people *want* it this way, and POSIX > > > specifically mandates that this way be available. > > > > It is available through pthread_attr_setscope(). > > > > There's some confusion over this and the way libthr is implemented. > > KSE's within the same KSE Group were not designed to give more CPU > > time than a normal unthreaded/single KSE'd process. Unless this > > has been changed in the kernel somehow, the use of multiple KSEs > > by libthr or libkse (in a single KSEG) will not get any more CPU > > time than a non-threaded program. There was some debate over > > this, but multiple KSEs within a KSEG were _not_ suppose to allow > > this. You are suppose to create a new KSEG in order to get > > this behavior. > > > > This is not how it is implemented in either scheduler that we currently > have. I'm not saying which way is more or less correct because I think > you could argue either way. We can not entirely correctly implement > SCOPE_PROCESSES threads right now anyway. Well, since we have KSEGs, I'd argue that this is a bug. Perhaps it was too difficult to do this and no-one thought you'd ever allow more KSEs in a KSEG than you have CPUs, so that became the limiting factor. > This being said.. It is a property of the thr system calls and not > libthr. I have a flags field in thr_create() that could be used to > indicate which scope the thread should contend in. BTW, I'm not arguing about libthr implementation here. I'm just stating what a KSE is (was) suppose to be (which implicitly describes libthr and libkse behavior). -- Dan Eischen From owner-freebsd-arch@FreeBSD.ORG Fri Mar 28 12:57:15 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id A78FD37B401 for ; Fri, 28 Mar 2003 12:57:15 -0800 (PST) Received: from sccrmhc01.attbi.com (sccrmhc01.attbi.com [204.127.202.61]) by mx1.FreeBSD.org (Postfix) with ESMTP id E1C3B43FCB for ; Fri, 28 Mar 2003 12:57:14 -0800 (PST) (envelope-from julian@elischer.org) Received: from interjet.elischer.org (12-232-168-4.client.attbi.com[12.232.168.4]) by sccrmhc01.attbi.com (sccrmhc01) with ESMTP id <2003032820571300100cu49te>; Fri, 28 Mar 2003 20:57:14 +0000 Received: from localhost (localhost.elischer.org [127.0.0.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id MAA71665; Fri, 28 Mar 2003 12:57:12 -0800 (PST) Date: Fri, 28 Mar 2003 12:57:10 -0800 (PST) From: Julian Elischer To: Daniel Eischen In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII X-Spam-Status: No, hits=-25.0 required=5.0 tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT, QUOTE_TWICE_1,RCVD_IN_UNCONFIRMED_DSBL,REPLY_WITH_QUOTES, USER_AGENT_PINE autolearn=ham version=2.50 X-Spam-Level: X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp) cc: Scott Long cc: arch@freebsd.org Subject: Re: 1:1 threading. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 28 Mar 2003 20:57:17 -0000 On Fri, 28 Mar 2003, Daniel Eischen wrote: > On Fri, 28 Mar 2003, Jeff Roberson wrote: > > > On Fri, 28 Mar 2003, Daniel Eischen wrote: > > > > > On Fri, 28 Mar 2003, Daniel C. Sobral wrote: > > > > > > > David Xu wrote: > > > > > > > > > > do you think that a multithreaded process should use more CPU time then > > > > > a single thread process, so threaded process should have higher priority > > > > > and block other single thread processes out? AFAIK, threading is not > > > > > designed for this, you may misunderstand what threading is designed for. > > > > > > > > Threading might not have been originally designed for this, but a lot of > > > > people use it this way, a lot of people *want* it this way, and POSIX > > > > specifically mandates that this way be available. > > > > > > It is available through pthread_attr_setscope(). > > > > > > There's some confusion over this and the way libthr is implemented. > > > KSE's within the same KSE Group were not designed to give more CPU > > > time than a normal unthreaded/single KSE'd process. Unless this > > > has been changed in the kernel somehow, the use of multiple KSEs > > > by libthr or libkse (in a single KSEG) will not get any more CPU > > > time than a non-threaded program. There was some debate over > > > this, but multiple KSEs within a KSEG were _not_ suppose to allow > > > this. You are suppose to create a new KSEG in order to get > > > this behavior. > > > > > > > This is not how it is implemented in either scheduler that we currently > > have. I'm not saying which way is more or less correct because I think > > you could argue either way. We can not entirely correctly implement > > SCOPE_PROCESSES threads right now anyway. > > Well, since we have KSEGs, I'd argue that this is a bug. > Perhaps it was too difficult to do this and no-one thought > you'd ever allow more KSEs in a KSEG than you have CPUs, > so that became the limiting factor. > > > This being said.. It is a property of the thr system calls and not > > libthr. I have a flags field in thr_create() that could be used to > > indicate which scope the thread should contend in. > > BTW, I'm not arguing about libthr implementation here. I'm > just stating what a KSE is (was) suppose to be (which implicitly > describes libthr and libkse behavior). I'm happy to see the limit of (NKSEs !> NCPU) lifted for processes that are in some way identified as 1:1 mode processes.. I don't want to lift it for KSE mode processes however. For system scope threads, I guess you just allocate a separate KSEGRP so it has somewhere to store pertinent info. that makes it rather simple system scope threads have a thread, a KSE and a KSEGRP process scope threads just use the existing KSEGRP. Everythiong should just "fall out correctly" by doing this.. > > -- > Dan Eischen > > _______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" > From owner-freebsd-arch@FreeBSD.ORG Fri Mar 28 20:26:49 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9847E37B401 for ; Fri, 28 Mar 2003 20:26:49 -0800 (PST) Received: from canning.wemm.org (canning.wemm.org [192.203.228.65]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4166643FA3 for ; Fri, 28 Mar 2003 20:26:49 -0800 (PST) (envelope-from peter@wemm.org) Received: from wemm.org (localhost [127.0.0.1]) by canning.wemm.org (Postfix) with ESMTP id 18B682A8BB; Fri, 28 Mar 2003 20:26:49 -0800 (PST) (envelope-from peter@wemm.org) X-Mailer: exmh version 2.5 07/13/2001 with nmh-1.0.4 To: Jeff Roberson In-Reply-To: <20030327143259.I64602-100000@mail.chesapeake.net> Date: Fri, 28 Mar 2003 20:26:49 -0800 From: Peter Wemm Message-Id: <20030329042649.18B682A8BB@canning.wemm.org> cc: arch@freebsd.org cc: Scott Long cc: Daniel Eischen Subject: Re: 1:1 threading. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 29 Mar 2003 04:26:51 -0000 Jeff Roberson wrote: > On Thu, 27 Mar 2003, Daniel Eischen wrote: > > > On Thu, 27 Mar 2003, Scott Long wrote: > > > Once 5-STABLE happens, users of 5.x can no longer be guinea pigs for KSE > > > development. By keeping the 1:1 and M:N API's separate, KSE can > > > progress in 6-CURRENT until it is proven while still allowing MFC's to > > > 5-STABLE to happen without too much pain. > > > > That's kind of silly; we have other ways to keep API/ABI > > compatability and have used this for all other syscalls. > > The KSE and thread mailboxes even have version numbers > > in them. > > Which means they are likely to change. I do not want to develop on > unstable APIs and unstable kernel code. kern_thr.c is 254 lines. I think > we can handle a little duplication. I'm not sure why the objection is so > strong. I for one think they should use seperate syscalls. We shouldn't have designed-for-KSE mailboxes going anywhere near this stuff and it gives the KSE folks plenty of room to keep tweaking their data structures. Anyway, I can't wait to see how this works out. It is becoming a Big Deal at work, we're using the linuxthreads port + rfork() out of desperation. libthr can't possibly be any nastier than that. Cheers, -Peter -- Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com "All of this is for nothing if we don't go to the stars" - JMS/B5 From owner-freebsd-arch@FreeBSD.ORG Fri Mar 28 20:46:31 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B07D937B401; Fri, 28 Mar 2003 20:46:31 -0800 (PST) Received: from puffin.mail.pas.earthlink.net (puffin.mail.pas.earthlink.net [207.217.120.139]) by mx1.FreeBSD.org (Postfix) with ESMTP id EC94443F3F; Fri, 28 Mar 2003 20:46:30 -0800 (PST) (envelope-from tlambert2@mindspring.com) Received: from pool0191.cvx21-bradley.dialup.earthlink.net ([209.179.192.191] helo=mindspring.com) by puffin.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128) (Exim 3.33 #1) id 18z8ET-0003HI-00; Fri, 28 Mar 2003 20:46:18 -0800 Message-ID: <3E8524C0.5F80D3D@mindspring.com> Date: Fri, 28 Mar 2003 20:44:48 -0800 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: "Daniel C. Sobral" References: <20030327143259.I64602-100000@mail.chesapeake.net> <3E843009.2060104@tcoip.com.br> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a4d6b804bc6589c5b91231a495bc3d307ba2d4e88014a4647c350badd9bab72f9c350badd9bab72f9c cc: arch@freebsd.org cc: Scott Long Subject: Re: 1:1 threading. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 29 Mar 2003 04:46:33 -0000 "Daniel C. Sobral" wrote: > David Xu wrote: > > do you think that a multithreaded process should use more CPU time then > > a single thread process, so threaded process should have higher priority > > and block other single thread processes out? AFAIK, threading is not > > designed for this, you may misunderstand what threading is designed for. > > Threading might not have been originally designed for this, but a lot of > people use it this way, a lot of people *want* it this way, and POSIX > specifically mandates that this way be available. > > So let's drop that issue, please. A side question... Is there an administrative limit on the number of threads that you can create in a process, such that the total number is limited to the number of processes you are administratively limited to creating? I.e., the administrative limit on number of child processes is implicitly an administrative limit on how much quantum you can use; is the limit still enforced on threads, as well? -- Terry From owner-freebsd-arch@FreeBSD.ORG Fri Mar 28 21:11:46 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0FFE837B401 for ; Fri, 28 Mar 2003 21:11:46 -0800 (PST) Received: from puffin.mail.pas.earthlink.net (puffin.mail.pas.earthlink.net [207.217.120.139]) by mx1.FreeBSD.org (Postfix) with ESMTP id 869A643F93 for ; Fri, 28 Mar 2003 21:11:45 -0800 (PST) (envelope-from tlambert2@mindspring.com) Received: from pool0191.cvx21-bradley.dialup.earthlink.net ([209.179.192.191] helo=mindspring.com) by puffin.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128) (Exim 3.33 #1) id 18z8d0-0005nc-00; Fri, 28 Mar 2003 21:11:39 -0800 Message-ID: <3E852ABD.E77EA566@mindspring.com> Date: Fri, 28 Mar 2003 21:10:21 -0800 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Julian Elischer References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a46d4fb99f91a96aa48d9b9983a5122a09a7ce0e8f8d31aa3f350badd9bab72f9c350badd9bab72f9c cc: arch@freebsd.org cc: Scott Long cc: Daniel Eischen Subject: Re: 1:1 threading. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 29 Mar 2003 05:11:47 -0000 Julian Elischer wrote: > I'm happy to see the limit of (NKSEs !> NCPU) lifted for processes that > are in some way identified as 1:1 mode processes.. > I don't want to lift it for KSE mode processes however. > > For system scope threads, I guess you just allocate a separate KSEGRP > so it has somewhere to store pertinent info. > > that makes it rather simple > system scope threads have a thread, a KSE and a KSEGRP > process scope threads just use the existing KSEGRP. > > Everythiong should just "fall out correctly" by doing this.. Except that means for process scope threads, you don't get SMP scalability, since the single KSEGRP binds them all to a single CPU... right? -- Terry From owner-freebsd-arch@FreeBSD.ORG Sat Mar 29 08:50:51 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 173D137B401 for ; Sat, 29 Mar 2003 08:50:51 -0800 (PST) Received: from flood.ping.uio.no (flood.ping.uio.no [129.240.78.31]) by mx1.FreeBSD.org (Postfix) with ESMTP id 8426143F85 for ; Sat, 29 Mar 2003 08:50:50 -0800 (PST) (envelope-from des@ofug.org) Received: by flood.ping.uio.no (Postfix, from userid 2602) id 0E8EC5308; Sat, 29 Mar 2003 17:50:48 +0100 (CET) X-URL: http://www.ofug.org/~des/ X-Disclaimer: The views expressed in this message do not necessarily coincide with those of any organisation or company with which I am or have been affiliated. To: arch@freebsd.org From: des@ofug.org (Dag-Erling =?iso-8859-1?q?Sm=F8rgrav?=) Date: Sat, 29 Mar 2003 17:50:46 +0100 Message-ID: User-Agent: Gnus/5.090015 (Oort Gnus v0.15) Emacs/21.2 MIME-Version: 1.0 Content-Type: multipart/mixed; boundary="=-=-=" Subject: Allow underscores in DNS names X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 29 Mar 2003 16:50:54 -0000 --=-=-= Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable The attached patch, inspired by a discussion on -STABLE, modifies our resolver library to allow underscores in host names, by classifying the underscore as a hyphen character. Even though RFC952 forbids them, underscores are becoming increasingly common in DNS, and they are sometimes used for mechanisms (such as Microsoft's automatic proxy configuration scheme) which we might want to support in FreeBSD. DES --=20 Dag-Erling Sm=F8rgrav - des@ofug.org --=-=-= Content-Type: text/x-patch Content-Disposition: attachment; filename=hnok.diff Index: lib/libc/net/res_comp.c =================================================================== RCS file: /home/ncvs/src/lib/libc/net/res_comp.c,v retrieving revision 1.17 diff -u -r1.17 res_comp.c --- lib/libc/net/res_comp.c 22 Mar 2002 21:52:29 -0000 1.17 +++ lib/libc/net/res_comp.c 29 Mar 2003 16:42:57 -0000 @@ -142,7 +142,7 @@ * is not careful about this, but for some reason, we're doing it right here. */ #define PERIOD 0x2e -#define hyphenchar(c) ((c) == 0x2d) +#define hyphenchar(c) ((c) == 0x2d || (c) == 0x5f) #define bslashchar(c) ((c) == 0x5c) #define periodchar(c) ((c) == PERIOD) #define asterchar(c) ((c) == 0x2a) --=-=-=-- From owner-freebsd-arch@FreeBSD.ORG Sat Mar 29 12:41:06 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 7448B37B401 for ; Sat, 29 Mar 2003 12:41:06 -0800 (PST) Received: from dan.emsphone.com (dan.emsphone.com [199.67.51.101]) by mx1.FreeBSD.org (Postfix) with ESMTP id BF45843F3F for ; Sat, 29 Mar 2003 12:41:05 -0800 (PST) (envelope-from dan@dan.emsphone.com) Received: (from dan@localhost) by dan.emsphone.com (8.12.7/8.12.7) id h2TKf4Qw027665; Sat, 29 Mar 2003 14:41:04 -0600 (CST) (envelope-from dan) Date: Sat, 29 Mar 2003 14:41:04 -0600 From: Dan Nelson To: Dag-Erling Smorgrav Message-ID: <20030329204104.GF74971@dan.emsphone.com> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-OS: FreeBSD 5.0-CURRENT X-message-flag: Outlook Error User-Agent: Mutt/1.5.4i cc: arch@freebsd.org Subject: Re: Allow underscores in DNS names X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 29 Mar 2003 20:41:31 -0000 In the last episode (Mar 29), Dag-Erling Smorgrav said: > The attached patch, inspired by a discussion on -STABLE, modifies our > resolver library to allow underscores in host names, by classifying > the underscore as a hyphen character. Even though RFC952 forbids > them, underscores are becoming increasingly common in DNS, and they > are sometimes used for mechanisms (such as Microsoft's automatic proxy > configuration scheme) which we might want to support in FreeBSD. I thought proxy autodetect used wpad.domainname.com or looked up http://domainname.com/wpad.dat ? All the XP machines here do that. -- Dan Nelson dnelson@allantgroup.com From owner-freebsd-arch@FreeBSD.ORG Sat Mar 29 12:50:52 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 9484F37B401 for ; Sat, 29 Mar 2003 12:50:52 -0800 (PST) Received: from rwcrmhc51.attbi.com (rwcrmhc51.attbi.com [204.127.198.38]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1494A43F93 for ; Sat, 29 Mar 2003 12:50:52 -0800 (PST) (envelope-from julian@elischer.org) Received: from interjet.elischer.org (12-232-168-4.client.attbi.com[12.232.168.4]) by rwcrmhc51.attbi.com (rwcrmhc51) with ESMTP id <20030329205051051000o6dne>; Sat, 29 Mar 2003 20:50:51 +0000 Received: from localhost (localhost.elischer.org [127.0.0.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id MAA80861; Sat, 29 Mar 2003 12:50:48 -0800 (PST) Date: Sat, 29 Mar 2003 12:50:46 -0800 (PST) From: Julian Elischer To: Terry Lambert In-Reply-To: <3E852ABD.E77EA566@mindspring.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: arch@freebsd.org cc: Scott Long cc: Daniel Eischen Subject: Re: 1:1 threading. X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 29 Mar 2003 20:50:55 -0000 On Fri, 28 Mar 2003, Terry Lambert wrote: > Julian Elischer wrote: > > I'm happy to see the limit of (NKSEs !> NCPU) lifted for processes that > > are in some way identified as 1:1 mode processes.. > > I don't want to lift it for KSE mode processes however. > > > > For system scope threads, I guess you just allocate a separate KSEGRP > > so it has somewhere to store pertinent info. > > > > that makes it rather simple > > system scope threads have a thread, a KSE and a KSEGRP > > process scope threads just use the existing KSEGRP. > > > > Everythiong should just "fall out correctly" by doing this.. > > Except that means for process scope threads, you don't get SMP > scalability, since the single KSEGRP binds them all to a single > CPU... right? no > > -- Terry > From owner-freebsd-arch@FreeBSD.ORG Sat Mar 29 14:02:44 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B119237B401; Sat, 29 Mar 2003 14:02:44 -0800 (PST) Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by mx1.FreeBSD.org (Postfix) with ESMTP id C431A43FAF; Sat, 29 Mar 2003 14:02:43 -0800 (PST) (envelope-from phk@phk.freebsd.dk) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.12.8/8.12.8) with ESMTP id h2TM2YSM010262; Sat, 29 Mar 2003 23:02:38 +0100 (CET) (envelope-from phk@phk.freebsd.dk) To: Wes Peters From: "Poul-Henning Kamp" In-Reply-To: Your message of "Fri, 28 Mar 2003 09:10:32 PST." <200303280910.32307.wes@softweyr.com> Date: Sat, 29 Mar 2003 23:02:34 +0100 Message-ID: <10261.1048975354@critter.freebsd.dk> cc: David Schultz cc: freebsd-arch@FreeBSD.ORG Subject: Re: Patch to protect process from pageout killing X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 29 Mar 2003 22:02:45 -0000 In message <200303280910.32307.wes@softweyr.com>, Wes Peters writes: >I've reworked my patch to use the madvise(2) syscall, like the original >4.x patch did. I've even documented it, in a man page of all places. >Please see attached patch. If nobody objects, I'll commit sometime this >weekend. I'm still not certain about the inheritance of this, do we want/is it inherited ? Also, thinking about it, on at least a handful of machines I would have more use for MADV_KILLMEFIRST having the exact opposite behaviour. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From owner-freebsd-arch@FreeBSD.ORG Sat Mar 29 15:34:40 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id BA21837B401 for ; Sat, 29 Mar 2003 15:34:40 -0800 (PST) Received: from harmony.village.org (rover.bsdimp.com [204.144.255.66]) by mx1.FreeBSD.org (Postfix) with ESMTP id CCEB443FE5 for ; Sat, 29 Mar 2003 15:34:39 -0800 (PST) (envelope-from imp@bsdimp.com) Received: from localhost (warner@rover2.village.org [10.0.0.1]) by harmony.village.org (8.12.8/8.12.3) with ESMTP id h2TNYbA7016298 for ; Sat, 29 Mar 2003 16:34:37 -0700 (MST) (envelope-from imp@bsdimp.com) Date: Sat, 29 Mar 2003 16:33:43 -0700 (MST) Message-Id: <20030329.163343.53040416.imp@bsdimp.com> To: arch@freebsd.org From: "M. Warner Losh" X-Mailer: Mew version 2.1 on Emacs 21.2 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Subject: depend + all vs dependall X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 29 Mar 2003 23:34:42 -0000 NetBSD created a dependall target some time ago. This target does a make depend and then a make all so they only have to traverse the tree once for these two stages rather than twice. The time of a buildworld came up in a discussion recently and I thought I'd see how hard it would be to do something similar in FreeBSD. Here are my preliminary results. Machine: Dell Inspiron 8000, 256M RAM, P3-700 time make buildworld (2:04:34 wall time, didn't save the actual output :-(. Machine: Dual Athlon XP2000+ 1.5G RAM aac controller. time make buildworld -j 8 -s run 0: did the above to 'flush the caches/load the sources in ram' Pre-change: Run 1: 1941.458u 723.640s 32:23.67 137.1% 2747+2215k 1447+145802io 465pf+0w Run 2: 1942.160u 729.972s 31:45.84 140.2% 2748+2212k 1423+145755io 465pf+0w After Changes: Run 1: 1922.767u 723.847s 30:48.64 143.1% 2785+2201k 1312+148256io 465pf+0w Run 2: 1922.661u 725.477s 30:49.99 143.1% 2788+2201k 1378+148489io 465pf+0w So it looks like it saves a little over a minute out of 32 (1925s average vs 1849s average, or almost a 4% reduction) on my big build box. My only concern with the patches is that they might interact badly with a bug I remember from the FreeBSD 1.1R days, but can't reproduce, in make. Once upon a time, 'make depend all' was different than 'make depend && make all' because the .depend files weren't re-read after the depend phase, but before the all phase, whereas two makes this would be the case. Since this change combines the two, I'm a little worried about that. Is that still a bug in FreeBSD's make? It won't matter for a pure, virgin tree, but might for incremental builds... Comments? Warner http://perforce.freebsd.org/chv.cgi?CH=27577 Change 27577 by imp@imp_hammer on 2003/03/29 11:24:15 create a new dependall target. # I don't know if the ancient bug about depend is fixed or not. Affected files ... .. //depot/user/imp/freebsd-imp/Makefile#14 edit .. //depot/user/imp/freebsd-imp/Makefile.inc1#18 edit .. //depot/user/imp/freebsd-imp/share/mk/bsd.README#3 edit .. //depot/user/imp/freebsd-imp/share/mk/bsd.dep.mk#3 edit .. //depot/user/imp/freebsd-imp/share/mk/bsd.subdir.mk#2 edit Differences ... ==== //depot/user/imp/freebsd-imp/Makefile#14 (text+ko) ==== @@ -89,8 +89,8 @@ # order, but that's not important. # TGTS= all all-man buildkernel buildtools buildworld checkdpadd clean \ - cleandepend cleandir depend distribute distributeworld everything \ - hierarchy install installcheck installkernel \ + cleandepend cleandir depend dependall distribute distributeworld \ + everything hierarchy install installcheck installkernel \ reinstallkernel installmost installworld libraries lint maninstall \ mk most obj objlink regress rerelease tags update @@ -189,8 +189,7 @@ @echo "--------------------------------------------------------------" @cd ${.CURDIR}/usr.bin/make; \ ${MMAKE} obj && \ - ${MMAKE} depend && \ - ${MMAKE} all && \ + ${MMAKE} dependall && \ ${MMAKE} install DESTDIR=${MAKEPATH} BINDIR= # ==== //depot/user/imp/freebsd-imp/Makefile.inc1#18 (text+ko) ==== @@ -32,7 +32,7 @@ # # Standard targets (not defined here) are documented in the makefiles in # /usr/share/mk. These include: -# obj depend all install clean cleandepend cleanobj +# obj depend dependall all install clean cleandepend cleanobj # Put initial settings here. SUBDIR= @@ -319,18 +319,12 @@ @echo ">>> stage 4: building libraries" @echo "--------------------------------------------------------------" cd ${.CURDIR}; ${WMAKE} -DNOHTML -DNOINFO -DNOMAN -DNOFSCHG libraries -_depend: - @echo - @echo "--------------------------------------------------------------" - @echo ">>> stage 4: make dependencies" - @echo "--------------------------------------------------------------" - cd ${.CURDIR}; ${WMAKE} par-depend everything: @echo @echo "--------------------------------------------------------------" @echo ">>> stage 4: building everything.." @echo "--------------------------------------------------------------" - cd ${.CURDIR}; ${WMAKE} all + cd ${.CURDIR}; ${WMAKE} dependall WMAKE_TOOL_TGTS= @@ -341,7 +335,7 @@ .if !defined(SUBDIR_OVERRIDE) WMAKE_TOOL_TGTS+= _cross-tools .endif -WMAKE_TGTS= ${WMAKE_TOOL_TGTS} _includes _libraries _depend everything +WMAKE_TGTS= ${WMAKE_TOOL_TGTS} _includes _libraries everything buildworld: ${WMAKE_TGTS} .ORDER: ${WMAKE_TGTS} @@ -501,7 +495,7 @@ ${KMAKEENV} ${MAKE} KERNEL=${INSTKERNNAME} obj # XXX - Gratuitously builds aicasm in the ``makeoptions NO_MODULES'' case. .if !defined(MODULES_WITH_WORLD) && !defined(NO_MODULES) && exists(${KRNLSRCDIR}/modules) -.for target in obj depend all +.for target in obj dependall cd ${.CURDIR}/sys/modules/aic7xxx/aicasm; \ MAKEOBJDIRPREFIX=${KRNLOBJDIR}/${_kernel}/modules \ ${MAKE} -DNO_CPU_CFLAGS ${target} @@ -509,10 +503,11 @@ .endif .if !defined(NO_KERNELDEPEND) cd ${KRNLOBJDIR}/${_kernel}; \ - ${KMAKEENV} ${MAKE} KERNEL=${INSTKERNNAME} depend -DNO_MODULES_OBJ -.endif + ${KMAKEENV} ${MAKE} KERNEL=${INSTKERNNAME} dependall -DNO_MODULES_OBJ +.else cd ${KRNLOBJDIR}/${_kernel}; \ ${KMAKEENV} ${MAKE} KERNEL=${INSTKERNNAME} all -DNO_MODULES_OBJ +.endif @echo "--------------------------------------------------------------" @echo ">>> Kernel build for ${_kernel} completed on `LC_ALL=C date`" @echo "--------------------------------------------------------------" @@ -620,8 +615,7 @@ ${ECHODIR} "===> ${_tool}"; \ cd ${.CURDIR}/${_tool}; \ ${MAKE} DIRPRFX=${_tool}/ obj; \ - ${MAKE} DIRPRFX=${_tool}/ depend; \ - ${MAKE} DIRPRFX=${_tool}/ all; \ + ${MAKE} DIRPRFX=${_tool}/ dependall; \ ${MAKE} DIRPRFX=${_tool}/ DESTDIR=${MAKEOBJDIRPREFIX} install .endfor @@ -681,8 +675,7 @@ ${ECHODIR} "===> ${_tool}"; \ cd ${.CURDIR}/${_tool}; \ ${MAKE} DIRPRFX=${_tool}/ obj; \ - ${MAKE} DIRPRFX=${_tool}/ depend; \ - ${MAKE} DIRPRFX=${_tool}/ all; \ + ${MAKE} DIRPRFX=${_tool}/ dependall; \ ${MAKE} DIRPRFX=${_tool}/ DESTDIR=${MAKEOBJDIRPREFIX} install .endfor @@ -762,8 +755,7 @@ .if exists(${.CURDIR}/${_lib}) ${ECHODIR} "===> ${_lib}"; \ cd ${.CURDIR}/${_lib}; \ - ${MAKE} DIRPRFX=${_lib}/ depend; \ - ${MAKE} DIRPRFX=${_lib}/ all; \ + ${MAKE} DIRPRFX=${_lib}/ dependall; \ ${MAKE} DIRPRFX=${_lib}/ install .endif .endfor @@ -782,7 +774,7 @@ _prebuild_libs: ${_prebuild_libs:S/$/__L/} _generic_libs: ${_generic_libs:S/$/__L/} -.for __target in clean cleandepend cleandir depend includes obj +.for __target in clean cleandepend cleandir depend dependall includes obj .for entry in ${SUBDIR} ${entry}.${__target}__D: .PHONY @if test -d ${.CURDIR}/${entry}.${MACHINE_ARCH}; then \ ==== //depot/user/imp/freebsd-imp/share/mk/bsd.README#3 (text+ko) ==== @@ -169,6 +169,8 @@ depend: make the dependencies for the source files, and store them in the file .depend. + dependall: + make depend then make all install: install the program and its manual pages; if the Makefile does not itself define the target install, the targets ==== //depot/user/imp/freebsd-imp/share/mk/bsd.dep.mk#3 (text+ko) ==== @@ -31,6 +31,9 @@ # Make the dependencies for the source files, and store # them in the file ${DEPENDFILE}. # +# dependall: +# make depend and then all +# # tags: # In "ctags" mode, create a tags file for the source files. # In "gtags" mode, create a (GLOBAL) gtags file for the @@ -183,3 +186,7 @@ echo "LDADD -> $$ldadd1" ; \ fi .endif + +.PHONY: dependall +.ORDER: afterdepend all +dependall: depend all ==== //depot/user/imp/freebsd-imp/share/mk/bsd.subdir.mk#2 (text+ko) ==== @@ -25,8 +25,8 @@ # put the stuff into the right "distribution". # # afterinstall, all, all-man, beforeinstall, checkdpadd, -# clean, cleandepend, cleandir, depend, install, lint, maninstall, -# obj, objlink, realinstall, regress, tags +# clean, cleandepend, cleandir, depend, dependall, install, lint, +# maninstall, obj, objlink, realinstall, regress, tags # .include @@ -67,7 +67,7 @@ .for __target in all all-man checkdpadd clean cleandepend cleandir \ - depend distribute lint maninstall \ + depend dependall distribute lint maninstall \ obj objlink realinstall regress tags ${__target}: _SUBDIR .endfor From owner-freebsd-arch@FreeBSD.ORG Sat Mar 29 15:45:00 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 78F7437B401 for ; Sat, 29 Mar 2003 15:45:00 -0800 (PST) Received: from harmony.village.org (rover.bsdimp.com [204.144.255.66]) by mx1.FreeBSD.org (Postfix) with ESMTP id A32D043F75 for ; Sat, 29 Mar 2003 15:44:59 -0800 (PST) (envelope-from imp@bsdimp.com) Received: from localhost (warner@rover2.village.org [10.0.0.1]) by harmony.village.org (8.12.8/8.12.3) with ESMTP id h2TNivA7016348; Sat, 29 Mar 2003 16:44:58 -0700 (MST) (envelope-from imp@bsdimp.com) Date: Sat, 29 Mar 2003 16:44:03 -0700 (MST) Message-Id: <20030329.164403.54601077.imp@bsdimp.com> To: des@ofug.org From: "M. Warner Losh" In-Reply-To: References: X-Mailer: Mew version 2.1 on Emacs 21.2 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit cc: arch@freebsd.org Subject: Re: Allow underscores in DNS names X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 29 Mar 2003 23:45:03 -0000 When this has come up in the past, it was decreed that _ is a bad bad bad bad idea, even though people want it. You might want to check the ancient archives (1998?) for all the reasons why. Warner From owner-freebsd-arch@FreeBSD.ORG Sat Mar 29 16:18:41 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C2D1C37B401 for ; Sat, 29 Mar 2003 16:18:41 -0800 (PST) Received: from eden.barryp.org (host-150-32-220-24.midco.net [24.220.32.150]) by mx1.FreeBSD.org (Postfix) with ESMTP id 0C7B843FA3 for ; Sat, 29 Mar 2003 16:18:41 -0800 (PST) (envelope-from bp@barryp.org) Received: from [10.66.0.248] (helo=barryp.org) by eden.barryp.org with esmtp (Exim 4.10) id 18zQX0-0004qn-00 for arch@freebsd.org; Sat, 29 Mar 2003 18:18:38 -0600 Message-ID: <3E8637DE.3080003@barryp.org> Date: Sat, 29 Mar 2003 18:18:38 -0600 From: Barry Pederson User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.3) Gecko/20030312 X-Accept-Language: en-us, en MIME-Version: 1.0 To: arch@freebsd.org References: <20030329204104.GF74971@dan.emsphone.com> In-Reply-To: <20030329204104.GF74971@dan.emsphone.com> Content-Type: text/plain; charset=us-ascii; format=flowed Content-Transfer-Encoding: 7bit X-SpamTrack: NO 62 Subject: Re: Allow underscores in DNS names X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 30 Mar 2003 00:18:43 -0000 Dan Nelson wrote: > In the last episode (Mar 29), Dag-Erling Smorgrav said: > >>The attached patch, inspired by a discussion on -STABLE, modifies our >>resolver library to allow underscores in host names, by classifying >>the underscore as a hyphen character. Even though RFC952 forbids >>them, underscores are becoming increasingly common in DNS, and they >>are sometimes used for mechanisms (such as Microsoft's automatic proxy >>configuration scheme) which we might want to support in FreeBSD. > > > I thought proxy autodetect used wpad.domainname.com or looked up > http://domainname.com/wpad.dat ? All the XP machines here do that. The underscore in DNS names is showing up in things like RFC2872 (A DNS RR for specifying the location of services), and "DNS-based Service discovery" as found in Zeroconf/Rendezvous. Barry From owner-freebsd-arch@FreeBSD.ORG Sat Mar 29 16:24:49 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 99C2137B401 for ; Sat, 29 Mar 2003 16:24:49 -0800 (PST) Received: from obsecurity.dyndns.org (adsl-63-207-60-150.dsl.lsan03.pacbell.net [63.207.60.150]) by mx1.FreeBSD.org (Postfix) with ESMTP id F1C2343F75 for ; Sat, 29 Mar 2003 16:24:48 -0800 (PST) (envelope-from kris@obsecurity.org) Received: from rot13.obsecurity.org (rot13.obsecurity.org [10.0.0.5]) by obsecurity.dyndns.org (Postfix) with ESMTP id BACFA66E05; Sat, 29 Mar 2003 16:24:48 -0800 (PST) Received: by rot13.obsecurity.org (Postfix, from userid 1000) id 9F9771298; Sat, 29 Mar 2003 16:24:48 -0800 (PST) Date: Sat, 29 Mar 2003 16:24:48 -0800 From: Kris Kennaway To: "M. Warner Losh" Message-ID: <20030330002448.GA32150@rot13.obsecurity.org> References: <20030329.163343.53040416.imp@bsdimp.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="4Ckj6UjgE2iN1+kY" Content-Disposition: inline In-Reply-To: <20030329.163343.53040416.imp@bsdimp.com> User-Agent: Mutt/1.4i cc: arch@freebsd.org Subject: Re: depend + all vs dependall X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 30 Mar 2003 00:24:50 -0000 --4Ckj6UjgE2iN1+kY Content-Type: text/plain; charset=us-ascii Content-Disposition: inline On Sat, Mar 29, 2003 at 04:33:43PM -0700, M. Warner Losh wrote: > My only concern with the patches is that they might interact badly > with a bug I remember from the FreeBSD 1.1R days, but can't reproduce, > in make. Once upon a time, 'make depend all' was different than 'make > depend && make all' because the .depend files weren't re-read after > the depend phase, but before the all phase, whereas two makes this > would be the case. Since this change combines the two, I'm a little > worried about that. Is that still a bug in FreeBSD's make? It won't > matter for a pure, virgin tree, but might for incremental builds... I'm pretty sure that's still true. Kris --4Ckj6UjgE2iN1+kY Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.1 (FreeBSD) iD8DBQE+hjlQWry0BWjoQKURAvSyAJ4/h2XKIIvBigu3+3IKhIC/vCm1AACgvdRH 2fHDR+FDgOiO8yJT6UkEAks= =As70 -----END PGP SIGNATURE----- --4Ckj6UjgE2iN1+kY-- From owner-freebsd-arch@FreeBSD.ORG Sat Mar 29 17:40:48 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6C81437B401 for ; Sat, 29 Mar 2003 17:40:48 -0800 (PST) Received: from bluejay.mail.pas.earthlink.net (bluejay.mail.pas.earthlink.net [207.217.120.218]) by mx1.FreeBSD.org (Postfix) with ESMTP id ECA4143F75 for ; Sat, 29 Mar 2003 17:40:47 -0800 (PST) (envelope-from tlambert2@mindspring.com) Received: from pool0277.cvx40-bradley.dialup.earthlink.net ([216.244.43.22] helo=mindspring.com) by bluejay.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128) (Exim 3.33 #1) id 18zRoT-0006uf-00; Sat, 29 Mar 2003 17:40:45 -0800 Message-ID: <3E864AD1.6C1C3656@mindspring.com> Date: Sat, 29 Mar 2003 17:39:29 -0800 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Dag-Erling =?iso-8859-1?Q?Sm=F8rgrav?= References: Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a4f14f7a297e07b0d41e6a83a345408d90a2d4e88014a4647c350badd9bab72f9c350badd9bab72f9c cc: arch@freebsd.org Subject: Re: Allow underscores in DNS names X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 30 Mar 2003 01:40:50 -0000 Dag-Erling Sm=F8rgrav wrote: > The attached patch, inspired by a discussion on -STABLE, modifies our > resolver library to allow underscores in host names, by classifying > the underscore as a hyphen character. Even though RFC952 forbids > them, underscores are becoming increasingly common in DNS, and they > are sometimes used for mechanisms (such as Microsoft's automatic proxy > configuration scheme) which we might want to support in FreeBSD. There was a better patch that made it an option in resolv.conf, rather than turning it on all the time. FreeBSD should be standards compliant, by default, and take work to make it possible to give bogus data to other hosts on the Internet who can not handle "_" or other characters because they *are* standars compliant. "Be conservative in what you send." -- Terry From owner-freebsd-arch@FreeBSD.ORG Sat Mar 29 17:55:16 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id C951137B405 for ; Sat, 29 Mar 2003 17:55:16 -0800 (PST) Received: from puffin.mail.pas.earthlink.net (puffin.mail.pas.earthlink.net [207.217.120.139]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3CA8D43FBF for ; Sat, 29 Mar 2003 17:55:16 -0800 (PST) (envelope-from tlambert2@mindspring.com) Received: from pool0277.cvx40-bradley.dialup.earthlink.net ([216.244.43.22] helo=mindspring.com) by puffin.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128) (Exim 3.33 #1) id 18zS2N-0007KY-00; Sat, 29 Mar 2003 17:55:08 -0800 Message-ID: <3E864E2F.BA16F6B5@mindspring.com> Date: Sat, 29 Mar 2003 17:53:51 -0800 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Barry Pederson References: <3E8637DE.3080003@barryp.org> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a4f0e6d9741b447bd7c23b04c6dac52717a7ce0e8f8d31aa3f350badd9bab72f9c350badd9bab72f9c cc: arch@freebsd.org Subject: Re: Allow underscores in DNS names X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 30 Mar 2003 01:55:20 -0000 Barry Pederson wrote: > > I thought proxy autodetect used wpad.domainname.com or looked up > > http://domainname.com/wpad.dat ? All the XP machines here do that. > > The underscore in DNS names is showing up in things like RFC2872 (A DNS RR > for specifying the location of services), and "DNS-based Service discovery" > as found in Zeroconf/Rendezvous. Excuse me, but that *particular* underscore is a namespace escape, and is used *precisely* so that it does *NOT* ever match a valid host name. People who want the resource records for specific services are supposed to use a service lookup API, rahter than a host name lookup API. Please read the working group documentation for Zeroconf. Thanks, -- Terry "A big fan of zeroconf and the death of DHCP" Lambert From owner-freebsd-arch@FreeBSD.ORG Sat Mar 29 18:06:03 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 11BCC37B401 for ; Sat, 29 Mar 2003 18:06:03 -0800 (PST) Received: from whizzo.transsys.com (whizzo.TransSys.COM [144.202.42.10]) by mx1.FreeBSD.org (Postfix) with ESMTP id 4D98A43FBF for ; Sat, 29 Mar 2003 18:06:02 -0800 (PST) (envelope-from louie@whizzo.transsys.com) Received: from whizzo.transsys.com (#6@localhost [127.0.0.1]) by whizzo.transsys.com (8.12.8/8.12.7) with ESMTP id h2U25vDN037209; Sat, 29 Mar 2003 21:05:57 -0500 (EST) (envelope-from louie@whizzo.transsys.com) Message-Id: <200303300205.h2U25vDN037209@whizzo.transsys.com> X-Mailer: exmh version 2.5 07/13/2001 with nmh-1.0.4 To: Terry Lambert X-Image-URL: http://www.transsys.com/louie/images/louie-mail.jpg From: "Louis A. Mamakos" References: <3E864AD1.6C1C3656@mindspring.com> In-reply-to: Your message of "Sat, 29 Mar 2003 17:39:29 PST." <3E864AD1.6C1C3656@mindspring.com> Mime-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable Date: Sat, 29 Mar 2003 21:05:57 -0500 Sender: louie@TransSys.COM cc: arch@freebsd.org cc: Dag-Erling =?iso-8859-1?Q?Sm=F8rgrav?= Subject: Re: Allow underscores in DNS names X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 30 Mar 2003 02:06:04 -0000 > Dag-Erling Sm=F8rgrav wrote: > > The attached patch, inspired by a discussion on -STABLE, modifies our= > > resolver library to allow underscores in host names, by classifying > > the underscore as a hyphen character. Even though RFC952 forbids > > them, underscores are becoming increasingly common in DNS, and they > > are sometimes used for mechanisms (such as Microsoft's automatic prox= y > > configuration scheme) which we might want to support in FreeBSD. > = > = > There was a better patch that made it an option in resolv.conf, > rather than turning it on all the time. This is great, except that you'd don't need to have a resolv.conf on your system at all; the resolver will default to using a local caching nameserver. > FreeBSD should be standards compliant, by default, and take work > to make it possible to give bogus data to other hosts on the > Internet who can not handle "_" or other characters because they > *are* standars compliant. Since this is a resolver option, you're not handing out names to other hosts using the DNS infrastructure. > "Be conservative in what you send." And liberal in what you receive, which is exactly what modifing the resolver to not cause gethostbyname() and it's ilk to barf on these types of names. There are lots of things in ancient RFCs which probably do not make as much sense these days as they once did. If there is a security issue in applications, they should get fixed regardless. All this heartburn over what the gethostbyname() library function chooses to believe from the DNS still doesn't address getting hostnames out of NIS or /etc/hosts. louie From owner-freebsd-arch@FreeBSD.ORG Sat Mar 29 18:19:47 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 6E7D437B404 for ; Sat, 29 Mar 2003 18:19:47 -0800 (PST) Received: from stork.mail.pas.earthlink.net (stork.mail.pas.earthlink.net [207.217.120.188]) by mx1.FreeBSD.org (Postfix) with ESMTP id D2CC643F75 for ; Sat, 29 Mar 2003 18:19:46 -0800 (PST) (envelope-from tlambert2@mindspring.com) Received: from pool0277.cvx40-bradley.dialup.earthlink.net ([216.244.43.22] helo=mindspring.com) by stork.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128) (Exim 3.33 #1) id 18zSQ4-0004KP-00; Sat, 29 Mar 2003 18:19:37 -0800 Message-ID: <3E8653EA.BAF9D765@mindspring.com> Date: Sat, 29 Mar 2003 18:18:18 -0800 From: Terry Lambert X-Mailer: Mozilla 4.79 [en] (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: "Louis A. Mamakos" References: <3E864AD1.6C1C3656@mindspring.com> <200303300205.h2U25vDN037209@whizzo.transsys.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a4f408a3acb4ae55e09c4b5c7b5d71a30e3ca473d225a0f487350badd9bab72f9c350badd9bab72f9c cc: arch@freebsd.org cc: Dag-Erling =?iso-8859-1?Q?Sm=F8rgrav?= Subject: Re: Allow underscores in DNS names X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 30 Mar 2003 02:19:49 -0000 "Louis A. Mamakos" wrote: > > There was a better patch that made it an option in resolv.conf, > > rather than turning it on all the time. > > This is great, except that you'd don't need to have a resolv.conf > on your system at all; the resolver will default to using a local > caching nameserver. By this argument, it should do that anyway, if the only option is this one. My own argument is that there should be an "allow_chars" option in the resolv.conf, so that the Tuesday after this is committed, and someone now wants "#" in domain names to support their idea of mapping phone numbers to domain names, we don't have to go through this whole dumb "let's violate RFC-952, just this once!" argument yet againt. > > FreeBSD should be standards compliant, by default, and take work > > to make it possible to give bogus data to other hosts on the > > Internet who can not handle "_" or other characters because they > > *are* standars compliant. > > Since this is a resolver option, you're not handing out names to > other hosts using the DNS infrastructure. You are if you are a caching DNS server, which uses the resolver code to look up data on the global DNS, caches it, and returns it to local DNS querants. It also permits you to do things like put "_" in names in host files. If you *must* have a single patch, at *least* the original original patch (which *also* failed to provide an option for unbreaking RFC-952 compliance on the systems of people who prefer to comply with international standards) only allowed the character *interior* to the domain names (i.e. after the first character). That, *at least* hept it from interfering accidently with the service location resource records for zeroconf. > > "Be conservative in what you send." > > And liberal in what you receive, which is exactly what modifing > the resolver to not cause gethostbyname() and it's ilk to barf > on these types of names. And liberal in what you resend? You can't have it both ways. Reading the 1998 discussion, as was previously suggested, is a good idea. > There are lots of things in ancient RFCs which probably do not > make as much sense these days as they once did. There is a fix for that: join an IETF group, and create a "supercedes" RFC. The standards are the standards, as they are. > If there is a security issue in applications, they should get > fixed regardless. OK. So you are advocating getting rid of the stupid "This program uses gets(), which is unsafe" messages, right? Because the programs where the API that is being used lead to a security isseu in applications, when people do not know how to use the API properly. > All this heartburn over what the gethostbyname() library function > chooses to believe from the DNS still doesn't address getting > hostnames out of NIS or /etc/hosts. NIS and /etc/hosts should *NEVER* contain a host name with an "_". *NEVER*. -- Terry From owner-freebsd-arch@FreeBSD.ORG Sat Mar 29 22:12:25 2003 Return-Path: Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id A098737B401 for ; Sat, 29 Mar 2003 22:12:25 -0800 (PST) Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16]) by mx1.FreeBSD.org (Postfix) with ESMTP id 65AE843F3F for ; Sat, 29 Mar 2003 22:12:24 -0800 (PST) (envelope-from bde@zeta.org.au) Received: from katana.zip.com.au (katana.zip.com.au [61.8.7.246]) by mailman.zeta.org.au (8.9.3/8.8.7) with ESMTP id QAA00052; Sun, 30 Mar 2003 16:12:10 +1000 Date: Sun, 30 Mar 2003 16:12:09 +1000 (EST) From: Bruce Evans X-X-Sender: bde@gamplex.bde.org To: "M. Warner Losh" In-Reply-To: <20030329.163343.53040416.imp@bsdimp.com> Message-ID: <20030330150957.M13638@gamplex.bde.org> References: <20030329.163343.53040416.imp@bsdimp.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII cc: arch@freebsd.org Subject: Re: depend + all vs dependall X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.1 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 30 Mar 2003 06:12:28 -0000 On Sat, 29 Mar 2003, M. Warner Losh wrote: > NetBSD created a dependall target some time ago. This target does a > make depend and then a make all so they only have to traverse the tree > once for these two stages rather than twice. The time of a buildworld > came up in a discussion recently and I thought I'd see how hard it > would be to do something similar in FreeBSD. Here are my preliminary > results. > > Machine: Dell Inspiron 8000, 256M RAM, P3-700 > time make buildworld > (2:04:34 wall time, didn't save the actual output :-(. > > Machine: Dual Athlon XP2000+ 1.5G RAM aac controller. > > time make buildworld -j 8 -s Note that all benchmarks using -j are invalid because of nondeterministic wait times of up to 100 msec for each job. This pessimizes makeworld -j 4 times by about 20% on a non-dual Athlon XP1600, and can't do good things for the variance. The pessimization is larger on faster machines of course. This is fixed in NetBSD. FreeBSD only has the hackaround of reducing the timeout from 500 msec to 100 msec. > run 0: did the above to 'flush the caches/load the sources in ram' > > Pre-change: > > Run 1: > 1941.458u 723.640s 32:23.67 137.1% 2747+2215k 1447+145802io 465pf+0w > Run 2: > 1942.160u 729.972s 31:45.84 140.2% 2748+2212k 1423+145755io 465pf+0w The SMP overheads seem to be very large. I get the following times on a non-Dual Athlon XP1600 overclocked 256MB RAM ide controller 2 drives: %%% -------------------------------------------------------------- >>> elf make world completed on Sun Mar 2 16:30:55 EST 2003 (started Sun Mar 2 15:53:15 EST 2003) -------------------------------------------------------------- 2260.31 real 1729.55 user 326.24 sys %%% This machine had lost 256MB of its RAM at the time of the above benchmark (the latest one that I have with no local changes to the src tree). Losing 256MB cost it a 100-200 seconds. Upgrading to 1024 MB RAM improved on its old speed of 1967 seconds to 1943 seconds (both of these times with local changes). The disk cache is cold in all of my makeworld benchmarks. A warm cache wouldn't have much helped with 512MB RAM since that is not quite enough to cache the src tree, but it reduces the makeworld times a little more with 1024 MB RAM. > After Changes: > > Run 1: > 1922.767u 723.847s 30:48.64 143.1% 2785+2201k 1312+148256io 465pf+0w > Run 2: > 1922.661u 725.477s 30:49.99 143.1% 2788+2201k 1378+148489io 465pf+0w > > So it looks like it saves a little over a minute out of 32 (1925s > average vs 1849s average, or almost a 4% reduction) on my big build > box. It is a bug for make depend to be run at all in the default (non-NOCLEAN) case. My commits for this got clobbered, but I still use them here. This seems to save only about 5% currently (down from 10% when the changes were committed in 1998). > My only concern with the patches is that they might interact badly > with a bug I remember from the FreeBSD 1.1R days, but can't reproduce, > in make. Once upon a time, 'make depend all' was different than 'make > depend && make all' because the .depend files weren't re-read after > the depend phase, but before the all phase, whereas two makes this > would be the case. Since this change combines the two, I'm a little > worried about that. Is that still a bug in FreeBSD's make? It won't > matter for a pure, virgin tree, but might for incremental builds... This is not a bug, but is how make works. It shouldn't be a problem if dependall is implemented correctly. dependall should avoid the double tree traversal but somehow build "depend" and "all" sequentially in leaf directories. > ==== //depot/user/imp/freebsd-imp/share/mk/bsd.README#3 (text+ko) ==== > > @@ -169,6 +169,8 @@ > depend: > make the dependencies for the source files, and store > them in the file .depend. > + dependall: > + make depend then make all This at least describes a correct implementation :-). > install: > install the program and its manual pages; if the Makefile > does not itself define the target install, the targets > > ==== //depot/user/imp/freebsd-imp/share/mk/bsd.dep.mk#3 (text+ko) ==== > > @@ -31,6 +31,9 @@ > # Make the dependencies for the source files, and store > # them in the file ${DEPENDFILE}. > # > +# dependall: > +# make depend and then all > +# The wording is different from that in the README, and is poor in both places. > # tags: > # In "ctags" mode, create a tags file for the source files. > # In "gtags" mode, create a (GLOBAL) gtags file for the > @@ -183,3 +186,7 @@ > echo "LDADD -> $$ldadd1" ; \ > fi > .endif > + > +.PHONY: dependall > +.ORDER: afterdepend all > +dependall: depend all .PHONY doesn't work right with BSD make, and is not use for any of the other phony depend targets in FreeBSD. The dependencies seem to be correct, but I think it's a style bug to have afterdepend in the .ORDER statement instead of "depend", at least in FreeBSD. afterdepend isn't actually done after "depend"; "depend" depends on afterdepend so the latter is part of the former (this is another style bug). Bruce