From owner-freebsd-arch  Mon Mar 24  4:59:41 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 06A6137B401
	for <arch@freebsd.org>; Mon, 24 Mar 2003 04:59:38 -0800 (PST)
Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 311B643FB1
	for <arch@freebsd.org>; Mon, 24 Mar 2003 04:59:37 -0800 (PST)
	(envelope-from phk@phk.freebsd.dk)
Received: from critter.freebsd.dk (localhost [127.0.0.1])
	by critter.freebsd.dk (8.12.8/8.12.8) with ESMTP id h2OCxZhV005239
	for <arch@freebsd.org>; Mon, 24 Mar 2003 13:59:35 +0100 (CET)
	(envelope-from phk@phk.freebsd.dk)
To: arch@freebsd.org
Subject: moving GEOM around...
From: Poul-Henning Kamp <phk@phk.freebsd.dk>
Date: Mon, 24 Mar 2003 13:59:35 +0100
Message-ID: <5238.1048510775@critter.freebsd.dk>
X-Spam-Status: No, hits=0.0 required=5.0
	tests=none
	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG


A number of people have suggested that the directory layout of GEOM
sources should be changed.  The main complaint seems to be that sys/geom
contains both subdirectories (bde) and source files.

I personally don't particularly care about that, and as a matter
of fact wasn't even aware that was a rule, but if a significant
number of people think this is wrong I'm willing to repo-copy things
around and fix it, therefore this strawpoll:

Option 1:  No change

Option 2:
	sys/
		geom/
			infra/	
				geom_io.c
				geom_event.c
				...
			bsd/
				geom_bsd.c
			mbr/
				geom_mbr.c
			sunlabel/
				geom_sunlabel.c
			gbde/
				g_bde.c
				g_bde_crypt.c
				...
			...

Option 3:
	sys/
		geom/
			infra/
				geom_io.c
				geom_event.c
				...
			class/
				# contains methods implemented in a single
				# source file
				geom_bsd.c
				geom_mbr.c
				geom_sunlabel.c
				...
			gbde/
				# classes implemented in multiple source
				# files get a subdirectory of their own.

Straw votes in private email please...

I'll draw whatever concensus opinion I can from the emails I get,
and then I'll send the proposal to cvs@ who may at that time shoot
it down as unnecessary repo-bloat.

Poul-Henning

PS: I'm not inclined to entertain a long bikeshed on this issue,
on the more general topic of source tree re-layout and the need for
a democratic process for determing the location of all future files
in our cvs repo.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Mon Mar 24  5:10:11 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 51DAA37B401; Mon, 24 Mar 2003 05:10:09 -0800 (PST)
Received: from k6.locore.ca (k6.locore.ca [198.96.117.170])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 2952943F75; Mon, 24 Mar 2003 05:10:08 -0800 (PST)
	(envelope-from jake@k6.locore.ca)
Received: from k6.locore.ca (localhost.locore.ca [127.0.0.1])
	by k6.locore.ca (8.12.8/8.12.8) with ESMTP id h2ODFcxS078938;
	Mon, 24 Mar 2003 08:15:38 -0500 (EST)
	(envelope-from jake@k6.locore.ca)
Received: (from jake@localhost)
	by k6.locore.ca (8.12.8/8.12.8/Submit) id h2ODFcnb078937;
	Mon, 24 Mar 2003 08:15:38 -0500 (EST)
Date: Mon, 24 Mar 2003 08:15:38 -0500
From: Jake Burkholder <jake@locore.ca>
To: John Baldwin <jhb@FreeBSD.ORG>
Cc: arch@FreeBSD.ORG
Subject: Re: Convert process at_foo events to eventhandlers
Message-ID: <20030324081538.Y76446@locore.ca>
References: <XFMail.20030321151929.jhb@FreeBSD.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <XFMail.20030321151929.jhb@FreeBSD.org>; from jhb@FreeBSD.ORG on Fri, Mar 21, 2003 at 03:19:29PM -0500
X-Spam-Status: No, hits=-29.0 required=5.0
	tests=AWL,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES,
	      REPLY_WITH_QUOTES,USER_AGENT_MUTT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

Apparently, On Fri, Mar 21, 2003 at 03:19:29PM -0500,
	John Baldwin said words to the effect of;

> I'd like to convert the process at_fork, at_exec, and at_exit
> events to be regular eventhandlers instead.  This way I get to
> leverage the locking of the existing eventhandlers w/o having
> to duplicate it in three other places.  The patch to do this is
> at http://www.FreeBSD.org/~jhb/patches/proc_event.patch.
> Note that the old API (at_foo, rm_at_foo) has been removed as
> I can not easily implement the rm_at_foo functionality using
> eventhandlers since eventhandlers allow for multiple instances
> of a function in a list and use cookies instead of using the
> function pointer directly to remove events.  There is precedent
> for this in that at_shutdown() also died when at_shutdown() was
> converted to an eventhandler.  This patch also defines some
> generic eventhandler priorities so that users of eventhandlers
> don't always have to define new constants for priorities.
> 
> Comments?

Do it!

Jake

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Mon Mar 24  7:22:32 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 4BFD637B401
	for <arch@freebsd.org>; Mon, 24 Mar 2003 07:22:27 -0800 (PST)
Received: from mail.nsu.ru (mx.nsu.ru [193.124.215.71])
	by mx1.FreeBSD.org (Postfix) with ESMTP id D6AA343F3F
	for <arch@freebsd.org>; Mon, 24 Mar 2003 07:22:25 -0800 (PST)
	(envelope-from fjoe@iclub.nsu.ru)
Received: from drweb by mail.nsu.ru with drweb-scanned (Exim 3.20 #1)
	id 18xTmH-0006BX-00
	for arch@freebsd.org; Mon, 24 Mar 2003 21:22:21 +0600
Received: from iclub.nsu.ru ([193.124.215.97] ident=root)
	by mail.nsu.ru with esmtp (Exim 3.20 #1)
	id 18xTmG-0006BH-00
	for arch@freebsd.org; Mon, 24 Mar 2003 21:22:20 +0600
Received: from iclub.nsu.ru (fjoe@localhost [127.0.0.1])
	by iclub.nsu.ru (8.12.8/8.12.8) with ESMTP id h2OFM8j1094599
	for <arch@freebsd.org>; Mon, 24 Mar 2003 21:22:08 +0600 (NS)
	(envelope-from fjoe@iclub.nsu.ru)
Received: (from fjoe@localhost)
	by iclub.nsu.ru (8.12.8/8.12.8/Submit) id h2OFM7MZ094596
	for arch@freebsd.org; Mon, 24 Mar 2003 21:22:07 +0600 (NS)
Date: Mon, 24 Mar 2003 21:22:06 +0600
From: Max Khon <fjoe@iclub.nsu.ru>
To: arch@freebsd.org
Subject: [fjoe@iclub.nsu.ru: Re: thread-safe realpath]
Message-ID: <20030324212205.A94544@iclub.nsu.ru>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
X-Envelope-To: arch@freebsd.org
X-Spam-Status: No, hits=-16.1 required=5.0
	tests=EMAIL_ATTRIBUTION,QUOTED_EMAIL_TEXT,USER_AGENT_MUTT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

hi, there!

I am forwarding this e-mail to arch@ because standards@ has been
silent on this.

----- Forwarded message from Max Khon <fjoe@iclub.nsu.ru> -----

Date: Fri, 21 Mar 2003 04:32:23 +0600
From: Max Khon <fjoe@iclub.nsu.ru>
To: standards@freebsd.org
Subject: Re: thread-safe realpath

hi, there!

On Fri, Mar 21, 2003 at 03:38:21AM +0600, Max Khon wrote:

> Constantin Svintsoff has submitted thread-safe realpath() implementation
> (implementation that does not use chdir(2)).
> The implementation is feature-compatible with FreeBSD implementation, i.e.
> if the last component of specified path can't be stat'ed and there is no
> trailing slash, realpath succeeds.
> 
> I fixed a couple of bugs in it and would like to commit it to HEAD
> if there will be no objections.
> 
> Test program is attached. The test simply creates two threads and calls
> realpath() in each. If the test is compiled with truepath() #if-0'ed
> one of the assertions fail after some time (you may need to increase
> number of iterations if you have very fast machine, mine is Athlon 850).
> 
> Any comments are highly appreciated.
> Please reply directly (I am not subscribed).

I have also included realpath test from glibc 2.2.2.
Tarball can be found here:

http://people.freebsd.org/~fjoe/realpath.tar.gz

/fjoe

----- End forwarded message -----


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Mon Mar 24  7:46:28 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 89F4E37B401
	for <arch@freebsd.org>; Mon, 24 Mar 2003 07:46:25 -0800 (PST)
Received: from numeri.campus.luth.se (numeri.campus.luth.se [130.240.197.103])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 7774F43F3F
	for <arch@freebsd.org>; Mon, 24 Mar 2003 07:46:24 -0800 (PST)
	(envelope-from k@numeri.campus.luth.se)
Received: from numeri.campus.luth.se (localhost [127.0.0.1])
	by numeri.campus.luth.se (8.12.8/8.12.7) with ESMTP id h2OFkMho052694;
	Mon, 24 Mar 2003 16:46:22 +0100 (CET)
	(envelope-from k@numeri.campus.luth.se)
Received: (from k@localhost)
	by numeri.campus.luth.se (8.12.8/8.12.7/Submit) id h2OFkMkN052669;
	Mon, 24 Mar 2003 16:46:22 +0100 (CET)
Date: Mon, 24 Mar 2003 16:46:21 +0100
From: Johan Karlsson <k@numeri.campus.luth.se>
To: Max Khon <fjoe@iclub.nsu.ru>
Cc: arch@freebsd.org, Sheldon Hearn <sheldonh@starjuice.net>
Subject: Re: [fjoe@iclub.nsu.ru: Re: thread-safe realpath]
Message-ID: <20030324154621.GA82437@numeri.campus.luth.se>
References: <20030324212205.A94544@iclub.nsu.ru>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20030324212205.A94544@iclub.nsu.ru>
User-Agent: Mutt/1.4i
X-Spam-Status: No, hits=-33.1 required=5.0
	tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      QUOTE_TWICE_1,REFERENCES,REPLY_WITH_QUOTES,USER_AGENT_MUTT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

On Mon, Mar 24, 2003 at 21:22 (+0600) +0000, Max Khon wrote:
> On Fri, Mar 21, 2003 at 03:38:21AM +0600, Max Khon wrote:
> 
> > Constantin Svintsoff has submitted thread-safe realpath() implementation
> > (implementation that does not use chdir(2)).
> > The implementation is feature-compatible with FreeBSD implementation, i.e.
> > if the last component of specified path can't be stat'ed and there is no
> > trailing slash, realpath succeeds.
> > 
> > I fixed a couple of bugs in it and would like to commit it to HEAD
> > if there will be no objections.
> > 
> > Test program is attached. The test simply creates two threads and calls
> > realpath() in each. If the test is compiled with truepath() #if-0'ed
> > one of the assertions fail after some time (you may need to increase
> > number of iterations if you have very fast machine, mine is Athlon 850).
> > 
> > Any comments are highly appreciated.

How does this affect PR 12244?
I've sent a patch to -audit for review a month ago and I'm about
to commit that (just doing a final make universe).
http://docs.freebsd.org/cgi/getmsg.cgi?fetch=0+0+archive/2003/freebsd-audit/20030209.freebsd-audit


/Johan K


> > Please reply directly (I am not subscribed).
> 
> I have also included realpath test from glibc 2.2.2.
> Tarball can be found here:
> 
> http://people.freebsd.org/~fjoe/realpath.tar.gz
> 
> /fjoe
> 
> ----- End forwarded message -----
> 
> 
> To Unsubscribe: send mail to majordomo@FreeBSD.org
> with "unsubscribe freebsd-arch" in the body of the message
> 

-- 
Johan Karlsson		mailto:k@numeri.campus.luth.se

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Mon Mar 24  8:24: 0 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id BBB4A37B401
	for <freebsd-arch@freebsd.org>; Mon, 24 Mar 2003 08:23:54 -0800 (PST)
Received: from smtp-relay.omnis.com (smtp-relay.omnis.com [216.239.128.27])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 1799543FAF
	for <freebsd-arch@freebsd.org>; Mon, 24 Mar 2003 08:23:52 -0800 (PST)
	(envelope-from wes@softweyr.com)
Received: from softweyr.homeunix.net (66-91-236-204.san.rr.com [66.91.236.204])
	by smtp-relay.omnis.com (Postfix) with ESMTP
	id A737A42F96; Mon, 24 Mar 2003 08:23:49 -0800 (PST)
From: Wes Peters <wes@softweyr.com>
Organization: Softweyr
To: freebsd-arch@freebsd.org
Subject: Patch to protect process from pageout killing
Date: Mon, 24 Mar 2003 08:23:48 -0800
User-Agent: KMail/1.5
MIME-Version: 1.0
Content-Type: text/plain;
  charset="us-ascii"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200303240823.48262.wes@softweyr.com>
X-Spam-Status: No, hits=-6.0 required=5.0
	tests=PATCH_UNIFIED_DIFF,USER_AGENT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

As promised, here's the patch to protect a process from being killed when 
pageout is in memory shortage.  This allows a process to specify that it 
is important enough to be skipped when pageout is looking for the largest 
process to kill.

My needs are simple.  We make a box that is a web proxy and runs from a 
memory disk, using flash for permanent storage.  The flash is mounted 
only when a configuration write is needed, the box runs from the memory 
disk.  We've experienced a problem at certain customer sites where bind 
will consume a lot (~30 MB) of ram and then pageout will kill the largest 
process, which is usually either named or squid.  This pretty much kills 
the box.  We'd much rather have pageout kill off some of the squid worker 
processes, we can recover from that.

Is this a good approach to the problem?  Feedback welcome.

--- kern/kern_resource.c.orig	Sun Mar 23 22:12:55 2003
+++ kern/kern_resource.c	Sun Mar 23 22:14:17 2003
@@ -562,12 +562,12 @@
 	}
 
 	switch (which) {
-
 	case RLIMIT_CPU:
 		mtx_lock_spin(&sched_lock);
 		p->p_cpulimit = limp->rlim_cur;
 		mtx_unlock_spin(&sched_lock);
 		break;
+
 	case RLIMIT_DATA:
 		if (limp->rlim_cur > maxdsiz)
 			limp->rlim_cur = maxdsiz;
@@ -625,6 +625,15 @@
 		if (limp->rlim_max < 1)
 			limp->rlim_max = 1;
 		break;
+
+        case RLIMIT_PROTECT:
+                mtx_lock_spin(&sched_lock);
+                if (limp->rlim_cur)
+                        p->p_flag |= P_PROTECTED;
+                else
+                        p->p_flag &= ~P_PROTECTED;
+                mtx_unlock_spin(&sched_lock);
+                break;
 	}
 	*alimp = *limp;
 	return (0);
--- sys/proc.h.orig	Sun Mar 23 21:36:13 2003
+++ sys/proc.h	Sun Mar 23 21:37:56 2003
@@ -629,6 +629,7 @@
 #define	P_EXEC		0x04000	/* Process called exec. */
 #define	P_THREADED	0x08000	/* Process is using threads. */
 #define	P_CONTINUED	0x10000	/* Proc has continued from a stopped state. 
*/
+#define	P_PROTECTED	0x20000	/* Do not kill on memory overcommit. */
 
 /* flags that control how threads may be suspended for some reason */
 #define	P_STOPPED_SIG		0x20000	/* Stopped due to SIGSTOP/SIGTSTP */
--- sys/resource.h.orig	Sun Mar 23 22:07:50 2003
+++ sys/resource.h	Sun Mar 23 22:09:45 2003
@@ -92,8 +92,9 @@
 #define	RLIMIT_NOFILE	8		/* number of open files */
 #define	RLIMIT_SBSIZE	9		/* maximum size of all socket buffers */
 #define RLIMIT_VMEM	10		/* virtual process size (inclusive of mmap) */
+#define	RLIMIT_PROTECT	11		/* protect process from overcommit kill */
 
-#define	RLIM_NLIMITS	11		/* number of resource limits */
+#define	RLIM_NLIMITS	12		/* number of resource limits */
 
 #define	RLIM_INFINITY	((rlim_t)(((u_quad_t)1 << 63) - 1))
 
@@ -115,6 +116,7 @@
 	"nofile",
 	"sbsize",
 	"vmem",
+	"protect",
 };
 #endif
 
--- vm/vm_pageout.c.orig	Sun Mar 23 21:38:19 2003
+++ vm/vm_pageout.c	Sun Mar 23 21:40:15 2003
@@ -1184,9 +1184,10 @@
 			if (PROC_TRYLOCK(p) == 0)
 				continue;
 			/*
-			 * if this is a system process, skip it
+			 * If this is a system or protected process, skip it.
 			 */
 			if ((p->p_flag & P_SYSTEM) || (p->p_pid == 1) ||
+			    (p->p_flag & P_PROTECTED) ||
 			    ((p->p_pid < 48) && (vm_swap_size != 0))) {
 				PROC_UNLOCK(p);
 				continue;


-- 

        Where am I, and what am I doing in this handbasket?

Wes Peters                                               wes@softweyr.com


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Mon Mar 24  8:32: 2 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id B474637B401
	for <arch@freebsd.org>; Mon, 24 Mar 2003 08:31:57 -0800 (PST)
Received: from mail.nsu.ru (mx.nsu.ru [193.124.215.71])
	by mx1.FreeBSD.org (Postfix) with ESMTP id AF40643F85
	for <arch@freebsd.org>; Mon, 24 Mar 2003 08:31:56 -0800 (PST)
	(envelope-from fjoe@iclub.nsu.ru)
Received: from drweb by mail.nsu.ru with drweb-scanned (Exim 3.20 #1)
	id 18xUqh-0000pb-00; Mon, 24 Mar 2003 22:30:59 +0600
Received: from iclub.nsu.ru ([193.124.215.97] ident=root)
	by mail.nsu.ru with esmtp (Exim 3.20 #1)
	id 18xUqg-0000oO-00; Mon, 24 Mar 2003 22:30:58 +0600
Received: from iclub.nsu.ru (fjoe@localhost [127.0.0.1])
	by iclub.nsu.ru (8.12.8/8.12.8) with ESMTP id h2OGUgj1096445;
	Mon, 24 Mar 2003 22:30:42 +0600 (NS)
	(envelope-from fjoe@iclub.nsu.ru)
Received: (from fjoe@localhost)
	by iclub.nsu.ru (8.12.8/8.12.8/Submit) id h2OGUf0g096444;
	Mon, 24 Mar 2003 22:30:42 +0600 (NS)
Date: Mon, 24 Mar 2003 22:30:41 +0600
From: Max Khon <fjoe@iclub.nsu.ru>
To: Johan Karlsson <k@numeri.campus.luth.se>
Cc: arch@freebsd.org, Sheldon Hearn <sheldonh@starjuice.net>
Subject: Re: [fjoe@iclub.nsu.ru: Re: thread-safe realpath]
Message-ID: <20030324223041.A96310@iclub.nsu.ru>
References: <20030324212205.A94544@iclub.nsu.ru> <20030324154621.GA82437@numeri.campus.luth.se>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <20030324154621.GA82437@numeri.campus.luth.se>; from k@numeri.campus.luth.se on Mon, Mar 24, 2003 at 04:46:21PM +0100
X-Envelope-To: k@numeri.campus.luth.se,
 arch@freebsd.org,
 sheldonh@starjuice.net
X-Spam-Status: No, hits=-24.6 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      QUOTE_TWICE_1,REFERENCES,REPLY_WITH_QUOTES,USER_AGENT_MUTT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

hi, there!

On Mon, Mar 24, 2003 at 04:46:21PM +0100, Johan Karlsson wrote:

> > > Constantin Svintsoff has submitted thread-safe realpath() implementation
> > > (implementation that does not use chdir(2)).
> > > The implementation is feature-compatible with FreeBSD implementation, i.e.
> > > if the last component of specified path can't be stat'ed and there is no
> > > trailing slash, realpath succeeds.
> > > 
> > > I fixed a couple of bugs in it and would like to commit it to HEAD
> > > if there will be no objections.
> > > 
> > > Test program is attached. The test simply creates two threads and calls
> > > realpath() in each. If the test is compiled with truepath() #if-0'ed
> > > one of the assertions fail after some time (you may need to increase
> > > number of iterations if you have very fast machine, mine is Athlon 850).
> > > 
> > > Any comments are highly appreciated.
> 
> How does this affect PR 12244?
> I've sent a patch to -audit for review a month ago and I'm about
> to commit that (just doing a final make universe).
> http://docs.freebsd.org/cgi/getmsg.cgi?fetch=0+0+archive/2003/freebsd-audit/20030209.freebsd-audit

I do not think that this patch is needed if Constantin's version will
be committed. I plan to do this in about a week.

/fjoe


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Mon Mar 24  8:36:30 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 319F337B401
	for <freebsd-arch@freebsd.org>; Mon, 24 Mar 2003 08:36:25 -0800 (PST)
Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 5070043FA3
	for <freebsd-arch@freebsd.org>; Mon, 24 Mar 2003 08:36:24 -0800 (PST)
	(envelope-from phk@phk.freebsd.dk)
Received: from critter.freebsd.dk (localhost [127.0.0.1])
	by critter.freebsd.dk (8.12.8/8.12.8) with ESMTP id h2OGaMhV007020;
	Mon, 24 Mar 2003 17:36:22 +0100 (CET)
	(envelope-from phk@phk.freebsd.dk)
To: Wes Peters <wes@softweyr.com>
Cc: freebsd-arch@freebsd.org
Subject: Re: Patch to protect process from pageout killing 
From: "Poul-Henning Kamp" <phk@phk.freebsd.dk>
In-Reply-To: Your message of "Mon, 24 Mar 2003 08:23:48 PST."
             <200303240823.48262.wes@softweyr.com> 
Date: Mon, 24 Mar 2003 17:36:22 +0100
Message-ID: <7019.1048523782@critter.freebsd.dk>
X-Spam-Status: No, hits=-9.7 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REPLY_WITH_QUOTES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

In message <200303240823.48262.wes@softweyr.com>, Wes Peters writes:

>As promised, here's the patch to protect a process from being killed when 
>pageout is in memory shortage.  This allows a process to specify that it 
>is important enough to be skipped when pageout is looking for the largest 
>process to kill.
>
>My needs are simple.  We make a box that is a web proxy and runs from a 
>memory disk, using flash for permanent storage.  The flash is mounted 
>only when a configuration write is needed, the box runs from the memory 
>disk.  We've experienced a problem at certain customer sites where bind 
>will consume a lot (~30 MB) of ram and then pageout will kill the largest 
>process, which is usually either named or squid.  This pretty much kills 
>the box.  We'd much rather have pageout kill off some of the squid worker 
>processes, we can recover from that.
>
>Is this a good approach to the problem?  Feedback welcome.

(Ignoring the white-space change)

I can certainly see the point, but I'm not sure this is the way.

I am not sure that we want to use the resource limits facility for
booleans, some of the logic sourounding the suser checks may not
hold tight.

Also, doesn't this result in the flag being inerited with fork() and
thereby negating the effect you are seeking for squid ?

Poul-Henning


>
>--- kern/kern_resource.c.orig	Sun Mar 23 22:12:55 2003
>+++ kern/kern_resource.c	Sun Mar 23 22:14:17 2003
>@@ -562,12 +562,12 @@
> 	}
> 
> 	switch (which) {
>-
> 	case RLIMIT_CPU:
> 		mtx_lock_spin(&sched_lock);
> 		p->p_cpulimit = limp->rlim_cur;
> 		mtx_unlock_spin(&sched_lock);
> 		break;
>+
> 	case RLIMIT_DATA:
> 		if (limp->rlim_cur > maxdsiz)
> 			limp->rlim_cur = maxdsiz;
>@@ -625,6 +625,15 @@
> 		if (limp->rlim_max < 1)
> 			limp->rlim_max = 1;
> 		break;
>+
>+        case RLIMIT_PROTECT:
>+                mtx_lock_spin(&sched_lock);
>+                if (limp->rlim_cur)
>+                        p->p_flag |= P_PROTECTED;
>+                else
>+                        p->p_flag &= ~P_PROTECTED;
>+                mtx_unlock_spin(&sched_lock);
>+                break;
> 	}
> 	*alimp = *limp;
> 	return (0);
>--- sys/proc.h.orig	Sun Mar 23 21:36:13 2003
>+++ sys/proc.h	Sun Mar 23 21:37:56 2003
>@@ -629,6 +629,7 @@
> #define	P_EXEC		0x04000	/* Process called exec. */
> #define	P_THREADED	0x08000	/* Process is using threads. */
> #define	P_CONTINUED	0x10000	/* Proc has continued from a stopped state. 
>*/
>+#define	P_PROTECTED	0x20000	/* Do not kill on memory overcommit. */
> 
> /* flags that control how threads may be suspended for some reason */
> #define	P_STOPPED_SIG		0x20000	/* Stopped due to SIGSTOP/SIGTSTP */
>--- sys/resource.h.orig	Sun Mar 23 22:07:50 2003
>+++ sys/resource.h	Sun Mar 23 22:09:45 2003
>@@ -92,8 +92,9 @@
> #define	RLIMIT_NOFILE	8		/* number of open files */
> #define	RLIMIT_SBSIZE	9		/* maximum size of all socket buffers */
> #define RLIMIT_VMEM	10		/* virtual process size (inclusive of mmap) */
>+#define	RLIMIT_PROTECT	11		/* protect process from overcommit kill */
> 
>-#define	RLIM_NLIMITS	11		/* number of resource limits */
>+#define	RLIM_NLIMITS	12		/* number of resource limits */
> 
> #define	RLIM_INFINITY	((rlim_t)(((u_quad_t)1 << 63) - 1))
> 
>@@ -115,6 +116,7 @@
> 	"nofile",
> 	"sbsize",
> 	"vmem",
>+	"protect",
> };
> #endif
> 
>--- vm/vm_pageout.c.orig	Sun Mar 23 21:38:19 2003
>+++ vm/vm_pageout.c	Sun Mar 23 21:40:15 2003
>@@ -1184,9 +1184,10 @@
> 			if (PROC_TRYLOCK(p) == 0)
> 				continue;
> 			/*
>-			 * if this is a system process, skip it
>+			 * If this is a system or protected process, skip it.
> 			 */
> 			if ((p->p_flag & P_SYSTEM) || (p->p_pid == 1) ||
>+			    (p->p_flag & P_PROTECTED) ||
> 			    ((p->p_pid < 48) && (vm_swap_size != 0))) {
> 				PROC_UNLOCK(p);
> 				continue;
>
>
>-- 
>
>        Where am I, and what am I doing in this handbasket?
>
>Wes Peters                                               wes@softweyr.com
>
>

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Mon Mar 24 11: 2:55 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 3949037B4A4
	for <freebsd-arch@freebsd.org>; Mon, 24 Mar 2003 11:02:52 -0800 (PST)
Received: from uitm.zenon.net (uitm.zenon.net [195.2.69.86])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 809E843FDF
	for <freebsd-arch@freebsd.org>; Mon, 24 Mar 2003 11:02:45 -0800 (PST)
	(envelope-from uitm@zenon.net)
From: Andrey Alekseyev <uitm@zenon.net>
Message-Id: <200303241902.h2OJ2a252708@uitm.zenon.net>
Subject: Re: Patch to protect process from pageout killing
In-Reply-To: <200303240823.48262.wes@softweyr.com> from Wes Peters at "Mar 24,
 2003 08:23:48 am"
To: Wes Peters <wes@softweyr.com>
Date: Mon, 24 Mar 2003 22:02:36 +0300 (MSK)
Cc: freebsd-arch@freebsd.org
X-Mailer: ELM [version 2.4ME+ PL61 (25)]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, hits=-3.3 required=5.0
	tests=IN_REP_TO
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

> As promised, here's the patch to protect a process from being killed when 
> pageout is in memory shortage.  This allows a process to specify that it 

Just in case, anyone gets interested, here is another one I made about
two years ago for our own needs (mass web-hosting, etc.)
http://www.blackflag.ru/patches/vm_pageout.c.diff

:)

Allows to specify "safe" process(es) names in a sysctl variable. Doesn't
touch root processes (that's what I needed as well) and sends SIGKILL
if process is not willing to terminate.

I recall Matt Dillon had some very useful comments about the possibility
of further development of such features. Like some preferences of
what processes to kill first (some other criteria in addition to the
process size).


-- 
Andrey Alekseyev. Zenon N.S.P.
Senior Unix systems administrator

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Mon Mar 24 11: 9:12 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id C10A537B408
	for <freebsd-arch@freebsd.org>; Mon, 24 Mar 2003 11:09:05 -0800 (PST)
Received: from mail.speakeasy.net (mail17.speakeasy.net [216.254.0.217])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 27BA743FB1
	for <freebsd-arch@freebsd.org>; Mon, 24 Mar 2003 11:09:05 -0800 (PST)
	(envelope-from jhb@FreeBSD.org)
Received: (qmail 19059 invoked from network); 24 Mar 2003 19:09:10 -0000
Received: from unknown (HELO server.baldwin.cx) ([216.27.160.63])
          (envelope-sender <jhb@FreeBSD.org>)
          by mail17.speakeasy.net (qmail-ldap-1.03) with DES-CBC3-SHA encrypted SMTP
          for <freebsd-arch@freebsd.org>; 24 Mar 2003 19:09:10 -0000
Received: from laptop.baldwin.cx (gw1.twc.weather.com [216.133.140.1])
	by server.baldwin.cx (8.12.8/8.12.8) with ESMTP id h2OJ92Ov093096;
	Mon, 24 Mar 2003 14:09:02 -0500 (EST)
	(envelope-from jhb@FreeBSD.org)
Message-ID: <XFMail.20030324140902.jhb@FreeBSD.org>
X-Mailer: XFMail 1.5.4 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <200303240823.48262.wes@softweyr.com>
Date: Mon, 24 Mar 2003 14:09:02 -0500 (EST)
From: John Baldwin <jhb@FreeBSD.org>
To: Wes Peters <wes@softweyr.com>
Subject: RE: Patch to protect process from pageout killing
Cc: freebsd-arch@freebsd.org
X-Spam-Status: No, hits=-19.5 required=5.0
	tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REPLY_WITH_QUOTES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG


On 24-Mar-2003 Wes Peters wrote:
> As promised, here's the patch to protect a process from being killed when 
> pageout is in memory shortage.  This allows a process to specify that it 
> is important enough to be skipped when pageout is looking for the largest 
> process to kill.
> 
> My needs are simple.  We make a box that is a web proxy and runs from a 
> memory disk, using flash for permanent storage.  The flash is mounted 
> only when a configuration write is needed, the box runs from the memory 
> disk.  We've experienced a problem at certain customer sites where bind 
> will consume a lot (~30 MB) of ram and then pageout will kill the largest 
> process, which is usually either named or squid.  This pretty much kills 
> the box.  We'd much rather have pageout kill off some of the squid worker 
> processes, we can recover from that.
> 
> Is this a good approach to the problem?  Feedback welcome.

I think that adopting the SIGDANGER approach would be better rather
than rolling our own private interface.

> @@ -625,6 +625,15 @@
>               if (limp->rlim_max < 1)
>                       limp->rlim_max = 1;
>               break;
> +
> +        case RLIMIT_PROTECT:
> +                mtx_lock_spin(&sched_lock);
> +                if (limp->rlim_cur)
> +                        p->p_flag |= P_PROTECTED;
> +                else
> +                        p->p_flag &= ~P_PROTECTED;
> +                mtx_unlock_spin(&sched_lock);
> +                break;

p_flag is protected by PROC_LOCK, not sched_lock.

-- 

John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Mon Mar 24 13:35:29 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 0313E37B404
	for <freebsd-arch@FreeBSD.ORG>; Mon, 24 Mar 2003 13:35:26 -0800 (PST)
Received: from dan.emsphone.com (dan.emsphone.com [199.67.51.101])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 1E10143F75
	for <freebsd-arch@FreeBSD.ORG>; Mon, 24 Mar 2003 13:35:25 -0800 (PST)
	(envelope-from dan@dan.emsphone.com)
Received: (from dan@localhost)
	by dan.emsphone.com (8.12.7/8.12.7) id h2OLZKPB045421;
	Mon, 24 Mar 2003 15:35:20 -0600 (CST)
	(envelope-from dan)
Date: Mon, 24 Mar 2003 15:35:20 -0600
From: Dan Nelson <dnelson@allantgroup.com>
To: Poul-Henning Kamp <phk@phk.freebsd.dk>
Cc: Wes Peters <wes@softweyr.com>, freebsd-arch@FreeBSD.ORG
Subject: Re: Patch to protect process from pageout killing
Message-ID: <20030324213519.GA63147@dan.emsphone.com>
References: <200303240823.48262.wes@softweyr.com> <7019.1048523782@critter.freebsd.dk>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <7019.1048523782@critter.freebsd.dk>
X-OS: FreeBSD 5.0-CURRENT
X-message-flag: Outlook Error
User-Agent: Mutt/1.5.4i
X-Spam-Status: No, hits=-26.0 required=5.0
	tests=IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES,REPLY_WITH_QUOTES,
	      USER_AGENT_MUTT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

In the last episode (Mar 24), Poul-Henning Kamp said:
> In message <200303240823.48262.wes@softweyr.com>, Wes Peters writes:
> > As promised, here's the patch to protect a process from being
> > killed when pageout is in memory shortage.  This allows a process
> > to specify that it is important enough to be skipped when pageout
> > is looking for the largest process to kill.
> >
> > My needs are simple.  We make a box that is a web proxy and runs
> > from a memory disk, using flash for permanent storage.  The flash
> > is mounted only when a configuration write is needed, the box runs
> > from the memory disk.  We've experienced a problem at certain
> > customer sites where bind will consume a lot (~30 MB) of ram and
> > then pageout will kill the largest process, which is usually either
> > named or squid.  This pretty much kills the box.  We'd much rather
> > have pageout kill off some of the squid worker processes, we can
> > recover from that.
> >
> > Is this a good approach to the problem?  Feedback welcome.
> 
> I can certainly see the point, but I'm not sure this is the way.
> 
> I am not sure that we want to use the resource limits facility for
> booleans, some of the logic sourounding the suser checks may not hold
> tight.

How about changing the kill logic to look at RLIMIT_RSS?  The process
exceeding its limit by the largest amount gets killed.  That way you
can exempt certain processes by raising their limit.  Set named's limit
to say 10MB, and when memory gets tight the system will see it's
exceeding its quota by 20MB and kill it first.

-- 
	Dan Nelson
	dnelson@allantgroup.com

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Mon Mar 24 17:23:13 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id EAF6937B404
	for <freebsd-arch@FreeBSD.ORG>; Mon, 24 Mar 2003 17:23:05 -0800 (PST)
Received: from HAL9000.homeunix.com (12-233-57-131.client.attbi.com [12.233.57.131])
	by mx1.FreeBSD.org (Postfix) with ESMTP id ED38A43FAF
	for <freebsd-arch@FreeBSD.ORG>; Mon, 24 Mar 2003 17:23:04 -0800 (PST)
	(envelope-from das@FreeBSD.ORG)
Received: from HAL9000.homeunix.com (localhost [127.0.0.1])
	by HAL9000.homeunix.com (8.12.6/8.12.5) with ESMTP id h2P1N4ah004584;
	Mon, 24 Mar 2003 17:23:04 -0800 (PST)
	(envelope-from das@FreeBSD.ORG)
Received: (from das@localhost)
	by HAL9000.homeunix.com (8.12.6/8.12.5/Submit) id h2P1N3Ua004583;
	Mon, 24 Mar 2003 17:23:03 -0800 (PST)
	(envelope-from das@FreeBSD.ORG)
Date: Mon, 24 Mar 2003 17:23:03 -0800
From: David Schultz <das@FreeBSD.ORG>
To: Wes Peters <wes@softweyr.com>
Cc: freebsd-arch@FreeBSD.ORG
Subject: Re: Patch to protect process from pageout killing
Message-ID: <20030325012303.GA4406@HAL9000.homeunix.com>
Mail-Followup-To: Wes Peters <wes@softweyr.com>,
	freebsd-arch@FreeBSD.ORG
References: <200303240823.48262.wes@softweyr.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <200303240823.48262.wes@softweyr.com>
X-Spam-Status: No, hits=-19.6 required=5.0
	tests=IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES,REPLY_WITH_QUOTES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

Thus spake Wes Peters <wes@softweyr.com>:
> As promised, here's the patch to protect a process from being killed when 
> pageout is in memory shortage.  This allows a process to specify that it 
> is important enough to be skipped when pageout is looking for the largest 
> process to kill.
> 
> My needs are simple.  We make a box that is a web proxy and runs from a 
> memory disk, using flash for permanent storage.  The flash is mounted 
> only when a configuration write is needed, the box runs from the memory 
> disk.  We've experienced a problem at certain customer sites where bind 
> will consume a lot (~30 MB) of ram and then pageout will kill the largest 
> process, which is usually either named or squid.  This pretty much kills 
> the box.  We'd much rather have pageout kill off some of the squid worker 
> processes, we can recover from that.

Very nice.  Inheritance of this attribute seems to be a
contentious issue.  Making inheritance tunable might be a good
idea.  You wouldn't be able to piggyback on rlimit, though.

There's a significant userland component of this as well, although
that's probably a job for another day.  It basically consists of
making it possible to specify that certain standard system daemons
should have this attribute.

> +			    (p->p_flag & P_PROTECTED) ||
>  			    ((p->p_pid < 48) && (vm_swap_size != 0))) {
>  				PROC_UNLOCK(p);
>  				continue;

The pid < 48 magic can probably go away, while you're at it.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Mon Mar 24 17:28:52 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 58EA037B401
	for <freebsd-arch@FreeBSD.ORG>; Mon, 24 Mar 2003 17:28:50 -0800 (PST)
Received: from HAL9000.homeunix.com (12-233-57-131.client.attbi.com [12.233.57.131])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 92F3643FA3
	for <freebsd-arch@FreeBSD.ORG>; Mon, 24 Mar 2003 17:28:49 -0800 (PST)
	(envelope-from das@FreeBSD.ORG)
Received: from HAL9000.homeunix.com (localhost [127.0.0.1])
	by HAL9000.homeunix.com (8.12.6/8.12.5) with ESMTP id h2P1Siah004601;
	Mon, 24 Mar 2003 17:28:44 -0800 (PST)
	(envelope-from das@FreeBSD.ORG)
Received: (from das@localhost)
	by HAL9000.homeunix.com (8.12.6/8.12.5/Submit) id h2P1SiZ1004600;
	Mon, 24 Mar 2003 17:28:44 -0800 (PST)
	(envelope-from das@FreeBSD.ORG)
Date: Mon, 24 Mar 2003 17:28:44 -0800
From: David Schultz <das@FreeBSD.ORG>
To: Dan Nelson <dnelson@allantgroup.com>
Cc: Poul-Henning Kamp <phk@phk.freebsd.dk>,
	Wes Peters <wes@softweyr.com>, freebsd-arch@FreeBSD.ORG
Subject: Re: Patch to protect process from pageout killing
Message-ID: <20030325012844.GB4406@HAL9000.homeunix.com>
Mail-Followup-To: Dan Nelson <dnelson@allantgroup.com>,
	Poul-Henning Kamp <phk@phk.freebsd.dk>,
	Wes Peters <wes@softweyr.com>, freebsd-arch@FreeBSD.ORG
References: <200303240823.48262.wes@softweyr.com> <7019.1048523782@critter.freebsd.dk> <20030324213519.GA63147@dan.emsphone.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20030324213519.GA63147@dan.emsphone.com>
X-Spam-Status: No, hits=-19.6 required=5.0
	tests=AWL,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES,
	      REPLY_WITH_QUOTES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

Thus spake Dan Nelson <dnelson@allantgroup.com>:
> In the last episode (Mar 24), Poul-Henning Kamp said:
> > In message <200303240823.48262.wes@softweyr.com>, Wes Peters writes:
> > > As promised, here's the patch to protect a process from being
> > > killed when pageout is in memory shortage.  This allows a process
> > > to specify that it is important enough to be skipped when pageout
> > > is looking for the largest process to kill.
> > >
> > > My needs are simple.  We make a box that is a web proxy and runs
> > > from a memory disk, using flash for permanent storage.  The flash
> > > is mounted only when a configuration write is needed, the box runs
> > > from the memory disk.  We've experienced a problem at certain
> > > customer sites where bind will consume a lot (~30 MB) of ram and
> > > then pageout will kill the largest process, which is usually either
> > > named or squid.  This pretty much kills the box.  We'd much rather
> > > have pageout kill off some of the squid worker processes, we can
> > > recover from that.
> > >
> > > Is this a good approach to the problem?  Feedback welcome.
> > 
> > I can certainly see the point, but I'm not sure this is the way.
> > 
> > I am not sure that we want to use the resource limits facility for
> > booleans, some of the logic sourounding the suser checks may not hold
> > tight.
> 
> How about changing the kill logic to look at RLIMIT_RSS?  The process
> exceeding its limit by the largest amount gets killed.  That way you
> can exempt certain processes by raising their limit.  Set named's limit
> to say 10MB, and when memory gets tight the system will see it's
> exceeding its quota by 20MB and kill it first.

I think that's overengineering the problem.  First of all, it
means that on any system where RLIMIT_RSS is unlimited by default,
the machine now deadlocks when it runs out of memory.  Second, it
is only marginally useful to go as far as specifying priorities
and quotas and such on process killability.  Most of the time,
people can divide the processes on thier system into two
categories: critical and killable.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Mon Mar 24 17:52:47 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 6B15137B404; Mon, 24 Mar 2003 17:52:44 -0800 (PST)
Received: from mail01.stbernard.com (mail01.stbernard.com [64.154.93.162])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 356EB43F75; Mon, 24 Mar 2003 17:52:41 -0800 (PST)
	(envelope-from wes@softweyr.com)
Received: from salty.rapid.stbernard.com ([192.168.4.61]) by mail01.stbernard.com with Microsoft SMTPSVC(5.0.2195.5329);
	 Mon, 24 Mar 2003 17:52:40 -0800
From: Wes Peters <wes@softweyr.com>
Organization: Softweyr.com
To: John Baldwin <jhb@FreeBSD.org>
Subject: Re: Patch to protect process from pageout killing
Date: Mon, 24 Mar 2003 17:52:40 -0800
User-Agent: KMail/1.5
Cc: freebsd-arch@freebsd.org
References: <XFMail.20030324140902.jhb@FreeBSD.org>
In-Reply-To: <XFMail.20030324140902.jhb@FreeBSD.org>
X-Habeas-SWE-1: winter into spring
X-Habeas-SWE-2: brightly anticipated
X-Habeas-SWE-3: like Habeas SWE (tm)
X-Habeas-SWE-4: Copyright 2002 Habeas (tm)
X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this
X-Habeas-SWE-6: email in exchange for a license for this Habeas
X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant
X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this
X-Habeas-SWE-9: mark in spam to <http://www.habeas.com/report/>.   
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200303241752.40245.wes@softweyr.com>
X-OriginalArrivalTime: 25 Mar 2003 01:52:40.0618 (UTC) FILETIME=[3818F0A0:01C2F271]
X-Spam-Status: No, hits=-26.0 required=5.0
	tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      QUOTE_TWICE_1,RCVD_IN_UNCONFIRMED_DSBL,REFERENCES,
	      REPLY_WITH_QUOTES,USER_AGENT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

On Monday 24 March 2003 11:09, John Baldwin wrote:
> On 24-Mar-2003 Wes Peters wrote:
> > As promised, here's the patch to protect a process from being
> > killed when pageout is in memory shortage.  This allows a process
> > to specify that it is important enough to be skipped when pageout
> > is looking for the largest process to kill.
> >
> > My needs are simple.  We make a box that is a web proxy and runs
> > from a memory disk, using flash for permanent storage.  The flash
> > is mounted only when a configuration write is needed, the box runs
> > from the memory disk.  We've experienced a problem at certain
> > customer sites where bind will consume a lot (~30 MB) of ram and
> > then pageout will kill the largest process, which is usually either
> > named or squid.  This pretty much kills the box.  We'd much rather
> > have pageout kill off some of the squid worker processes, we can
> > recover from that.
> >
> > Is this a good approach to the problem?  Feedback welcome.
>
> I think that adopting the SIGDANGER approach would be better rather
> than rolling our own private interface.

It's not clear to me the SIGDANGER interface allows me to say "go 
elsewhere bub, I'm really important."  In this case, that is essential.  
I think even in the general FreeBSD case you can make a point for a 
setting like this in, say, named.

The SIGDANGER interface worries me in general, partly because it's a 
signal and partly because it complicates the design of EVERYTHING just 
to handle it.  I guess a lot depends on the implementation details of 
how SIGDANGER and the default handlers are designed, but nothing I saw 
last week gave me a warm fuzzy about that.

> > @@ -625,6 +625,15 @@
> >               if (limp->rlim_max < 1)
> >                       limp->rlim_max = 1;
> >               break;
> > +
> > +        case RLIMIT_PROTECT:
> > +                mtx_lock_spin(&sched_lock);
> > +                if (limp->rlim_cur)
> > +                        p->p_flag |= P_PROTECTED;
> > +                else
> > +                        p->p_flag &= ~P_PROTECTED;
> > +                mtx_unlock_spin(&sched_lock);
> > +                break;
>
> p_flag is protected by PROC_LOCK, not sched_lock.

Gurk!  Will fix.

-- 
         "Where am I, and what am I doing in this handbasket?"

Wes Peters                                              wes@softweyr.com


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Mon Mar 24 17:59:42 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 6556537B401
	for <freebsd-arch@FreeBSD.ORG>; Mon, 24 Mar 2003 17:59:39 -0800 (PST)
Received: from mail01.stbernard.com (mail01.stbernard.com [64.154.93.162])
	by mx1.FreeBSD.org (Postfix) with ESMTP id E442E43F85
	for <freebsd-arch@FreeBSD.ORG>; Mon, 24 Mar 2003 17:59:37 -0800 (PST)
	(envelope-from wes@softweyr.com)
Received: from salty.rapid.stbernard.com ([192.168.4.61]) by mail01.stbernard.com with Microsoft SMTPSVC(5.0.2195.5329);
	 Mon, 24 Mar 2003 17:59:36 -0800
From: Wes Peters <wes@softweyr.com>
Organization: Softweyr.com
To: Dan Nelson <dnelson@allantgroup.com>,
	Poul-Henning Kamp <phk@phk.freebsd.dk>
Subject: Re: Patch to protect process from pageout killing
Date: Mon, 24 Mar 2003 17:59:36 -0800
User-Agent: KMail/1.5
Cc: freebsd-arch@FreeBSD.ORG
References: <200303240823.48262.wes@softweyr.com> <7019.1048523782@critter.freebsd.dk> <20030324213519.GA63147@dan.emsphone.com>
In-Reply-To: <20030324213519.GA63147@dan.emsphone.com>
X-Habeas-SWE-1: winter into spring
X-Habeas-SWE-2: brightly anticipated
X-Habeas-SWE-3: like Habeas SWE (tm)
X-Habeas-SWE-4: Copyright 2002 Habeas (tm)
X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this
X-Habeas-SWE-6: email in exchange for a license for this Habeas
X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant
X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this
X-Habeas-SWE-9: mark in spam to <http://www.habeas.com/report/>.   
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200303241759.36410.wes@softweyr.com>
X-OriginalArrivalTime: 25 Mar 2003 01:59:36.0680 (UTC) FILETIME=[3016F680:01C2F272]
X-Spam-Status: No, hits=-25.7 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      RCVD_IN_UNCONFIRMED_DSBL,REFERENCES,REPLY_WITH_QUOTES,
	      USER_AGENT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

On Monday 24 March 2003 13:35, Dan Nelson wrote:
> In the last episode (Mar 24), Poul-Henning Kamp said:
> > In message <200303240823.48262.wes@softweyr.com>, Wes Peters writes:
> > > As promised, here's the patch to protect a process from being
> > > killed when pageout is in memory shortage.  This allows a process
> > > to specify that it is important enough to be skipped when pageout
> > > is looking for the largest process to kill.
> > >
> > > My needs are simple.  We make a box that is a web proxy and runs
> > > from a memory disk, using flash for permanent storage.  The flash
> > > is mounted only when a configuration write is needed, the box
> > > runs from the memory disk.  We've experienced a problem at
> > > certain customer sites where bind will consume a lot (~30 MB) of
> > > ram and then pageout will kill the largest process, which is
> > > usually either named or squid.  This pretty much kills the box. 
> > > We'd much rather have pageout kill off some of the squid worker
> > > processes, we can recover from that.
> > >
> > > Is this a good approach to the problem?  Feedback welcome.
> >
> > I can certainly see the point, but I'm not sure this is the way.
> >
> > I am not sure that we want to use the resource limits facility for
> > booleans, some of the logic sourounding the suser checks may not
> > hold tight.
>
> How about changing the kill logic to look at RLIMIT_RSS?  The process
> exceeding its limit by the largest amount gets killed.  That way you
> can exempt certain processes by raising their limit.  Set named's
> limit to say 10MB, and when memory gets tight the system will see
> it's exceeding its quota by 20MB and kill it first.

Mostly because it's not possible to predict what named's RSS will be in 
any particular customer installation.  The ones that raised this issue 
were at 32MB and stable, and took about 9 days to get there.  We don't 
want named (or squid) to die under ANY circumstances; if the box can't 
run both named and squid it's effectively a brick.  On the other hand, 
we have lots (hundreds) of other smaller processes running, any one of 
which is expendable and can be recovered from.  

Yeah, better ability to adapt to the (memory) load would be perhaps a 
better way to do this, but I really hate the idea of dumping named on 
it's head and restarting 9 days of learning just because we're getting 
hammered by people checking the weather and traffic before heading 
home.

-- 
         "Where am I, and what am I doing in this handbasket?"

Wes Peters                                              wes@softweyr.com


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Mon Mar 24 18: 5:44 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 75D1037B404
	for <freebsd-arch@freebsd.org>; Mon, 24 Mar 2003 18:05:41 -0800 (PST)
Received: from mail01.stbernard.com (mail01.stbernard.com [64.154.93.162])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 00AFA43F93
	for <freebsd-arch@freebsd.org>; Mon, 24 Mar 2003 18:05:40 -0800 (PST)
	(envelope-from wes@softweyr.com)
Received: from salty.rapid.stbernard.com ([192.168.4.61]) by mail01.stbernard.com with Microsoft SMTPSVC(5.0.2195.5329);
	 Mon, 24 Mar 2003 18:05:38 -0800
From: Wes Peters <wes@softweyr.com>
Organization: Softweyr.com
To: "Poul-Henning Kamp" <phk@phk.freebsd.dk>
Subject: Re: Patch to protect process from pageout killing
Date: Mon, 24 Mar 2003 18:05:38 -0800
User-Agent: KMail/1.5
Cc: freebsd-arch@freebsd.org
References: <7019.1048523782@critter.freebsd.dk>
In-Reply-To: <7019.1048523782@critter.freebsd.dk>
X-Habeas-SWE-1: winter into spring
X-Habeas-SWE-2: brightly anticipated
X-Habeas-SWE-3: like Habeas SWE (tm)
X-Habeas-SWE-4: Copyright 2002 Habeas (tm)
X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this
X-Habeas-SWE-6: email in exchange for a license for this Habeas
X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant
X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this
X-Habeas-SWE-9: mark in spam to <http://www.habeas.com/report/>.   
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200303241805.38175.wes@softweyr.com>
X-OriginalArrivalTime: 25 Mar 2003 02:05:38.0430 (UTC) FILETIME=[07B5A1E0:01C2F273]
X-Spam-Status: No, hits=-25.6 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      RCVD_IN_UNCONFIRMED_DSBL,REFERENCES,REPLY_WITH_QUOTES,
	      USER_AGENT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

On Monday 24 March 2003 08:36, Poul-Henning Kamp wrote:
> In message <200303240823.48262.wes@softweyr.com>, Wes Peters writes:
> >As promised, here's the patch to protect a process from being killed
> > when pageout is in memory shortage.  This allows a process to
> > specify that it is important enough to be skipped when pageout is
> > looking for the largest process to kill.
> >
> >My needs are simple.  We make a box that is a web proxy and runs
> > from a memory disk, using flash for permanent storage.  The flash
> > is mounted only when a configuration write is needed, the box runs
> > from the memory disk.  We've experienced a problem at certain
> > customer sites where bind will consume a lot (~30 MB) of ram and
> > then pageout will kill the largest process, which is usually either
> > named or squid.  This pretty much kills the box.  We'd much rather
> > have pageout kill off some of the squid worker processes, we can
> > recover from that.
> >
> >Is this a good approach to the problem?  Feedback welcome.
>
> (Ignoring the white-space change)

OK, I put them back so the function will be inconsistent again. ;^)

They accidentally got shuffled when I move my implementation from just 
below RLIMIT_CPU (from which it obviously and erroneously heavily 
borrowed) to put it in numerical order.

> I can certainly see the point, but I'm not sure this is the way.
>
> I am not sure that we want to use the resource limits facility for
> booleans, some of the logic sourounding the suser checks may not
> hold tight.

I had concerns about that as well.  In the original (4.4 roughly) 
implementation I used madvise as the interface, but the madvise 
interface has changed greatly.  It didn't seem worthwhile adding a 
syscall for this task, so I looked for another reasonable protected 
interface to ab(use).  I'm 100% open to suggestions on the API.

> Also, doesn't this result in the flag being inerited with fork() and
> thereby negating the effect you are seeking for squid ?

I looked through all the places in kern_fork.c where p2->p_flag gets set 
and didn't see anything that looked like it would inherit P_PROTECTED 
from p1->p_flag.  Did I miss something?  I'm obviously a bit of a 
neophyte in this part of the kernel.

-- 
         "Where am I, and what am I doing in this handbasket?"

Wes Peters                                              wes@softweyr.com


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Mon Mar 24 18:12:59 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 896A337B401; Mon, 24 Mar 2003 18:12:56 -0800 (PST)
Received: from mail01.stbernard.com (mail01.stbernard.com [64.154.93.162])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id D548543F3F; Mon, 24 Mar 2003 18:12:55 -0800 (PST)
	(envelope-from wes@softweyr.com)
Received: from salty.rapid.stbernard.com ([192.168.4.61]) by mail01.stbernard.com with Microsoft SMTPSVC(5.0.2195.5329);
	 Mon, 24 Mar 2003 18:12:55 -0800
From: Wes Peters <wes@softweyr.com>
Organization: Softweyr.com
To: David Schultz <das@FreeBSD.ORG>
Subject: Re: Patch to protect process from pageout killing
Date: Mon, 24 Mar 2003 18:12:55 -0800
User-Agent: KMail/1.5
Cc: freebsd-arch@FreeBSD.ORG
References: <200303240823.48262.wes@softweyr.com> <20030325012303.GA4406@HAL9000.homeunix.com>
In-Reply-To: <20030325012303.GA4406@HAL9000.homeunix.com>
X-Habeas-SWE-1: winter into spring
X-Habeas-SWE-2: brightly anticipated
X-Habeas-SWE-3: like Habeas SWE (tm)
X-Habeas-SWE-4: Copyright 2002 Habeas (tm)
X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this
X-Habeas-SWE-6: email in exchange for a license for this Habeas
X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant
X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this
X-Habeas-SWE-9: mark in spam to <http://www.habeas.com/report/>.   
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200303241812.55290.wes@softweyr.com>
X-OriginalArrivalTime: 25 Mar 2003 02:12:55.0571 (UTC) FILETIME=[0C440E30:01C2F274]
X-Spam-Status: No, hits=-25.8 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      QUOTE_TWICE_1,RCVD_IN_UNCONFIRMED_DSBL,REFERENCES,
	      REPLY_WITH_QUOTES,USER_AGENT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

On Monday 24 March 2003 17:23, David Schultz wrote:
> Thus spake Wes Peters <wes@softweyr.com>:
> > As promised, here's the patch to protect a process from being
> > killed when pageout is in memory shortage.  This allows a process
> > to specify that it is important enough to be skipped when pageout
> > is looking for the largest process to kill.
> >
> > My needs are simple.  We make a box that is a web proxy and runs
> > from a memory disk, using flash for permanent storage.  The flash
> > is mounted only when a configuration write is needed, the box runs
> > from the memory disk.  We've experienced a problem at certain
> > customer sites where bind will consume a lot (~30 MB) of ram and
> > then pageout will kill the largest process, which is usually either
> > named or squid.  This pretty much kills the box.  We'd much rather
> > have pageout kill off some of the squid worker processes, we can
> > recover from that.
>
> Very nice.  Inheritance of this attribute seems to be a
> contentious issue.  Making inheritance tunable might be a good
> idea.  You wouldn't be able to piggyback on rlimit, though.

Actually inheritance was unintentional, I'm waiting for feedback on what 
I should've done to make it not inherited.  Any help you can offer will 
be appreciated.

> There's a significant userland component of this as well, although
> that's probably a job for another day.  It basically consists of
> making it possible to specify that certain standard system daemons
> should have this attribute.

Yup.

> > +			    (p->p_flag & P_PROTECTED) ||
> >  			    ((p->p_pid < 48) && (vm_swap_size != 0))) {
> >  				PROC_UNLOCK(p);
> >  				continue;
>
> The pid < 48 magic can probably go away, while you're at it.

I'd be happy to -- that 48 makes me nervous -- if a couple more Really 
Smart(tm) guys say it's OK. ;^)

-- 
         "Where am I, and what am I doing in this handbasket?"

Wes Peters                                              wes@softweyr.com


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Mon Mar 24 19:24:26 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id E097E37B43A
	for <freebsd-arch@FreeBSD.ORG>; Mon, 24 Mar 2003 19:24:21 -0800 (PST)
Received: from dan.emsphone.com (dan.emsphone.com [199.67.51.101])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 31CF343FB1
	for <freebsd-arch@FreeBSD.ORG>; Mon, 24 Mar 2003 19:24:21 -0800 (PST)
	(envelope-from dan@dan.emsphone.com)
Received: (from dan@localhost)
	by dan.emsphone.com (8.12.7/8.12.7) id h2P3OKhi096231;
	Mon, 24 Mar 2003 21:24:20 -0600 (CST)
	(envelope-from dan)
Date: Mon, 24 Mar 2003 21:24:20 -0600
From: Dan Nelson <dnelson@allantgroup.com>
To: Poul-Henning Kamp <phk@phk.freebsd.dk>,
	Wes Peters <wes@softweyr.com>, freebsd-arch@FreeBSD.ORG
Subject: Re: Patch to protect process from pageout killing
Message-ID: <20030325032420.GA22424@dan.emsphone.com>
References: <200303240823.48262.wes@softweyr.com> <7019.1048523782@critter.freebsd.dk> <20030324213519.GA63147@dan.emsphone.com> <20030325012844.GB4406@HAL9000.homeunix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20030325012844.GB4406@HAL9000.homeunix.com>
X-OS: FreeBSD 5.0-CURRENT
X-message-flag: Outlook Error
User-Agent: Mutt/1.5.4i
X-Spam-Status: No, hits=-26.0 required=5.0
	tests=IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES,REPLY_WITH_QUOTES,
	      USER_AGENT_MUTT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

In the last episode (Mar 24), David Schultz said:
> Thus spake Dan Nelson <dnelson@allantgroup.com>:
> > How about changing the kill logic to look at RLIMIT_RSS?  The
> > process exceeding its limit by the largest amount gets killed. 
> > That way you can exempt certain processes by raising their limit. 
> > Set named's limit to say 10MB, and when memory gets tight the
> > system will see it's exceeding its quota by 20MB and kill it first.
> 
> I think that's overengineering the problem.  First of all, it means
> that on any system where RLIMIT_RSS is unlimited by default, the
> machine now deadlocks when it runs out of memory.  Second, it is only
> marginally useful to go as far as specifying priorities and quotas
> and such on process killability.  Most of the time, people can divide
> the processes on thier system into two categories: critical and
> killable.

RSS overcommit would be the first sort priority.  If nothing is over
its limit, you fall back on the old "biggest process dies" rule.  Set
the critical processes at an infinite RSS, set the killable ones at a
reasonable RSS, set your cannon fodder processes at 0.  The default RSS
is infinity so you get classic behaviour.

In the embedded server case, there's no swap space so the RSS limit
isn't even being used anyway.  There is still the inheritance problem,
though, so a RSS=inf daemon would have to manually set the rlimit to 0
after forking a killable process.

-- 
	Dan Nelson
	dnelson@allantgroup.com

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Mon Mar 24 20:22:56 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id B9B0D37B404; Mon, 24 Mar 2003 20:22:52 -0800 (PST)
Received: from smtp1.server.rpi.edu (smtp1.server.rpi.edu [128.113.2.1])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 74C7D43F85; Mon, 24 Mar 2003 20:22:49 -0800 (PST)
	(envelope-from drosih@rpi.edu)
Received: from [128.113.24.47] (gilead.netel.rpi.edu [128.113.24.47])
	by smtp1.server.rpi.edu (8.12.8/8.12.7) with ESMTP id h2P4MiBg001437;
	Mon, 24 Mar 2003 23:22:44 -0500
Mime-Version: 1.0
X-Sender: drosih@mail.rpi.edu
Message-Id: <p05200f42baa5897e8dd8@[128.113.24.47]>
In-Reply-To: <20030325012844.GB4406@HAL9000.homeunix.com>
References: <200303240823.48262.wes@softweyr.com>
 <7019.1048523782@critter.freebsd.dk>
 <20030324213519.GA63147@dan.emsphone.com>
 <20030325012844.GB4406@HAL9000.homeunix.com>
Date: Mon, 24 Mar 2003 23:22:43 -0500
To: David Schultz <das@FreeBSD.ORG>,
	Dan Nelson <dnelson@allantgroup.com>
From: Garance A Drosihn <drosih@rpi.edu>
Subject: Re: Patch to protect process from pageout killing
Cc: Poul-Henning Kamp <phk@phk.freebsd.dk>,
	Wes Peters <wes@softweyr.com>, freebsd-arch@FreeBSD.ORG
Content-Type: text/plain; charset="us-ascii" ; format="flowed"
X-Scanned-By: MIMEDefang 2.28
X-Spam-Status: No, hits=-24.5 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REFERENCES,REPLY_WITH_QUOTES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

At 5:28 PM -0800 3/24/03, David Schultz wrote:
>  Second, it is only marginally useful to go as far as specifying
>priorities and quotas and such on process killability.  Most of
>the time, people can divide the processes on thier system into
>two categories: critical and killable.

While that's probably true "most of the time", I think we'd want to
encourage three categories.  critical, less-critical (killable),
and kill-me-first.  That's what SIGDANGER provides, and in some
situations that third category is very desirable.

-- 
Garance Alistair Drosehn            =   gad@gilead.netel.rpi.edu
Senior Systems Programmer           or  gad@freebsd.org
Rensselaer Polytechnic Institute    or  drosih@rpi.edu

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Mon Mar 24 20:53:36 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 55CF337B401; Mon, 24 Mar 2003 20:53:33 -0800 (PST)
Received: from smtp2.server.rpi.edu (smtp2.server.rpi.edu [128.113.2.2])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 05DF843F3F; Mon, 24 Mar 2003 20:53:32 -0800 (PST)
	(envelope-from drosih@rpi.edu)
Received: from [128.113.24.47] (gilead.netel.rpi.edu [128.113.24.47])
	by smtp2.server.rpi.edu (8.12.8/8.12.7) with ESMTP id h2P4rUn6014574;
	Mon, 24 Mar 2003 23:53:30 -0500
Mime-Version: 1.0
X-Sender: drosih@mail.rpi.edu
Message-Id: <p05200f43baa58a1eb364@[128.113.24.47]>
In-Reply-To: <200303241752.40245.wes@softweyr.com>
References: <XFMail.20030324140902.jhb@FreeBSD.org>
 <200303241752.40245.wes@softweyr.com>
Date: Mon, 24 Mar 2003 23:53:29 -0500
To: Wes Peters <wes@softweyr.com>, John Baldwin <jhb@FreeBSD.ORG>
From: Garance A Drosihn <drosih@rpi.edu>
Subject: Re: Patch to protect process from pageout killing
Cc: freebsd-arch@FreeBSD.ORG
Content-Type: text/plain; charset="us-ascii" ; format="flowed"
X-Scanned-By: MIMEDefang 2.28
X-Spam-Status: No, hits=-25.3 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REFERENCES,REPLY_WITH_QUOTES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

At 5:52 PM -0800 3/24/03, Wes Peters wrote:
>On Monday 24 March 2003 11:09, John Baldwin wrote:
>  > I think that adopting the SIGDANGER approach would be better
>  > rather than rolling our own private interface.
>
>It's not clear to me the SIGDANGER interface allows me to say
>"go elsewhere bub, I'm really important."  In this case, that
>is essential.  I think even in the general FreeBSD case you can
>make a point for a setting like this in, say, named.

Please check out the descriptions I posted previously.  SIGDANGER
(as implemented by AIX) explicitly provides two things.  The process
gets to decide which one they (the process) wants:

    1) signal me at the first sign of trouble, and I'll free
       up some virtual memory (possibly by exit()-ing).
    2) do not ever kill me to free up memory.

I think that we could improve upon the AIX implementation if we
wanted to, but I think people are so used to having problems with
AIX that they hate the idea of SIGDANGER as soon as they see the
letters AIX.  Having used AIX for more than ten years now, I can
sympathize with that, but in the specific case of SIGDANGER there
is an idea that can work quite well.

(reference on sigdanger was at:
http://nscp.upenn.edu/aix4.3html/aixbman/baseadmn/pag_space_under.htm
)

>The SIGDANGER interface worries me in general, partly because it's
>a signal and partly because it complicates the design of EVERYTHING
>just  to handle it.  I guess a lot depends on the implementation
>details of how SIGDANGER and the default handlers are designed,
>but nothing I saw  last week gave me a warm fuzzy about that.

I don't know enough about the lower-level implementation details,
but I did think the recent discussion on the src-committers list
did include a number of good ideas.  I am horribly over-committed
with things that I've promised to do (including stuff for my real-
world job...), so I can't look into SIGDANGER ideas right now, but
I'm more than happy to try to explain how it should work.

-- 
Garance Alistair Drosehn            =   gad@gilead.netel.rpi.edu
Senior Systems Programmer           or  gad@freebsd.org
Rensselaer Polytechnic Institute    or  drosih@rpi.edu

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Mon Mar 24 22:10:44 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id C1ABE37B401; Mon, 24 Mar 2003 22:10:38 -0800 (PST)
Received: from smtp-relay.omnis.com (smtp-relay.omnis.com [216.239.128.27])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 19FAC43F75; Mon, 24 Mar 2003 22:10:38 -0800 (PST)
	(envelope-from wes@softweyr.com)
Received: from softweyr.homeunix.net (66-91-236-204.san.rr.com [66.91.236.204])
	by smtp-relay.omnis.com (Postfix) with ESMTP
	id D929F4336E; Mon, 24 Mar 2003 22:10:35 -0800 (PST)
From: Wes Peters <wes@softweyr.com>
Organization: Softweyr
To: Garance A Drosihn <drosih@rpi.edu>,
	John Baldwin <jhb@FreeBSD.ORG>
Subject: Re: Patch to protect process from pageout killing
Date: Mon, 24 Mar 2003 22:10:32 -0800
User-Agent: KMail/1.5
Cc: freebsd-arch@FreeBSD.ORG
References: <XFMail.20030324140902.jhb@FreeBSD.org> <200303241752.40245.wes@softweyr.com> <p05200f43baa58a1eb364@[128.113.24.47]>
In-Reply-To: <p05200f43baa58a1eb364@[128.113.24.47]>
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200303242210.32055.wes@softweyr.com>
X-Spam-Status: No, hits=-16.1 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REFERENCES,REPLY_WITH_QUOTES,USER_AGENT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

On Monday 24 March 2003 20:53, Garance A Drosihn wrote:
> At 5:52 PM -0800 3/24/03, Wes Peters wrote:
> >
> >It's not clear to me the SIGDANGER interface allows me to say
> >"go elsewhere bub, I'm really important."  In this case, that
> >is essential.  I think even in the general FreeBSD case you can
> >make a point for a setting like this in, say, named.
>
> Please check out the descriptions I posted previously.  SIGDANGER
> (as implemented by AIX) explicitly provides two things.  The process
> gets to decide which one they (the process) wants:
>
>     1) signal me at the first sign of trouble, and I'll free
>        up some virtual memory (possibly by exit()-ing).
>     2) do not ever kill me to free up memory.

The current situation, leave me alone until you're really hurting, then 
just kill me quickly, should not only be an option but be the default.  
Is that covered?  As the default?  I.e. if I don't specify any handling 
of SIGDANGER at all, does it continue to work as now?

I guess my biggest worry about SIGDANGER is that minds much brighter than 
yours or mine share my worries about it.  Relying on signal delivery is 
just not in my nature.

> I think that we could improve upon the AIX implementation if we
> wanted to, but I think people are so used to having problems with
> AIX that they hate the idea of SIGDANGER as soon as they see the
> letters AIX. 

Yeah, that's pretty much my knee-jerk reaction.  I haven't really used AIX 
since about 3.2.5, but it was just fugly in those days.

> Having used AIX for more than ten years now, I can
> sympathize with that, but in the specific case of SIGDANGER there
> is an idea that can work quite well.
>
> (reference on sigdanger was at:
> http://nscp.upenn.edu/aix4.3html/aixbman/baseadmn/pag_space_under.htm
> )
>
> >The SIGDANGER interface worries me in general, partly because it's
> >a signal and partly because it complicates the design of EVERYTHING
> >just  to handle it.  I guess a lot depends on the implementation
> >details of how SIGDANGER and the default handlers are designed,
> >but nothing I saw  last week gave me a warm fuzzy about that.
>
> I don't know enough about the lower-level implementation details,
> but I did think the recent discussion on the src-committers list
> did include a number of good ideas.  I am horribly over-committed
> with things that I've promised to do (including stuff for my real-
> world job...), so I can't look into SIGDANGER ideas right now, but
> I'm more than happy to try to explain how it should work.

Some of the explanations were reasonable enough to erase all of my 
objections EXCEPT "it's a signal."  Do we have signal delivery to 
multi-threaded processes worked out enough to rely on SIGDANGER for such 
a critical function?  If so, it's news to me, but that doesn't mean it's 
not done...

-- 

        Where am I, and what am I doing in this handbasket?

Wes Peters                                               wes@softweyr.com


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Mon Mar 24 22:53:10 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id E81AF37B401; Mon, 24 Mar 2003 22:53:04 -0800 (PST)
Received: from smtp3.server.rpi.edu (smtp3.server.rpi.edu [128.113.2.3])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 0D72E43F93; Mon, 24 Mar 2003 22:53:04 -0800 (PST)
	(envelope-from drosih@rpi.edu)
Received: from [128.113.24.47] (gilead.netel.rpi.edu [128.113.24.47])
	by smtp3.server.rpi.edu (8.12.8/8.12.7) with ESMTP id h2P6r2QA023930;
	Tue, 25 Mar 2003 01:53:02 -0500
Mime-Version: 1.0
X-Sender: drosih@mail.rpi.edu
Message-Id: <p05200f44baa5a6b76746@[128.113.24.47]>
In-Reply-To: <200303242210.32055.wes@softweyr.com>
References: <XFMail.20030324140902.jhb@FreeBSD.org>
 <200303241752.40245.wes@softweyr.com>
 <p05200f43baa58a1eb364@[128.113.24.47]>
 <200303242210.32055.wes@softweyr.com>
Date: Tue, 25 Mar 2003 01:53:01 -0500
To: Wes Peters <wes@softweyr.com>, John Baldwin <jhb@FreeBSD.ORG>
From: Garance A Drosihn <drosih@rpi.edu>
Subject: Re: Patch to protect process from pageout killing
Cc: freebsd-arch@FreeBSD.ORG
Content-Type: text/plain; charset="us-ascii" ; format="flowed"
X-Scanned-By: MIMEDefang 2.28
X-Spam-Status: No, hits=-25.9 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      QUOTE_TWICE_1,REFERENCES,REPLY_WITH_QUOTES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

At 10:10 PM -0800 3/24/03, Wes Peters wrote:
>On Monday 24 March 2003 20:53, Garance A Drosihn wrote:
>>  At 5:52 PM -0800 3/24/03, Wes Peters wrote:
>>  >
>>  >It's not clear to me the SIGDANGER interface allows me to say
>>  >"go elsewhere bub, I'm really important."  In this case, that
>>  >is essential.  I think even in the general FreeBSD case you can
>>  >make a point for a setting like this in, say, named.
>>
>>  Please check out the descriptions I posted previously.  SIGDANGER
>>  (as implemented by AIX) explicitly provides two things.  The process
>>  gets to decide which one they (the process) wants:
>>
>>      1) signal me at the first sign of trouble, and I'll free
>>         up some virtual memory (possibly by exit()-ing).
>>      2) do not ever kill me to free up memory.
>
>The current situation, leave me alone until you're really hurting,
>then just kill me quickly, should not only be an option but be the
>default.  Is that covered?  As the default?  I.e. if I don't
>specify any handling  of SIGDANGER at all, does it continue to
>work as now?

Yes.

>I guess my biggest worry about SIGDANGER is that minds much brighter
>than yours or mine share my worries about it.  Relying on signal
>delivery is just not in my nature.

Actually, I think the biggest complaint with SIGDANGER (as AIX
does it), is that you *must* recompile programs to add the
signal-handler, or SIGDANGER does you absolutely no good.  This
leads to the argument "what if I don't have the source to some
program that should not be killed?".  Or, for that matter, "what
if I don't have the source to a program which I know should be
among the first to die?"

This is an area where I think we could do better than the AIX
implementation, although "doing better" does imply "more work"...
I think we want to come up with something so people don't have
to go changing every program to add a signal handler, but the
decision would usually be left to the system-admin.

>  > I don't know enough about the lower-level implementation details,
>>  but I did think the recent discussion on the src-committers list
>>  did include a number of good ideas.  I am horribly over-committed
>>  with things that I've promised to do (including stuff for my real-
>>  world job...), so I can't look into SIGDANGER ideas right now, but
>>  I'm more than happy to try to explain how it should work.
>
>Some of the explanations were reasonable enough to erase all of my
>objections EXCEPT "it's a signal."  Do we have signal delivery to
>multi-threaded processes worked out enough to rely on SIGDANGER
>for such a critical function?  If so, it's news to me, but that
>doesn't mean it's not done...

Well, most signal handlers for SIGDANGER are very simple, so they
should tend to work even if signal-handling in general is iffy.
They either:

static void
ignore_danger(int signo) {
	/* Just return, thus telling the kernel "Do Not Kill Me" */
}

or

static void
we_are_really_nice(int signo) {
	/* System is running out of VM -- so we will disappear! */
	exit(1);
}

Well, those are the two kinds I have written.  I guess the second
one could be a lot more complicated.  Really it should set a flag
and then let some other main-processing-loop do the exit() call.
I don't know what that means for multi-threaded programs under
freebsd, but since you don't *have* to add a signal-handler to
every program, it might be that most system administrators will
be able to solve their low-memory issues even if signal-handling
did not work reliably for all programs.

-- 
Garance Alistair Drosehn            =   gad@gilead.netel.rpi.edu
Senior Systems Programmer           or  gad@freebsd.org
Rensselaer Polytechnic Institute    or  drosih@rpi.edu

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Mon Mar 24 23:53:52 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 3E17D37B401
	for <freebsd-arch@FreeBSD.ORG>; Mon, 24 Mar 2003 23:53:48 -0800 (PST)
Received: from HAL9000.homeunix.com (12-233-57-131.client.attbi.com [12.233.57.131])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 75FCD43F85
	for <freebsd-arch@FreeBSD.ORG>; Mon, 24 Mar 2003 23:53:47 -0800 (PST)
	(envelope-from das@FreeBSD.ORG)
Received: from HAL9000.homeunix.com (localhost [127.0.0.1])
	by HAL9000.homeunix.com (8.12.6/8.12.5) with ESMTP id h2P7rhah005535;
	Mon, 24 Mar 2003 23:53:43 -0800 (PST)
	(envelope-from das@FreeBSD.ORG)
Received: (from das@localhost)
	by HAL9000.homeunix.com (8.12.6/8.12.5/Submit) id h2P7rgQl005534;
	Mon, 24 Mar 2003 23:53:42 -0800 (PST)
	(envelope-from das@FreeBSD.ORG)
Date: Mon, 24 Mar 2003 23:53:42 -0800
From: David Schultz <das@FreeBSD.ORG>
To: Garance A Drosihn <drosih@rpi.edu>
Cc: Dan Nelson <dnelson@allantgroup.com>,
	Poul-Henning Kamp <phk@phk.freebsd.dk>,
	Wes Peters <wes@softweyr.com>, freebsd-arch@FreeBSD.ORG
Subject: Re: Patch to protect process from pageout killing
Message-ID: <20030325075342.GA5450@HAL9000.homeunix.com>
Mail-Followup-To: Garance A Drosihn <drosih@rpi.edu>,
	Dan Nelson <dnelson@allantgroup.com>,
	Poul-Henning Kamp <phk@phk.freebsd.dk>,
	Wes Peters <wes@softweyr.com>, freebsd-arch@FreeBSD.ORG
References: <200303240823.48262.wes@softweyr.com> <7019.1048523782@critter.freebsd.dk> <20030324213519.GA63147@dan.emsphone.com> <20030325012844.GB4406@HAL9000.homeunix.com> <p05200f42baa5897e8dd8@[128.113.24.47]>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <p05200f42baa5897e8dd8@[128.113.24.47]>
X-Spam-Status: No, hits=-19.6 required=5.0
	tests=AWL,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES,
	      REPLY_WITH_QUOTES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

Thus spake Garance A Drosihn <drosih@rpi.edu>:
> At 5:28 PM -0800 3/24/03, David Schultz wrote:
> > Second, it is only marginally useful to go as far as specifying
> >priorities and quotas and such on process killability.  Most of
> >the time, people can divide the processes on thier system into
> >two categories: critical and killable.
> 
> While that's probably true "most of the time", I think we'd want to
> encourage three categories.  critical, less-critical (killable),
> and kill-me-first.  That's what SIGDANGER provides, and in some
> situations that third category is very desirable.

Yes, I think the SIGDANGER idea makes sense.  Essentially what
you'd want is a higher threshhold above the ``red alert---start
killing things'' threshhold where you can do smart things like
send SIGDANGERs without worrying about running completely out of
memory.  But I'm trying to impress on people that SIGDANGER is
orthogonal to what Wes is trying to do, before the whole thing
gets bogged down in discussions again and nothing ever happens.
Here's an example of what I mean in verbose pseudocode with
fudged constants:

	if (free VM < 64 pages) {
		/* This is the part Wes is working on. */
		kill big processes EXCEPT the ones that
		    are so important that there's no point
		    in running the system without them;
	} else if (free VM < 256 pages) {
		/*
		 * It takes additional memory to do this, but we're
		 * hoping some processes will cooperate and the
		 * shortage will go away.
		 */
		start warning processes with SIGDANGER;
	}

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Tue Mar 25  0:26:10 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 1306037B401; Tue, 25 Mar 2003 00:26:06 -0800 (PST)
Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 401E843FA3; Tue, 25 Mar 2003 00:26:05 -0800 (PST)
	(envelope-from phk@phk.freebsd.dk)
Received: from critter.freebsd.dk (localhost [127.0.0.1])
	by critter.freebsd.dk (8.12.8/8.12.8) with ESMTP id h2P8PrhV014383;
	Tue, 25 Mar 2003 09:25:53 +0100 (CET)
	(envelope-from phk@phk.freebsd.dk)
To: David Schultz <das@FreeBSD.ORG>
Cc: Garance A Drosihn <drosih@rpi.edu>,
	Dan Nelson <dnelson@allantgroup.com>, Wes Peters <wes@softweyr.com>,
	freebsd-arch@FreeBSD.ORG
Subject: Re: Patch to protect process from pageout killing 
From: "Poul-Henning Kamp" <phk@phk.freebsd.dk>
In-Reply-To: Your message of "Mon, 24 Mar 2003 23:53:42 PST."
             <20030325075342.GA5450@HAL9000.homeunix.com> 
Date: Tue, 25 Mar 2003 09:25:53 +0100
Message-ID: <14382.1048580753@critter.freebsd.dk>
X-Spam-Status: No, hits=-6.5 required=5.0
	tests=AWL,IN_REP_TO
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG


>But I'm trying to impress on people that SIGDANGER is
>orthogonal to what Wes is trying to do, before the whole thing
>gets bogged down in discussions again and nothing ever happens.
>Here's an example of what I mean in verbose pseudocode with
>fudged constants:

If we are going to do this, we should do it right.

Doing it right means that we should also be sharing enough information
with userland, so that userland can adapt.

Take a simple example:  It makes sense for a program like fsck to
use all the RAM it can get hold off as cache, but it does not make
sense for the cache to be paged out.

As I see it, there is a need for several mechanisms:

1. A mechanism to export to userland enough information about the
   current RAM availability, so that phkmalloc and application
   specific code can make intelligent choices before things go bad.

2. A mechanism to alert userland to the fact that things _have_ gone
   bad.

3. A mechanism to influence the "Who do we kill ?" decision once
   things have gone from bad to worse.

To tackle them from behind:

Wes has a proposal for #3 which is a per-process flag which says
"I'm sacred".  I think that is a sound principle since that is
usually exactly what people want:  Do Not Kill This Process.

Certain processes already enjoy special protection, pid==1 most
notably, this would just be a way to make the same protection
available to other processes.  I'm not happy about using the
resourcelimit code for booleans, and I don't think the flag
should be inherited, but otherwise I'm for the idea.

We have the SIGDANGER proposal for #2, but I think we need to have
two severities:  "Out of RAM" and "Out of VM".  A program like
fsck would start to recycle cached sectors once we're out of RAM.

But I have not seen anybody come up with a good proposal for
#1, and that is where the main benefit would be derived:  It would
allow processes to be good citizens and adjust to the present
situation.

Traditionally userland code is totally oblivious to the overall
system circumstances, the most notable exception is sendmail which
for ages have monitored the loadavg and backe off accordingly.

I think all daemons, and even some non-daemon programs, can benefit
from being aware of more of the systems situation:

	phkmalloc would automatically shed the cache and go into
	"hinting" mode if there were any pageing activity.

	Daemons like named can shed caches.

	Long running daemons could even go through a garbage collect
	to reduce their memory footprint (using realloc() to reduce
	fragmentation).

	Bgfsck can shed all cache and take a nap.

	Sort can use smaller buckets.

The signals in #2 could be used as a cheap substitute for this, but
we would need to add complementary "All Clear" signals to get
processes out of "contingency mode".

I have often wondered about making a single page of "kernel info"
which would be read-only mapped into all processes, (my main agenda
is really evil timekeeping), but it would also be the perfect place
for information like:

	"N free pages in system"
	"N pages of swap used"
	"N pages paged out during the last 1/15/60 seconds"
	"N pages paged in during the last 1/15/60 seconds"
	...
And with cheap access to that information, processes could much
easier taylor their behaviour.


-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Tue Mar 25  0:43: 4 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 7A82837B401; Tue, 25 Mar 2003 00:43:01 -0800 (PST)
Received: from ns1.xcllnt.net (209-128-86-226.bayarea.net [209.128.86.226])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 0272543F3F; Tue, 25 Mar 2003 00:43:00 -0800 (PST)
	(envelope-from marcel@xcllnt.net)
Received: from dhcp01.pn.xcllnt.net (dhcp01.pn.xcllnt.net [192.168.4.201])
	by ns1.xcllnt.net (8.12.8/8.12.8) with ESMTP id h2P8gmKu015264;
	Tue, 25 Mar 2003 00:42:48 -0800 (PST)
	(envelope-from marcel@piii.pn.xcllnt.net)
Received: from dhcp01.pn.xcllnt.net (localhost [127.0.0.1])
	by dhcp01.pn.xcllnt.net (8.12.8/8.12.8) with ESMTP id h2P8glvn017322;
	Tue, 25 Mar 2003 00:42:47 -0800 (PST)
	(envelope-from marcel@dhcp01.pn.xcllnt.net)
Received: (from marcel@localhost)
	by dhcp01.pn.xcllnt.net (8.12.8/8.12.8/Submit) id h2P8glDU017321;
	Tue, 25 Mar 2003 00:42:47 -0800 (PST)
Date: Tue, 25 Mar 2003 00:42:47 -0800
From: Marcel Moolenaar <marcel@xcllnt.net>
To: Poul-Henning Kamp <phk@phk.freebsd.dk>
Cc: David Schultz <das@FreeBSD.ORG>,
	Garance A Drosihn <drosih@rpi.edu>,
	Dan Nelson <dnelson@allantgroup.com>, Wes Peters <wes@softweyr.com>,
	freebsd-arch@FreeBSD.ORG
Subject: Re: Patch to protect process from pageout killing
Message-ID: <20030325084247.GA17195@dhcp01.pn.xcllnt.net>
References: <20030325075342.GA5450@HAL9000.homeunix.com> <14382.1048580753@critter.freebsd.dk>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <14382.1048580753@critter.freebsd.dk>
User-Agent: Mutt/1.5.3i
X-Spam-Status: No, hits=-30.9 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REFERENCES,REPLY_WITH_QUOTES,USER_AGENT_MUTT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

On Tue, Mar 25, 2003 at 09:25:53AM +0100, Poul-Henning Kamp wrote:
> 
> 3. A mechanism to influence the "Who do we kill ?" decision once
>    things have gone from bad to worse.
> 
> To tackle them from behind:
> 
> Wes has a proposal for #3 which is a per-process flag which says
> "I'm sacred".  I think that is a sound principle since that is
> usually exactly what people want:  Do Not Kill This Process.
> 
> Certain processes already enjoy special protection, pid==1 most
> notably, this would just be a way to make the same protection
> available to other processes.  I'm not happy about using the
> resourcelimit code for booleans, and I don't think the flag
> should be inherited, but otherwise I'm for the idea.

JFYI: On ia64 there are 12 bits in the ELF header reserved for OS
specific flags. A very natural way to flag a process as being sacred
is by flagging the ELF executable. You could use brandelf for that.

-- 
 Marcel Moolenaar	  USPA: A-39004		 marcel@xcllnt.net

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Tue Mar 25  0:48:45 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 5475437B401; Tue, 25 Mar 2003 00:48:43 -0800 (PST)
Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id B161843F3F; Tue, 25 Mar 2003 00:48:41 -0800 (PST)
	(envelope-from phk@phk.freebsd.dk)
Received: from critter.freebsd.dk (localhost [127.0.0.1])
	by critter.freebsd.dk (8.12.8/8.12.8) with ESMTP id h2P8mXhV014595;
	Tue, 25 Mar 2003 09:48:33 +0100 (CET)
	(envelope-from phk@phk.freebsd.dk)
To: Marcel Moolenaar <marcel@xcllnt.net>
Cc: David Schultz <das@FreeBSD.ORG>,
	Garance A Drosihn <drosih@rpi.edu>,
	Dan Nelson <dnelson@allantgroup.com>, Wes Peters <wes@softweyr.com>,
	freebsd-arch@FreeBSD.ORG
Subject: Re: Patch to protect process from pageout killing 
From: "Poul-Henning Kamp" <phk@phk.freebsd.dk>
In-Reply-To: Your message of "Tue, 25 Mar 2003 00:42:47 PST."
             <20030325084247.GA17195@dhcp01.pn.xcllnt.net> 
Date: Tue, 25 Mar 2003 09:48:33 +0100
Message-ID: <14594.1048582113@critter.freebsd.dk>
X-Spam-Status: No, hits=-7.1 required=5.0
	tests=AWL,IN_REP_TO,QUOTED_EMAIL_TEXT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

In message <20030325084247.GA17195@dhcp01.pn.xcllnt.net>, Marcel Moolenaar writ
es:

>> To tackle them from behind:
>> 
>> Wes has a proposal for #3 which is a per-process flag which says
>> "I'm sacred".  I think that is a sound principle since that is
>> usually exactly what people want:  Do Not Kill This Process.
>> 
>> Certain processes already enjoy special protection, pid==1 most
>> notably, this would just be a way to make the same protection
>> available to other processes.  I'm not happy about using the
>> resourcelimit code for booleans, and I don't think the flag
>> should be inherited, but otherwise I'm for the idea.
>
>JFYI: On ia64 there are 12 bits in the ELF header reserved for OS
>specific flags. A very natural way to flag a process as being sacred
>is by flagging the ELF executable. You could use brandelf for that.

Many years ago, we had a local hack so you could specify the nice(2)
that a given program would be executed at (relative to the parent
process) in the a.out file.  This allowed us to keep games open
during the day because we could argue that running at -20 they used
only resources not otherwise claimed.

Other operating systems have much more expressive facilities for
putting attributes on a program.  In some cases this is being held
stronly against them.

I think, but am not sure, that we can now introduce practically any
policy we might like with MAC. (NB: deliberate rwatson-trigger)

How the flags/attributes gets to be set on the wanted subset of
processes is by no means uninteresting, but until something pays
attention to the flag...

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Tue Mar 25  2:45: 5 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 2683037B401
	for <freebsd-arch@FreeBSD.org>; Tue, 25 Mar 2003 02:45:00 -0800 (PST)
Received: from HAL9000.homeunix.com (12-233-57-131.client.attbi.com [12.233.57.131])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 7758A43FD7
	for <freebsd-arch@FreeBSD.org>; Tue, 25 Mar 2003 02:44:59 -0800 (PST)
	(envelope-from das@FreeBSD.org)
Received: from HAL9000.homeunix.com (localhost [127.0.0.1])
	by HAL9000.homeunix.com (8.12.6/8.12.5) with ESMTP id h2PAitah006067;
	Tue, 25 Mar 2003 02:44:55 -0800 (PST)
	(envelope-from das@FreeBSD.org)
Received: (from das@localhost)
	by HAL9000.homeunix.com (8.12.6/8.12.5/Submit) id h2PAisFv006066;
	Tue, 25 Mar 2003 02:44:54 -0800 (PST)
	(envelope-from das@FreeBSD.org)
Date: Tue, 25 Mar 2003 02:44:54 -0800
From: David Schultz <das@FreeBSD.org>
To: Poul-Henning Kamp <phk@phk.freebsd.dk>
Cc: Garance A Drosihn <drosih@rpi.edu>,
	Dan Nelson <dnelson@allantgroup.com>, Wes Peters <wes@softweyr.com>,
	freebsd-arch@FreeBSD.org
Subject: Re: Patch to protect process from pageout killing
Message-ID: <20030325104454.GA5934@HAL9000.homeunix.com>
Mail-Followup-To: Poul-Henning Kamp <phk@phk.freebsd.dk>,
	Garance A Drosihn <drosih@rpi.edu>,
	Dan Nelson <dnelson@allantgroup.com>, Wes Peters <wes@softweyr.com>,
	freebsd-arch@FreeBSD.org
References: <20030325075342.GA5450@HAL9000.homeunix.com> <14382.1048580753@critter.freebsd.dk>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <14382.1048580753@critter.freebsd.dk>
X-Spam-Status: No, hits=-19.6 required=5.0
	tests=IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES,REPLY_WITH_QUOTES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

Thus spake Poul-Henning Kamp <phk@phk.freebsd.dk>:
> As I see it, there is a need for several mechanisms:
> 
> 1. A mechanism to export to userland enough information about the
>    current RAM availability, so that phkmalloc and application
>    specific code can make intelligent choices before things go bad.
> 
> 2. A mechanism to alert userland to the fact that things _have_ gone
>    bad.
> 
> 3. A mechanism to influence the "Who do we kill ?" decision once
>    things have gone from bad to worse.

I completely agree, and in my last email I attempted to address
the fact that #2 and #3 are distinct, and to say that people
shouldn't be complaining about Wes's solution to #3 because it
doesn't address #2.

For #1 and #2, we could have a SIGVM (your terminology from the
*last* time this came up) to notify processes about material
changes in global resource availability.  Applications could then
look at that "kernel info" page upon receiving the signal and take
appropriate action.

I think the hardest part is getting applications to use a
proprietary facility.  (For example, look at how few people are
using kqueue for all of its advantages.)  Certainly it could be
added to base system programs, but it would be most useful for
applications such as postgresql and apache.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Tue Mar 25  8:35: 0 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 0060537B404
	for <freebsd-arch@freebsd.org>; Tue, 25 Mar 2003 08:34:53 -0800 (PST)
Received: from mail.speakeasy.net (mail11.speakeasy.net [216.254.0.211])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 661E843F3F
	for <freebsd-arch@freebsd.org>; Tue, 25 Mar 2003 08:34:53 -0800 (PST)
	(envelope-from jhb@FreeBSD.org)
Received: (qmail 18913 invoked from network); 25 Mar 2003 16:34:56 -0000
Received: from unknown (HELO server.baldwin.cx) ([216.27.160.63])
          (envelope-sender <jhb@FreeBSD.org>)
          by mail11.speakeasy.net (qmail-ldap-1.03) with DES-CBC3-SHA encrypted SMTP
          for <phk@phk.freebsd.dk>; 25 Mar 2003 16:34:56 -0000
Received: from laptop.baldwin.cx (gw1.twc.weather.com [216.133.140.1])
	by server.baldwin.cx (8.12.8/8.12.8) with ESMTP id h2PGYoOv096194;
	Tue, 25 Mar 2003 11:34:50 -0500 (EST)
	(envelope-from jhb@FreeBSD.org)
Message-ID: <XFMail.20030325113450.jhb@FreeBSD.org>
X-Mailer: XFMail 1.5.4 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <200303241805.38175.wes@softweyr.com>
Date: Tue, 25 Mar 2003 11:34:50 -0500 (EST)
From: John Baldwin <jhb@FreeBSD.org>
To: Wes Peters <wes@softweyr.com>
Subject: Re: Patch to protect process from pageout killing
Cc: freebsd-arch@freebsd.org, Poul-Henning Kamp <phk@phk.freebsd.dk>
X-Spam-Status: No, hits=-19.5 required=5.0
	tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REPLY_WITH_QUOTES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG


On 25-Mar-2003 Wes Peters wrote:
> On Monday 24 March 2003 08:36, Poul-Henning Kamp wrote:
>> Also, doesn't this result in the flag being inerited with fork() and
>> thereby negating the effect you are seeking for squid ?
> 
> I looked through all the places in kern_fork.c where p2->p_flag gets set 
> and didn't see anything that looked like it would inherit P_PROTECTED 
> from p1->p_flag.  Did I miss something?  I'm obviously a bit of a 
> neophyte in this part of the kernel.

rlimit's are inherited.  However, due to a "feature" bug in your patch,
the P_PROTECTED flag doesn't get turned on when the rlimit is inherited
in fork1().

-- 

John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Tue Mar 25  9: 6:47 2003
Delivered-To: freebsd-arch@freebsd.org
Received: by hub.freebsd.org (Postfix, from userid 683)
	id 368B137B401; Tue, 25 Mar 2003 09:06:44 -0800 (PST)
Date: Tue, 25 Mar 2003 09:06:43 -0800
From: Eivind Eklund <eivind@FreeBSD.org>
To: David Schultz <das@FreeBSD.ORG>
Cc: Garance A Drosihn <drosih@rpi.edu>,
	Dan Nelson <dnelson@allantgroup.com>,
	Poul-Henning Kamp <phk@phk.freebsd.dk>,
	Wes Peters <wes@softweyr.com>, freebsd-arch@FreeBSD.ORG
Subject: Re: Patch to protect process from pageout killing
Message-ID: <20030325090643.F20745@FreeBSD.org>
References: <200303240823.48262.wes@softweyr.com> <7019.1048523782@critter.freebsd.dk> <20030324213519.GA63147@dan.emsphone.com> <20030325012844.GB4406@HAL9000.homeunix.com> <p05200f42baa5897e8dd8@[128.113.24.47]> <20030325075342.GA5450@HAL9000.homeunix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5.1i
In-Reply-To: <20030325075342.GA5450@HAL9000.homeunix.com>; from das@FreeBSD.ORG on Mon, Mar 24, 2003 at 11:53:42PM -0800
X-Spam-Status: No, hits=-32.5 required=5.0
	tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES,
	      REPLY_WITH_QUOTES,USER_AGENT_MUTT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

On Mon, Mar 24, 2003 at 11:53:42PM -0800, David Schultz wrote:
> Yes, I think the SIGDANGER idea makes sense.  Essentially what
> you'd want is a higher threshhold above the ``red alert---start
> killing things'' threshhold where you can do smart things like
> send SIGDANGERs without worrying about running completely out of
> memory.  But I'm trying to impress on people that SIGDANGER is
> orthogonal to what Wes is trying to do, before the whole thing
> gets bogged down in discussions again and nothing ever happens.
> Here's an example of what I mean in verbose pseudocode with
> fudged constants:
> 
> 	if (free VM < 64 pages) {
> 		/* This is the part Wes is working on. */
> 		kill big processes EXCEPT the ones that
> 		    are so important that there's no point
> 		    in running the system without them;
> 	} else if (free VM < 256 pages) {
> 		/*
> 		 * It takes additional memory to do this, but we're
> 		 * hoping some processes will cooperate and the
> 		 * shortage will go away.
> 		 */
> 		start warning processes with SIGDANGER;
> 	}

As far as I understand, this problem was covered by the SIGDANGER proposal,
by having processes with a SIGDANGER handler not be killed in the
(free VM < 64 pages) case, at least until there are no processes without a
SIGDANGER handler.

The pseudo-code becomes something like (with 64, 128 and 256 being arbitrary
constants)

	if (free VM < 256 pages) {
		send SIGDANGER to all processes
	}
	while (free VM < 128 pages &&
	    we have processes without SIGDANGER handler) {
		kill "worst" process without SIGDANGER handler
	}
	while (free VM < 64 pages) {
		/*
		 * Only goes here if we are out of processes without
		 * SIGDANGER handler
		 */
		kill "worst" process
	}

As you see, SIGDANGER says that the process wants to decide for itself if
it is important (kept until 64) or want to die/free up resources at 256.

I'm not 100% happy with the SIGDANGER API, for the following reasons:

(1) There are cases it does not cover.  I can implement a process that
    is not really significant, but does caching and can easily free up
    memory.  However, even though it frees up memory, it should not get
    special priority in the killing sequence.
    The most extreme example of this is if we add SIGDANGER awareness
    to phkmalloc - in that case, all newly compiled programs (that use
    libc and malloc) would gain priority, while all old binaries would
    be prioritized lower.

(2) The use an API instead of an external configuration option (e.g. a sysctl
    with a list of protected PIDs) makes it impossible to use this without
    having recompiling software.  It also means that priority is determined
    at the time of software implementation, not when the software is 
    deployed, unless there are special options in the software to change
    behaviour.  And these options are likely to appear, which basically
    sucks.


However, I still feel that we *should* support the SIGDANGER API.  We need
an API to cover the case where a program has resources that it can easily
release, and the API should be cross-platform.  By supporting the SIGDANGER
API on FreeBSD too, that API becomes aboutr 4x as "legitimate" as it is
today.  If we implement another API, both SIGDANGER and that API will be
seen as less legitimate than SIGDANGER is today, unless that API is *much*
better than SIGDANGER.  Thus, we lower the chance of ever getting a true
cross-platform solution.

I feel there also is room for a separate solution that lets the administrator
determine processes to keep - but this should not block for implementation
of SIGDANGER with the standard semantics, and that is IMO what would be
most important to have.

Also note that a SIGDANGER implementation might automatically be picked up
by autoconf for already existing programs, giving an immediate benefit.

Eivind, who realize he has no vote until he has patches.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Tue Mar 25  9:22:31 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id C63A437B401; Tue, 25 Mar 2003 09:22:27 -0800 (PST)
Received: from mail.tcoip.com.br (erato.tco.net.br [200.220.254.10])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id EF33643FAF; Tue, 25 Mar 2003 09:22:16 -0800 (PST)
	(envelope-from dcs@tcoip.com.br)
Received: from tcoip.com.br ([10.0.2.6])
	by mail.tcoip.com.br (8.11.6/8.11.6) with ESMTP id h2PHLf923653;
	Tue, 25 Mar 2003 14:21:41 -0300
Message-ID: <3E809024.1050303@tcoip.com.br>
Date: Tue, 25 Mar 2003 14:21:40 -0300
From: "Daniel C. Sobral" <dcs@tcoip.com.br>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3b) Gecko/20030214
X-Accept-Language: en-us, en, pt-br, ja
MIME-Version: 1.0
To: Poul-Henning Kamp <phk@phk.freebsd.dk>
Cc: David Schultz <das@FreeBSD.ORG>,
	Garance A Drosihn <drosih@rpi.edu>,
	Dan Nelson <dnelson@allantgroup.com>, Wes Peters <wes@softweyr.com>,
	freebsd-arch@FreeBSD.ORG
Subject: Re: Patch to protect process from pageout killing
References: <14382.1048580753@critter.freebsd.dk>
In-Reply-To: <14382.1048580753@critter.freebsd.dk>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, hits=-19.2 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      QUOTE_TWICE_1,REFERENCES,REPLY_WITH_QUOTES,
	      USER_AGENT_MOZILLA_UA
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

Poul-Henning Kamp wrote:
> If we are going to do this, we should do it right.
> 
> Doing it right means that we should also be sharing enough information
> with userland, so that userland can adapt.
> 
> Take a simple example:  It makes sense for a program like fsck to
> use all the RAM it can get hold off as cache, but it does not make
> sense for the cache to be paged out.
> 
> As I see it, there is a need for several mechanisms:

SIGDANGER actually takes care of two of these steps:

> 
> 1. A mechanism to export to userland enough information about the
>    current RAM availability, so that phkmalloc and application
>    specific code can make intelligent choices before things go bad.
> 
> 2. A mechanism to alert userland to the fact that things _have_ gone
>    bad.

SIGDANGER is sent to processes at threshold #1, alerting them the 
situation has become serious.

> 3. A mechanism to influence the "Who do we kill ?" decision once
>    things have gone from bad to worse.

When the situation becomes so critical that the system cannot proceed 
without first killing something, it will only kill a process which has 
registered SIGDANGER if there are no other suitable process.

-- 
Daniel C. Sobral                   (8-DCS)
Gerencia de Operacoes
Divisao de Comunicacao de Dados
Coordenacao de Seguranca
TCO
Fones: 55-61-313-7654/Cel: 55-61-9618-0904
E-mail: Daniel.Capo@tco.net.br
         Daniel.Sobral@tcoip.com.br
         dcs@tcoip.com.br

Outros:
	dcs@newsguy.com
	dcs@freebsd.org
	capo@notorious.bsdconspiracy.net

If you put your supper dish to your ear you can hear the sounds of a
restaurant.
		-- Snoopy


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Tue Mar 25 18:59:46 2003
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id D4A9F37B401
	for <arch@freebsd.org>; Tue, 25 Mar 2003 18:59:41 -0800 (PST)
Received: from mail.chesapeake.net (chesapeake.net [205.130.220.14])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 1402943F3F
	for <arch@freebsd.org>; Tue, 25 Mar 2003 18:59:41 -0800 (PST)
	(envelope-from jroberson@chesapeake.net)
Received: from localhost (jroberson@localhost)
	by mail.chesapeake.net (8.11.6/8.11.6) with ESMTP id h2Q2xeT43472
	for <arch@freebsd.org>; Tue, 25 Mar 2003 21:59:40 -0500 (EST)
	(envelope-from jroberson@chesapeake.net)
Date: Tue, 25 Mar 2003 21:59:40 -0500 (EST)
From: Jeff Roberson <jroberson@chesapeake.net>
To: arch@freebsd.org
Subject: 1:1 Threading implementation.
Message-ID: <20030325214028.K64602-100000@mail.chesapeake.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=0.0 required=5.0
	tests=none
	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
List-ID: <freebsd-arch.FreeBSD.ORG>
List-Archive: <http://docs.freebsd.org/mail/> (Web Archive)
List-Help: <mailto:majordomo@FreeBSD.ORG?subject=help> (List Instructions)
List-Subscribe: <mailto:majordomo@FreeBSD.ORG?subject=subscribe%20freebsd-arch>
List-Unsubscribe: <mailto:majordomo@FreeBSD.ORG?subject=unsubscribe%20freebsd-arch>
X-Loop: FreeBSD.ORG

I realize that many people have strong feelings on this topic.  I'm asking
everyone up front to try not to devolve this thread into a bikeshed.

Thanks to the foundation provided by Julian, David Xu, Mini, Dan Eischen,
and everyone else who has participated with KSE and libpthread development
Mini and I have developed a 1:1 threading implementation.  This code works
in parallel with KSE and does not break it in any way.  It actually helps
bring M:N threading closer by testing out shared bits.

I have successfully run mozilla 1.2.1 using this threading package.  It
still has some bugs and some incomplete corners but we're very close to
being able to commit this.  I'm going to post a link to the kernel portion
of this code at the end of this mail.  The library will come later.

What this means is that for every pthread in an application there is one
KSE and thread.  There is also only one ksegroup per proc in this model.
Since the kernel knows about all threads it handles all scheduling
decisions and all signal delivery.  I have followed the POSIX spec while
implementing the signal code.  I would really appreciate review from
anyone who is intimately familiar with signals and threads.  Included in
this is an implementation of sigwait(), sigtimedwait(), and sigwaitinfo().

The user land mutexes are supported by kernel code.  Uncontested acquires
and releases are done entirely in application space using atomic
instructions.  Once there is contention the library falls back to system
calls to handle the locks.  There are no per lock kernel resources
allocated.  There is a user space safe atomic cmpset function that has
been defined for x86 only at the moment.  New architectures require only
this function and the *context apis to run this threading package.  There
is no arch specific code in user space.

The condition variables and other blocking situations are handled with
sig*wait*() and a new signal, SIGTHR.  There are many reasons that we went
with a signal here.  If anyone cares to know them, you may ask.

There are only 4 system calls for threading. thr_create, thr_self,
thr_exit, and thr_kill.  The rest of the functionality is implemented in a
library that has been heavily hacked up from the original libc_r.

The reason we're doing this in parallel with the M:N effort is so that we
can have reasonable threading sooner.  As I stated before, this project is
complimentary to KSE and does not prohibit it from working.  I also think
that the performance will be better or comparable in the majority of real
applications.

The kernel bits are available at
http://www.chesapeake.net/~jroberson/thr.diff

I'd like to get the signal code commited asap.  It's the majority of the
patch and I often have to resolve conflicts.  There have been no
regressions in KSE or non threaded applications with this signal code.

Cheers,
Jeff


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch@FreeBSD.ORG  Tue Mar 25 19:52:13 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 657D537B401
	for <arch@FreeBSD.ORG>; Tue, 25 Mar 2003 19:52:13 -0800 (PST)
Received: from mail.chesapeake.net (chesapeake.net [205.130.220.14])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 93FF843F75
	for <arch@FreeBSD.ORG>; Tue, 25 Mar 2003 19:52:12 -0800 (PST)
	(envelope-from jroberson@chesapeake.net)
Received: from localhost (jroberson@localhost)
	by mail.chesapeake.net (8.11.6/8.11.6) with ESMTP id h2Q3qCC65884
	for <arch@FreeBSD.ORG>; Tue, 25 Mar 2003 22:52:12 -0500 (EST)
	(envelope-from jroberson@chesapeake.net)
Date: Tue, 25 Mar 2003 22:52:12 -0500 (EST)
From: Jeff Roberson <jroberson@chesapeake.net>
To: arch@FreeBSD.ORG
In-Reply-To: <20030325214028.K64602-100000@mail.chesapeake.net>
Message-ID: <20030325224156.F64602-100000@mail.chesapeake.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-1.7 required=5.0
	tests=AWL,IN_REP_TO
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Subject: Re: 1:1 Threading implementation.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 03:52:13 -0000
X-List-Received-Date: Wed, 26 Mar 2003 03:52:13 -0000

I pooched the patch.  It's updated at the same web address.

http://www.chesapeake.net/~jroberson/thr.diff

Cheers,
Jeff

From owner-freebsd-arch@FreeBSD.ORG  Tue Mar 25 23:00:33 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 1468537B412
	for <arch@freebsd.org>; Tue, 25 Mar 2003 23:00:33 -0800 (PST)
Received: from exchhz01.viatech.com.cn (ip-167-164-97-218.anlai.com
	[218.97.164.167])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 2BEC343F3F
	for <arch@freebsd.org>; Tue, 25 Mar 2003 23:00:29 -0800 (PST)
	(envelope-from davidxu@freebsd.org)
Received: from davidw2k (ip-240-1-168-192.rev.dyxnet.com [192.168.1.240]) by
	exchhz01.viatech.com.cn with SMTP (Microsoft Exchange Internet Mail Service
	Version 5.5.2650.21)	id HLDQN57S; Wed, 26 Mar 2003 14:46:32 +0800
Message-ID: <00f101c2f365$8de4e530$f001a8c0@davidw2k>
From: "David Xu" <davidxu@freebsd.org>
To: "Jeff Roberson" <jroberson@chesapeake.net>, <arch@freebsd.org>
References: <20030325224156.F64602-100000@mail.chesapeake.net>
Date: Wed, 26 Mar 2003 15:01:41 +0800
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4807.1700
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4910.0300
X-Spam-Status: No, hits=-10.1 required=5.0
	tests=AWL,QUOTED_EMAIL_TEXT,REFERENCES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Subject: Re: 1:1 Threading implementation.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 07:00:34 -0000
X-List-Received-Date: Wed, 26 Mar 2003 07:00:34 -0000

I am reading the code, although not very understand
your idea, but I found a problem, if a thread exits,
some signals taken by the thread will be lost even
the signal originally is not for the thread.

David Xu

----- Original Message -----=20
From: "Jeff Roberson" <jroberson@chesapeake.net>
To: <arch@freebsd.org>
Sent: Wednesday, March 26, 2003 11:52 AM
Subject: Re: 1:1 Threading implementation.


> I pooched the patch.  It's updated at the same web address.
>=20
> http://www.chesapeake.net/~jroberson/thr.diff
>=20
> Cheers,
> Jeff
>=20
> _______________________________________________
> freebsd-arch@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-arch
> To unsubscribe, send any mail to =
"freebsd-arch-unsubscribe@freebsd.org"

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 00:37:01 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id CD37537B404
	for <arch@freebsd.org>; Wed, 26 Mar 2003 00:37:01 -0800 (PST)
Received: from mail.chesapeake.net (chesapeake.net [205.130.220.14])
	by mx1.FreeBSD.org (Postfix) with ESMTP id DEE3D43FB1
	for <arch@freebsd.org>; Wed, 26 Mar 2003 00:37:00 -0800 (PST)
	(envelope-from jroberson@chesapeake.net)
Received: from localhost (jroberson@localhost)
	by mail.chesapeake.net (8.11.6/8.11.6) with ESMTP id h2Q8avX78331;
	Wed, 26 Mar 2003 03:36:57 -0500 (EST)
	(envelope-from jroberson@chesapeake.net)
Date: Wed, 26 Mar 2003 03:36:57 -0500 (EST)
From: Jeff Roberson <jroberson@chesapeake.net>
To: Julian Elischer <julian@elischer.org>
In-Reply-To: <Pine.BSF.4.21.0303252335280.22804-100000@InterJet.elischer.org>
Message-ID: <20030326031245.O64602-100000@mail.chesapeake.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-13.9 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      QUOTE_TWICE_1,REPLY_WITH_QUOTES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: kse@elischer.org
Subject: Re: 1:1 Threading implementation.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 08:37:03 -0000
X-List-Received-Date: Wed, 26 Mar 2003 08:37:03 -0000

On Wed, 26 Mar 2003, Julian Elischer wrote:
> On Tue, 25 Mar 2003, Jeff Roberson wrote:
>
> > Thanks to the foundation provided by Julian, David Xu, Mini, Dan Eischen,
> > and everyone else who has participated with KSE and libpthread development
> > Mini and I have developed a 1:1 threading implementation.  This code works
> > in parallel with KSE and does not break it in any way.  It actually helps
> > bring M:N threading closer by testing out shared bits.
>
> The current design was done specifically so that the 'component parts
> could be recombined in different groupings to give different threading
> models. This was one of the models considered when the group
> discussed it. I'm glad that it is working..

Yep, that was a good design goal.

> >
> > I have successfully run mozilla 1.2.1 using this threading package.  It
> > still has some bugs and some incomplete corners but we're very close to
> > being able to commit this.  I'm going to post a link to the kernel portion
> > of this code at the end of this mail.  The library will come later.
>
> I wondered what was going on there.. There's been a trmendous silence in
> the userland side of things.

Well, I wasn't doing userland stuff until three days ago.  I think mini
has just been very busy with work.  I suspect that you're going to need
to start doing userland work or find someone to do it if you want to get
it done soon.

> >
> > What this means is that for every pthread in an application there is one
> > KSE and thread.  There is also only one ksegroup per proc in this model.
> > Since the kernel knows about all threads it handles all scheduling
> > decisions and all signal delivery.  I have followed the POSIX spec while
> > implementing the signal code.  I would really appreciate review from
> > anyone who is intimately familiar with signals and threads.  Included in
> > this is an implementation of sigwait(), sigtimedwait(), and sigwaitinfo().
>
> Wouldn't it have been easier to have one KSEGRP+KSE+thread per user
> thread? Having one ksegrp and many KSEs requires changing the kernel
> code where doing it the other way you could do it without making any
> changes.

I don't understand?  There are relatively minor changes to the kernel to
support this.  Since nice is a property of the process, it makes sense
that there is only one ksegrp per process.  I'm starting to think that the
ksegrp was overkill in general.

> Specifically since My plan is to make the "KSE' structure go away..
> (by which I mean it is only going to be visible within the particular
> thread_scheduler that uses it and that externally
> the only structures visible would be :
> proc, ksegrp(subproc?) thread and upcall.

For M:N I really think this should be proc, thread, and upcall.
For 1:1 I only need proc and thread.

> The KSE would be allocated only by a call into the scheduler and is part
> of the "scheduler specific private data".
>
> i.e. on creation of a new process, shced_newproc() is called
> and a KSE is added in there is the scheduler in question wants to use
> KSEs. If it doesn't, no KSE would be added, but it's still possible that

Yes, I think we need more sched hooks here as well.  Having only
sched_fork() makes things sort of gross.  We'll have to hook this all up
later.

> some scheduler specific storage might be added. In the case
> of a new upcall being declared (kse_create() (to be renamed))
> sched_make_threaded() is called which adds KSEs to the KSEGRP
> (I was going to change it to be called a subprocess).
> KSEs are an accounting aid for the scheduler. A differnt scheduler may
> decide to put threads themselves onto the run queues which would
> make KSEs un-needed. (for example)
>
> >
> > The user land mutexes are supported by kernel code.  Uncontested acquires
> > and releases are done entirely in application space using atomic
> > instructions.  Once there is contention the library falls back to system
> > calls to handle the locks.  There are no per lock kernel resources
> > allocated.  There is a user space safe atomic cmpset function that has
> > been defined for x86 only at the moment.  New architectures require only
> > this function and the *context apis to run this threading package.  There
> > is no arch specific code in user space.
>
> This was discussed recently as being the highlight of someone's
> threading model (I think Linux but I am not sure who's).

Yes, linux was discussing this.  It's a pretty common trick.  Even NT does
it but apparently NT allocates kernel resources for user locks.  I was
pretty pleased that I got away without any per lock allocations.

> >
> > The condition variables and other blocking situations are handled with
> > sig*wait*() and a new signal, SIGTHR.  There are many reasons that we went
> > with a signal here.  If anyone cares to know them, you may ask.
> >
> > There are only 4 system calls for threading. thr_create, thr_self,
> > thr_exit, and thr_kill.  The rest of the functionality is implemented in a
> > library that has been heavily hacked up from the original libc_r.
> >
> > The reason we're doing this in parallel with the M:N effort is so that we
> > can have reasonable threading sooner.  As I stated before, this project is
> > complimentary to KSE and does not prohibit it from working.  I also think
> > that the performance will be better or comparable in the majority of real
> > applications.
>
> My only comment is that since mini is supposed to be doing the
> M:N library, isn't this a bit of a distraction?

I'll let him comment on this.

> >
> > The kernel bits are available at
> > http://www.chesapeake.net/~jroberson/thr.diff
>
> Please explain what this means:
> -       mask = td->td_proc->p_sigmask;
> +       mask = td->td_sigmask;
>
>
> how can you have a per thread mask?
> Signals are masked for the entire process..
> How do you keep them in sync with each other?

As per POSIX each thread has a signal mask.  There is a per process
sigaction but per thread mask and pending.  This has to be the case even
for M:N although some of it is hidden by the UTS.  libc_r even keeps per
thread pending and mask bits.

> -       if (p1->p_flag & P_THREADED) {
> +       if (p1->p_flag & P_THREADED || p1->p_numthreads > 1) {
>
> If you are running threads, please set the P_THREADED flag.
> if you wnat do differentiate between upcalling threads and 1:1
> threads, please use some auxhilliary flag.

I'd rather not have a flag.  The > 1 check is used only in places where we
have to suspend multiple threads or go to single threading etc.  Processes
in the 1:1 threading model aren't so special as they are with KSE.  They
don't need to be treated specially except when we're trying to funnel them
down etc.

> You should be creating a new KSEGRP (subproc) per thread.
> I think you will find that if you do, things will fall out easier
> and you won't break the next KSE changes.

I don't understand what I may break?

> >
> > I'd like to get the signal code commited asap.  It's the majority of the
> > patch and I often have to resolve conflicts.  There have been no
> > regressions in KSE or non threaded applications with this signal code.
>
> I'm not against having a separate 1:1 thread capability, but
> all this work could have been well spent getting M:N threads
> better supported and even getting it to
> be able to run in 1:1 mode a s a byproduct..

I don't think M:N is the way to go.  After looking things over and
considering where it is theoretically faster I do not think it is a
worthwhile pursuit.

First off, it is many months away from being even beta quality.  I think
the UTS is far more complicated than you may realize.  There are all sorts
of synchronization issues that it was able to avoid before since only one
thread could run at any time and there essentially was no preemption.  It
now also has to deal with effecient scheduling decisions in a M:N model
that it didn't have to worry about before.

Aside from that, there are numerous problems with the kernel not being
able to identify individual threads of execution.  Debugging, scheduling,
profiling, ktrace are all more difficult in a m:n environment.  I think it
is going to contribute to less effecient scheduling decisions over all.  I
have already wrestled with this in ULE.

I feel that this is an overwhelming amount of complexity.  Because of this
it will be buggy.  Sun claims that they still have open tickets on their
M:N while their new 1:1 implementation is totally bug free.  How long have
they been doing m:n?  I don't think that with our limited resources we're
going to be able to do better.

Furthermore, m:n's basic advantage is less overhead from staying out of
the kernel.  Also, less per thread resources.  I think this is bogus for a
couple of reasons.

First, if your application has more threads than cpus it is written
incorrectly.  For people who are doing thread pools instead of event
driven IO models they will encounter the same overhead with M:N as 1:1.
I'm not sure what applications are entirely compute and have more threads
than cpus.  These are the only ones which really theoretically benefit.  I
don't think our threading model should be designed to optimize poorly
thought out applications.

Furthermore, the amount of work done per slice has been growing with
processor speeds.  Slice time is adjusted for user experience and so it
remains constant.  This means that the constraints are different from when
this architecture started to come about many (10 or so?) years ago.
Trying to optimize context switches between threads just doesn't make
sense when you do so much work per slice.

Then if you look at the number of system calls and shenanigans a UTS must
do to make proper scheduling decisions it doesn't look like such an
advantage.  I feel that the overhead of all the layers comes close to the
savings from doing some of it without entering the kernel.

In short, even if it is marginally faster, it doesn't seem like it is
worth the effort and risk.  I don't want to discourage you from trying but
this is why I stopped working on KSE proper and pursued the 1:1 model.

Cheers,
Jeff

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 00:37:47 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 51FEF37B404; Wed, 26 Mar 2003 00:37:47 -0800 (PST)
Received: from mail.chesapeake.net (chesapeake.net [205.130.220.14])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 87F7B43FA3; Wed, 26 Mar 2003 00:37:46 -0800 (PST)
	(envelope-from jroberson@chesapeake.net)
Received: from localhost (jroberson@localhost)
	by mail.chesapeake.net (8.11.6/8.11.6) with ESMTP id h2Q8bkR78633;
	Wed, 26 Mar 2003 03:37:46 -0500 (EST)
	(envelope-from jroberson@chesapeake.net)
Date: Wed, 26 Mar 2003 03:37:46 -0500 (EST)
From: Jeff Roberson <jroberson@chesapeake.net>
To: David Xu <davidxu@freebsd.org>
In-Reply-To: <00f101c2f365$8de4e530$f001a8c0@davidw2k>
Message-ID: <20030326033727.Q64602-100000@mail.chesapeake.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-15.1 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REPLY_WITH_QUOTES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
Subject: Re: 1:1 Threading implementation.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 08:37:50 -0000
X-List-Received-Date: Wed, 26 Mar 2003 08:37:50 -0000

On Wed, 26 Mar 2003, David Xu wrote:

> I am reading the code, although not very understand
> your idea, but I found a problem, if a thread exits,
> some signals taken by the thread will be lost even
> the signal originally is not for the thread.
>
> David Xu

You're absolutely right.  Thanks.  I'll fix this.

Cheers,
Jeff

> ----- Original Message -----
> From: "Jeff Roberson" <jroberson@chesapeake.net>
> To: <arch@freebsd.org>
> Sent: Wednesday, March 26, 2003 11:52 AM
> Subject: Re: 1:1 Threading implementation.
>
>
> > I pooched the patch.  It's updated at the same web address.
> >
> > http://www.chesapeake.net/~jroberson/thr.diff
> >
> > Cheers,
> > Jeff
> >
> > _______________________________________________
> > freebsd-arch@freebsd.org mailing list
> > http://lists.freebsd.org/mailman/listinfo/freebsd-arch
> > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"
>

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 01:18:29 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 705D737B404
	for <arch@freebsd.org>; Wed, 26 Mar 2003 01:18:29 -0800 (PST)
Received: from skynet.stack.nl (skynet.stack.nl [131.155.140.225])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 8B30243F93
	for <arch@freebsd.org>; Wed, 26 Mar 2003 01:18:28 -0800 (PST)
	(envelope-from marcolz@stack.nl)
Received: by skynet.stack.nl (Postfix, from userid 65534)
	id 0D43B3E32; Wed, 26 Mar 2003 10:19:01 +0100 (CET)
Received: from turtle.stack.nl (turtle.stack.nl [2001:610:1108:5010::132])
	by skynet.stack.nl (Postfix) with ESMTP
	id B85843E2D; Wed, 26 Mar 2003 10:19:00 +0100 (CET)
Received: by turtle.stack.nl (Postfix, from userid 333)
	id DE2A11CC2D; Wed, 26 Mar 2003 10:18:26 +0100 (CET)
Date: Wed, 26 Mar 2003 10:18:26 +0100
From: Marc Olzheim <marcolz@stack.nl>
To: Jeff Roberson <jroberson@chesapeake.net>
Message-ID: <20030326091826.GA79113@stack.nl>
References: <Pine.BSF.4.21.0303252335280.22804-100000@InterJet.elischer.org>
	<20030326031245.O64602-100000@mail.chesapeake.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20030326031245.O64602-100000@mail.chesapeake.net>
X-Operating-System: FreeBSD turtle.stack.nl 5.0-CURRENT FreeBSD 5.0-CURRENT
X-URL: http://www.stack.nl/~marcolz/
User-Agent: Mutt/1.5.4i
X-Spam-Status: No, hits=-32.5 required=5.0
	tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES,
	      REPLY_WITH_QUOTES,USER_AGENT_MUTT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: kse@elischer.org
cc: Julian Elischer <julian@elischer.org>
Subject: Re: 1:1 Threading implementation.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 09:18:30 -0000

On Wed, Mar 26, 2003 at 03:36:57AM -0500, Jeff Roberson wrote:
> First, if your application has more threads than cpus it is written
> incorrectly.  For people who are doing thread pools instead of event
> driven IO models they will encounter the same overhead with M:N as 1:1.
> I'm not sure what applications are entirely compute and have more threads
> than cpus.  These are the only ones which really theoretically benefit.  I
> don't think our threading model should be designed to optimize poorly
> thought out applications.

Might I suggest that there are 'nice' C++ ways of using thread-classes
where both the usual C++ dogmas of readability and reuseability make you
easily end up with more threads than cpus...
I think that from a userland's point of view, most programmers shouldn't
be caring less about how many cpus the machine has their core is running
on.

With this (not limited to) C++ model in mind, the M:N way would be a
great thing to have.

Zlo

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 01:23:31 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id D8B8D37B404
	for <arch@freebsd.org>; Wed, 26 Mar 2003 01:23:31 -0800 (PST)
Received: from mail.chesapeake.net (chesapeake.net [205.130.220.14])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 15BE643FA3
	for <arch@freebsd.org>; Wed, 26 Mar 2003 01:23:31 -0800 (PST)
	(envelope-from jroberson@chesapeake.net)
Received: from localhost (jroberson@localhost)
	by mail.chesapeake.net (8.11.6/8.11.6) with ESMTP id h2Q9NLi96163;
	Wed, 26 Mar 2003 04:23:21 -0500 (EST)
	(envelope-from jroberson@chesapeake.net)
Date: Wed, 26 Mar 2003 04:23:21 -0500 (EST)
From: Jeff Roberson <jroberson@chesapeake.net>
To: Marc Olzheim <marcolz@stack.nl>
In-Reply-To: <20030326091826.GA79113@stack.nl>
Message-ID: <20030326042114.H64602-100000@mail.chesapeake.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-16.0 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REPLY_WITH_QUOTES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: kse@elischer.org
cc: Julian Elischer <julian@elischer.org>
Subject: Re: 1:1 Threading implementation.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 09:23:32 -0000

On Wed, 26 Mar 2003, Marc Olzheim wrote:

> On Wed, Mar 26, 2003 at 03:36:57AM -0500, Jeff Roberson wrote:
> > First, if your application has more threads than cpus it is written
> > incorrectly.  For people who are doing thread pools instead of event
> > driven IO models they will encounter the same overhead with M:N as 1:1.
> > I'm not sure what applications are entirely compute and have more threads
> > than cpus.  These are the only ones which really theoretically benefit.  I
> > don't think our threading model should be designed to optimize poorly
> > thought out applications.
>
> Might I suggest that there are 'nice' C++ ways of using thread-classes
> where both the usual C++ dogmas of readability and reuseability make you
> easily end up with more threads than cpus...
> I think that from a userland's point of view, most programmers shouldn't
> be caring less about how many cpus the machine has their core is running
> on.

Sure, but in these cases you're not likely to be using them in performance
critical code.  Which means you're not likely to be using all of the cpu..
Which means you're going to have to go block in the kernel anyway.  And
so, really what we're talking about is wasted memory here.  Not even many
cpu cycles.

I think people who actually care about performance don't want the M:N
overhead.  1:1 will be faster for them.

For the rest, well, they didn't care about performance and so why should
we work so hard to make it marginally faster for them?

> With this (not limited to) C++ model in mind, the M:N way would be a
> great thing to have.
>
> Zlo
>

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 01:33:59 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 9542F37B404
	for <arch@FreeBSD.ORG>; Wed, 26 Mar 2003 01:33:59 -0800 (PST)
Received: from park.rambler.ru (park.rambler.ru [81.19.64.101])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 2663843F85
	for <arch@FreeBSD.ORG>; Wed, 26 Mar 2003 01:33:58 -0800 (PST)
	(envelope-from is@rambler-co.ru)
Received: from is.park.rambler.ru (is.park.rambler.ru [81.19.64.102])
	by park.rambler.ru (8.12.6/8.12.6) with ESMTP id h2Q9XqmF056582;
	Wed, 26 Mar 2003 12:33:52 +0300 (MSK)
Date: Wed, 26 Mar 2003 12:33:52 +0300 (MSK)
From: Igor Sysoev <is@rambler-co.ru>
X-Sender: is@is
To: Jeff Roberson <jroberson@chesapeake.net>
In-Reply-To: <20030325214028.K64602-100000@mail.chesapeake.net>
Message-ID: <Pine.BSF.4.21.0303261158180.5080-100000@is>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-25.3 required=5.0
	tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REPLY_WITH_QUOTES,USER_AGENT_PINE
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@FreeBSD.ORG
Subject: Re: 1:1 Threading implementation.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 09:34:00 -0000

On Tue, 25 Mar 2003, Jeff Roberson wrote:

> I realize that many people have strong feelings on this topic.  I'm asking
> everyone up front to try not to devolve this thread into a bikeshed.
> 
> Thanks to the foundation provided by Julian, David Xu, Mini, Dan Eischen,
> and everyone else who has participated with KSE and libpthread development
> Mini and I have developed a 1:1 threading implementation.  This code works
> in parallel with KSE and does not break it in any way.  It actually helps
> bring M:N threading closer by testing out shared bits.

I'm very glad to see two kind of the kernel supported threads in FreeBSD.

> The condition variables and other blocking situations are handled with
> sig*wait*() and a new signal, SIGTHR.  There are many reasons that we went
> with a signal here.  If anyone cares to know them, you may ask.

I ask :)

> There are only 4 system calls for threading. thr_create, thr_self,
> thr_exit, and thr_kill.  The rest of the functionality is implemented in a
> library that has been heavily hacked up from the original libc_r.

I think thr_create() should have a optional capability to create a thread's
stack.  This allow to save one syscall because otherwise you need to call
mmap() or malloc()/sbrk() before the_create().

I think that thr_self() should be implemented in the user land.  It's used
in pthread_getspecific(), pthread_setspecific(), and gcc3's __thread attribute
and can be used very often and should be very cheap.

Solaris uses gs register on x86 and %g7 register on Sparc.
Linux also uses gs register on x86, other platforms implementation details
can be found here - http://people.redhat.com/drepper/tls.pdf
Win32 and OS/2 use fs register.

As far as I know FreeBSD 4.x uses gs to proc in a kernel and 5.x uses fs
for some per-CPU data in a kernel.  I think we should use one of these
register to point to the thread specific data in the user land.

> I'd like to get the signal code commited asap.  It's the majority of the
> patch and I often have to resolve conflicts.  There have been no
> regressions in KSE or non threaded applications with this signal code.

Did this signal code supports siginfo ?
FreeBSD 4.x fills zeros in the most siginfo's fileds.


Igor Sysoev
http://sysoev.ru/en/

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 01:49:20 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 226DB37B404
	for <arch@freebsd.org>; Wed, 26 Mar 2003 01:49:20 -0800 (PST)
Received: from heron.mail.pas.earthlink.net (heron.mail.pas.earthlink.net
	[207.217.120.189])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 2A5EE43F3F
	for <arch@freebsd.org>; Wed, 26 Mar 2003 01:49:19 -0800 (PST)
	(envelope-from tlambert2@mindspring.com)
Received: from pool0122.cvx21-bradley.dialup.earthlink.net ([209.179.192.122]
	helo=mindspring.com)
	by heron.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128)
	(Exim 3.33 #1)	id 18y7Wt-0001PX-00; Wed, 26 Mar 2003 01:49:08 -0800
Message-ID: <3E817735.A388A41C@mindspring.com>
Date: Wed, 26 Mar 2003 01:47:33 -0800
From: Terry Lambert <tlambert2@mindspring.com>
X-Mailer: Mozilla 4.79 [en] (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Jeff Roberson <jroberson@chesapeake.net>
References: <20030326031245.O64602-100000@mail.chesapeake.net>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a4d0af189ff8cbe69aee10b1ddebe4d101350badd9bab72f9c350badd9bab72f9c350badd9bab72f9c
X-Spam-Status: No, hits=-21.7 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,QUOTED_EMAIL_TEXT,QUOTE_TWICE_1,
	      RCVD_IN_OSIRUSOFT_COM,REFERENCES,REPLY_WITH_QUOTES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: kse@elischer.org
cc: Julian Elischer <julian@elischer.org>
Subject: Re: 1:1 Threading implementation.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 09:49:22 -0000

Jeff Roberson wrote:
> Well, I wasn't doing userland stuff until three days ago.  I think mini
> has just been very busy with work.  I suspect that you're going to need
> to start doing userland work or find someone to do it if you want to get
> it done soon.

In theory, the library API will be identical to the pthreads
standard, and will require no changes to programs written to
that standard.  Most threaded programs these days are written
to the standard.  Some threaded programs make invalid assumptions
about rescheduling following an involuntary context switch, or
ability to make particular blocking calls.

The first will still be a problem (e.g. Netscape's Java/JavaScript
GIF rendering engine is probably still serializing requests due to
non-thread reentrancy).

The second should not be an issue, either with your implementation
of the 1:1, or Jon Mini's implemenetation of the N:M model.


> > Wouldn't it have been easier to have one KSEGRP+KSE+thread per user
> > thread? Having one ksegrp and many KSEs requires changing the kernel
> > code where doing it the other way you could do it without making any
> > changes.
> 
> I don't understand?  There are relatively minor changes to the kernel to
> support this.  Since nice is a property of the process, it makes sense
> that there is only one ksegrp per process.  I'm starting to think that the
> ksegrp was overkill in general.

The KSEGRP is, effectively, a virtual processor interface, and
was/is intended for use by the scheduler to ensure CPU affinity
for individual threads, and CPU negaffinity for multiple threads
within a process.  In other words, according to the published
design documents, it's a scheduler artifact.

Personally, I've never seen the need for virtual processors, but
then I've always advocated "intentional start/intentional migration"
for the scheduler model (one of my arguments for a per CPU migration
queue, and a push-model, rather than a pull-model for redistribution
of an unbalanced load).

In a scheduler model where a sheduler *pulls* work, either from
another CPU ready-to-run queue, or from a single ready-to-run
queue that is global to the system (in either case, requiring
locks in the scheduler path, potentially highly contended locks),
the idea of a KSEGRP/"virtual processor" is necessary for globally
migratable and contendable "bookkeeping" objects.

So in the current scheduler implementations, KSEGRP is necessary;
in the 1:1 model, it's necessary, if only to ensure negaffinity
(4 CPU system, process with 4 threads, ensure each thread gets its
own CPU, and does not migrate away from it).

It's also minorly useful to distinguish PTHREAD_SCOPE_SYSTEM
priority groups, when running multiple threads on a single CPU
(either in the common single CPU case, or in the less common SMP
case), as a means of avoiding priority inversion deadlocks.  I
would like to see this done differently, which would get rid of
KSEGRP, but would add a scheduler architecture dependency, which
I think can't be gotten rid of easily.  It's a tradeoff (as usual).


> > i.e. on creation of a new process, shced_newproc() is called
> > and a KSE is added in there is the scheduler in question wants to use
> > KSEs. If it doesn't, no KSE would be added, but it's still possible that
> 
> Yes, I think we need more sched hooks here as well.  Having only
> sched_fork() makes things sort of gross.  We'll have to hook this all up
> later.


You could also take this idea much further.  Specifically, SVR4
flags system calls as "non-blocking", "blocking", and "potentially
blocking".  By doing this, they can lazy-bind context creation for
blocking operations on "blocking" and "potentially blocking" calls,
and avoid it altogether on "non-blocking" and sometimes avoid it on
"potentially blocking" calls.

This can result in a significant overhead savings, if the kernel
implementation evolves, but the user space implementation remains
fixed.

It's good to decouple these things from each other (IMO).


> > This was discussed recently as being the highlight of someone's
> > threading model (I think Linux but I am not sure who's).
> 
> Yes, linux was discussing this.  It's a pretty common trick.  Even NT does
> it but apparently NT allocates kernel resources for user locks.  I was
> pretty pleased that I got away without any per lock allocations.

Everyone does this.  Novell did it back in 1993.  Sun's turnstiles
are based on the tradeoff between spinning and waiting, and how
many times you have to do that before it's worth crossing the
protection domain, and blocking.

When we did this in 1993 (Novell's implementation was primarily
by Dave Hefner, who now works for Microsoft, I believe), we ended
up with 20,000 times the transcation per second performance of
Tuxedo, which was the commercial record holder up to that point.


> > > The reason we're doing this in parallel with the M:N effort is so that we
> > > can have reasonable threading sooner.  As I stated before, this project is
> > > complimentary to KSE and does not prohibit it from working.  I also think
> > > that the performance will be better or comparable in the majority of real
> > > applications.
> >
> > My only comment is that since mini is supposed to be doing the
> > M:N library, isn't this a bit of a distraction?
> 
> I'll let him comment on this.

I'll stick my nose in: I think it's a good idea, since TPTB have
recently made noises on a couple of FreeBSD lists about "rapidly
approaching deadlines for the KSE work".

Consider it insurance on your investment, people.


> > You should be creating a new KSEGRP (subproc) per thread.
> > I think you will find that if you do, things will fall out easier
> > and you won't break the next KSE changes.
> 
> I don't understand what I may break?

See above for KSEGRP reasoning.  I think it's representative,
but, if you have time, you may want to read the documentation
for the KSE project.  If other people want to comment or correct
my own comments in this regard (I have been largely an observer,
since after the second threads meeting where my async call gate
idea was brutally .. uh, "laid to rest" ;^)), they should feel
free to do so.


> > I'm not against having a separate 1:1 thread capability, but
> > all this work could have been well spent getting M:N threads
> > better supported and even getting it to
> > be able to run in 1:1 mode a s a byproduct..
> 
> I don't think M:N is the way to go.  After looking things over and
> considering where it is theoretically faster I do not think it is a
> worthwhile pursuit.
> 
> First off, it is many months away from being even beta quality.  I think
> the UTS is far more complicated than you may realize.  There are all sorts
> of synchronization issues that it was able to avoid before since only one
> thread could run at any time and there essentially was no preemption.  It
> now also has to deal with effecient scheduling decisions in a M:N model
> that it didn't have to worry about before.

I would not recommend abandoning the idea, personally.  There is a
huge -- and I mean *huge* -- amount of literature that likes the
N:M model.

There is also the fact that affinity and quantum are very hard to
maintain on a system with a heterogeneous load.  In other words,
1:1 looks good if the only thing you are running is a single
multithreaded proces, but looks a *lot* less good when you start
running real-world code instead of fictitious benchmarks that
try to make your threading look good (e.g. measuring only thread
context switches, with no process context switch stall barriers,
etc.).


> I feel that this is an overwhelming amount of complexity.  Because of this
> it will be buggy.  Sun claims that they still have open tickets on their
> M:N while their new 1:1 implementation is totally bug free.  How long have
> they been doing m:n?  I don't think that with our limited resources we're
> going to be able to do better.

You can't schedule resources.  They will work on what they want
to, and let anything they don't like just sit there and "rot".

The Sun claims are really specious, IMO.  They have it working,
but how does it compare to, say multiple processes that are
sharing descriptor tables, and not much else, in a work-to-do
model?

I can tell you from personal experience with such a model, that
it *VASTLY* outperforms a 1:1 kernel threading model, even if you
end up running multiple state-machine instances on multiple CPUs.
We got more than a 120X increase in NetWare for UNIX, simply by
changing the client dispatch streams MUX to dispatch to worker
processes instead of threads, in LIFO instead of FIFO order,
simply because it ensured that the process pages you cared about
were more likely to be in core.

1:1 threading is useful for one thing, and one thing only: SMP
scalability of single image processes.  And it's not the best at
doing that.

> Furthermore, m:n's basic advantage is less overhead from staying out of
> the kernel.

No, actually, it's the ability to fully utilize a quantum, and
to not have to make a decision between one of your own threads
and some other process, off the run queue, when making a decision
in the scheduler about what to run next.

If you favor the threads in your own process, then you potentially
starve other processes.

If you favor neither, and treat them like processes, you get none
of the supposed context switch benefits that were supposedly going
to result from using threads instead of processes in the first place.


> First, if your application has more threads than cpus it is written
> incorrectly.

This depends on what those threads are doing.  If they are all doing
the same work, then yes, you are right.  If they are doing different
jobs, then you are wrong; even if most of them are doing the same job,
and a few of them are doing different jobs, you are still wrong, since
job-loading is unlikely to be the same between threads.


> For people who are doing thread pools instead of event driven IO
> models they will encounter the same overhead with M:N as 1:1.

This is actually false.  In 1:1, your thread competes with all
other processes, in order to be the next at the top of the run
queue.  Statitically, you are doing more TLB flushes and shootdowns,
and more L1 and L2 cache chootdowns, than you would otherwise.

Solving this problem without intentional scheduling has been
proben to be N-P incomplete: it is not a problem which is
solvable in polonomyial time.


> I'm not sure what applications are entirely compute and have more threads
> than cpus.  These are the only ones which really theoretically benefit.  I
> don't think our threading model should be designed to optimize poorly
> thought out applications.

By that argument, threads should not be supported at all... 8-) 8-).


> This means that the constraints are different from when
> this architecture started to come about many (10 or so?) years ago.
> Trying to optimize context switches between threads just doesn't make
> sense when you do so much work per slice.

5/6 years ago, depending on who you ask.

But by your same arguments, CPU clock multipliers have grown
to the point that memory bus and I/O bus stalls are so
expensive that SMP makes no sense.


> Then if you look at the number of system calls and shenanigans a UTS must
> do to make proper scheduling decisions it doesn't look like such an
> advantage.

I agree with this one; my original model avoided the problem
entirely by making the POSIX blocking call behaviour a library
on to of an sync kernel interface.  By doing this, kernel
boundary crossings could be minimized automatically.

The pthreads code as it has existed so far, also does a lot of
unecessary kernel boundary crossings in order to handle signal
masking.  In fact, you could establish an intermediate handler
for all signals at the user threads scheduler level, and never
have to worry about most of that crap.

I think the kernel boundary crossing overhead, and the fact
that, in doing so, you tend to relinquish a significant
fraction of remaining quantum (by your own arguments) says
that protection domain crossings are to be avoided at all costs.


> In short, even if it is marginally faster, it doesn't seem like it is
> worth the effort and risk.  I don't want to discourage you from trying but
> this is why I stopped working on KSE proper and pursued the 1:1 model.

I'm glad you pursued it, even though I do not agree with your
reasoning on the value of N:M vs. 1:1.  I view it as "life
insurance" for the KSE code, which some people might be
otherwise tempted to rip out over some arbitrary deadline.

Thank you for your work here, and thank everyone else for
their work, too.

-- Terry

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 02:45:21 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 33C1F37B404
	for <arch@freebsd.org>; Wed, 26 Mar 2003 02:45:21 -0800 (PST)
Received: from park.rambler.ru (park.rambler.ru [81.19.64.101])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 704EA43F75
	for <arch@freebsd.org>; Wed, 26 Mar 2003 02:45:19 -0800 (PST)
	(envelope-from is@rambler-co.ru)
Received: from is.park.rambler.ru (is.park.rambler.ru [81.19.64.102])
	by park.rambler.ru (8.12.6/8.12.6) with ESMTP id h2QAj2mF058555;
	Wed, 26 Mar 2003 13:45:02 +0300 (MSK)
Date: Wed, 26 Mar 2003 13:45:02 +0300 (MSK)
From: Igor Sysoev <is@rambler-co.ru>
X-Sender: is@is
To: Jeff Roberson <jroberson@chesapeake.net>
In-Reply-To: <20030326031245.O64602-100000@mail.chesapeake.net>
Message-ID: <Pine.BSF.4.21.0303261335410.5080-100000@is>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-25.8 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      QUOTE_TWICE_1,REPLY_WITH_QUOTES,USER_AGENT_PINE
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: kse@elischer.org
cc: Julian Elischer <julian@elischer.org>
Subject: Re: 1:1 Threading implementation.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 10:45:22 -0000

On Wed, 26 Mar 2003, Jeff Roberson wrote:

> > > What this means is that for every pthread in an application there is one
> > > KSE and thread.  There is also only one ksegroup per proc in this model.
> > > Since the kernel knows about all threads it handles all scheduling
> > > decisions and all signal delivery.  I have followed the POSIX spec while
> > > implementing the signal code.  I would really appreciate review from
> > > anyone who is intimately familiar with signals and threads.  Included in
> > > this is an implementation of sigwait(), sigtimedwait(), and sigwaitinfo().
> >
> > Wouldn't it have been easier to have one KSEGRP+KSE+thread per user
> > thread? Having one ksegrp and many KSEs requires changing the kernel
> > code where doing it the other way you could do it without making any
> > changes.
> 
> I don't understand?  There are relatively minor changes to the kernel to
> support this.  Since nice is a property of the process, it makes sense
> that there is only one ksegrp per process.  I'm starting to think that the
> ksegrp was overkill in general.

As I understand all KSEs in one KSEGRP have the same priority.
If you need several thread priority inside a process you need several
KSEGRPs so Julian's suggestion is better.

As far as I know KSEGRP has two orthogonal features:
1) it limits number of KSEs to number of CPU;
2) and it set KSE priority.


Igor Sysoev
http://sysoev.ru/en/


From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 02:58:00 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id EE33637B404
	for <arch@freebsd.org>; Wed, 26 Mar 2003 02:58:00 -0800 (PST)
Received: from mail.chesapeake.net (chesapeake.net [205.130.220.14])
	by mx1.FreeBSD.org (Postfix) with ESMTP id E298943F3F
	for <arch@freebsd.org>; Wed, 26 Mar 2003 02:57:59 -0800 (PST)
	(envelope-from jroberson@chesapeake.net)
Received: from localhost (jroberson@localhost)
	by mail.chesapeake.net (8.11.6/8.11.6) with ESMTP id h2QAvtE36962;
	Wed, 26 Mar 2003 05:57:55 -0500 (EST)
	(envelope-from jroberson@chesapeake.net)
Date: Wed, 26 Mar 2003 05:57:55 -0500 (EST)
From: Jeff Roberson <jroberson@chesapeake.net>
To: Terry Lambert <tlambert2@mindspring.com>
In-Reply-To: <3E817735.A388A41C@mindspring.com>
Message-ID: <20030326053115.T64602-100000@mail.chesapeake.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-16.9 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      QUOTE_TWICE_1,REPLY_WITH_QUOTES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: kse@elischer.org
cc: Julian Elischer <julian@elischer.org>
Subject: Re: 1:1 Threading implementation.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 10:58:03 -0000

On Wed, 26 Mar 2003, Terry Lambert wrote:

> Jeff Roberson wrote:
> > Well, I wasn't doing userland stuff until three days ago.  I think mini
> > has just been very busy with work.  I suspect that you're going to need
> > to start doing userland work or find someone to do it if you want to get
> > it done soon.
>
> In theory, the library API will be identical to the pthreads
> standard, and will require no changes to programs written to
> that standard.  Most threaded programs these days are written
> to the standard.  Some threaded programs make invalid assumptions
> about rescheduling following an involuntary context switch, or
> ability to make particular blocking calls.

I'm not sure what API compatibility has to do with anything?

> The first will still be a problem (e.g. Netscape's Java/JavaScript
> GIF rendering engine is probably still serializing requests due to
> non-thread reentrancy).
>
> The second should not be an issue, either with your implementation
> of the 1:1, or Jon Mini's implemenetation of the N:M model.

I'm not sure I know what you're talking about.  Blocking calls are either
handled by an upcall in M:N or by having independent contexts in 1:1.

>
>
> > > Wouldn't it have been easier to have one KSEGRP+KSE+thread per user
> > > thread? Having one ksegrp and many KSEs requires changing the kernel
> > > code where doing it the other way you could do it without making any
> > > changes.
> >
> > I don't understand?  There are relatively minor changes to the kernel to
> > support this.  Since nice is a property of the process, it makes sense
> > that there is only one ksegrp per process.  I'm starting to think that the
> > ksegrp was overkill in general.
>
> The KSEGRP is, effectively, a virtual processor interface, and
> was/is intended for use by the scheduler to ensure CPU affinity
> for individual threads, and CPU negaffinity for multiple threads
> within a process.  In other words, according to the published
> design documents, it's a scheduler artifact.

This is the KSE, not the KSE group.  The KSE Group was intended to allow
multiple groups of threads with different scheduling algorithms or
different base priorities (nice).

> Personally, I've never seen the need for virtual processors, but
> then I've always advocated "intentional start/intentional migration"
> for the scheduler model (one of my arguments for a per CPU migration
> queue, and a push-model, rather than a pull-model for redistribution
> of an unbalanced load).

The push model suffers from as much as a one tick latency in any
migration.  In many cases probably more than that.  The overhead from
locking queues is far out weighed by the cases where your cpu is sitting
idle due to an unbalanced load.

The latency is one tick because each cpu would have to poll the load of
other cpus at some interval to discover that it is out of balance.  Or it
could check if it had more than one process on the run queue, which seems
a bit silly.  Regardless, you're probably only going to get to make this
decisions once a tick which means the other cpu(s) can sit idle for at
least that long.

Consider a buildworld -j8.  You have many processes rapidly stoping and
starting.  Without a pull a cpu that was very loaded could suddenly end up
with no running processes and have to idle until the other gave it work.
This imbalance is likely to go back in forth, I have observed this
personally when writing ULE.

I think you need both push and pull.  The pull satisfies the case where
you have short lived but rapidly reappearing processes.  The push
solves more long term load imbalance issues.  If you have, for example,
many apache processes that are very busy.  no cpu will go idle, so pull is
ineffective, but they may still be imbalanced.  This is still missing from
ULE.

> In a scheduler model where a sheduler *pulls* work, either from
> another CPU ready-to-run queue, or from a single ready-to-run
> queue that is global to the system (in either case, requiring
> locks in the scheduler path, potentially highly contended locks),
> the idea of a KSEGRP/"virtual processor" is necessary for globally
> migratable and contendable "bookkeeping" objects.

They should only be contended when cpus have nothing to do.  A worthwhile
tradeoff I'd say.

> So in the current scheduler implementations, KSEGRP is necessary;
> in the 1:1 model, it's necessary, if only to ensure negaffinity
> (4 CPU system, process with 4 threads, ensure each thread gets its
> own CPU, and does not migrate away from it).

You're talking about the KSE again.  I think CPU affinity has little to do
with the M:N or 1:1 choice except that it is much more difficult to
achieve CPU affinity when you have to make a multitiered scheduling
decision.  To get real affinity in M:N you need kse to cpu affinity and
thread to kse affinity.  You also the need userland thread to kernel
thread affinity, or at least user land thread to KSE affinity.

> It's also minorly useful to distinguish PTHREAD_SCOPE_SYSTEM
> priority groups, when running multiple threads on a single CPU
> (either in the common single CPU case, or in the less common SMP
> case), as a means of avoiding priority inversion deadlocks.  I
> would like to see this done differently, which would get rid of
> KSEGRP, but would add a scheduler architecture dependency, which
> I think can't be gotten rid of easily.  It's a tradeoff (as usual).
>
>
> > > i.e. on creation of a new process, shced_newproc() is called
> > > and a KSE is added in there is the scheduler in question wants to use
> > > KSEs. If it doesn't, no KSE would be added, but it's still possible that
> >
> > Yes, I think we need more sched hooks here as well.  Having only
> > sched_fork() makes things sort of gross.  We'll have to hook this all up
> > later.
>
>
> You could also take this idea much further.  Specifically, SVR4
> flags system calls as "non-blocking", "blocking", and "potentially
> blocking".  By doing this, they can lazy-bind context creation for
> blocking operations on "blocking" and "potentially blocking" calls,
> and avoid it altogether on "non-blocking" and sometimes avoid it on
> "potentially blocking" calls.

KSE already does better than this by only creating a new context when you
actually block.  The upcall mechanism specifically addresses that need.
This is seperate from what we were discussing above which is allowing the
scheduler to have a chance to initialize data when a new context is
created.

> This can result in a significant overhead savings, if the kernel
> implementation evolves, but the user space implementation remains
> fixed.
>
> It's good to decouple these things from each other (IMO).

Which things?

>
> > > This was discussed recently as being the highlight of someone's
> > > threading model (I think Linux but I am not sure who's).
> >
> > Yes, linux was discussing this.  It's a pretty common trick.  Even NT does
> > it but apparently NT allocates kernel resources for user locks.  I was
> > pretty pleased that I got away without any per lock allocations.
>
> Everyone does this.  Novell did it back in 1993.  Sun's turnstiles
> are based on the tradeoff between spinning and waiting, and how
> many times you have to do that before it's worth crossing the
> protection domain, and blocking.

I think you mean sun's adaptive mutexes.  The turnstile is just the
queue that you block on if I'm remembering correctly.  The blocking queue
I used for umtx is a similar context where the queue migrates among the
blocking threads.

> When we did this in 1993 (Novell's implementation was primarily
> by Dave Hefner, who now works for Microsoft, I believe), we ended
Any relation to hugh?

> up with 20,000 times the transcation per second performance of
> Tuxedo, which was the commercial record holder up to that point.

Sounds good.

>
> > > > The reason we're doing this in parallel with the M:N effort is so that we
> > > > can have reasonable threading sooner.  As I stated before, this project is
> > > > complimentary to KSE and does not prohibit it from working.  I also think
> > > > that the performance will be better or comparable in the majority of real
> > > > applications.
> > >
> > > My only comment is that since mini is supposed to be doing the
> > > M:N library, isn't this a bit of a distraction?
> >
> > I'll let him comment on this.
>
> I'll stick my nose in: I think it's a good idea, since TPTB have
> recently made noises on a couple of FreeBSD lists about "rapidly
> approaching deadlines for the KSE work".
>
> Consider it insurance on your investment, people.

Yes, it isn't necessarily a KSE replacement.

>
> > > You should be creating a new KSEGRP (subproc) per thread.
> > > I think you will find that if you do, things will fall out easier
> > > and you won't break the next KSE changes.
> >
> > I don't understand what I may break?
>
> See above for KSEGRP reasoning.  I think it's representative,
> but, if you have time, you may want to read the documentation
> for the KSE project.  If other people want to comment or correct
> my own comments in this regard (I have been largely an observer,
> since after the second threads meeting where my async call gate
> idea was brutally .. uh, "laid to rest" ;^)), they should feel
> free to do so.
>
>
> > > I'm not against having a separate 1:1 thread capability, but
> > > all this work could have been well spent getting M:N threads
> > > better supported and even getting it to
> > > be able to run in 1:1 mode a s a byproduct..
> >
> > I don't think M:N is the way to go.  After looking things over and
> > considering where it is theoretically faster I do not think it is a
> > worthwhile pursuit.
> >
> > First off, it is many months away from being even beta quality.  I think
> > the UTS is far more complicated than you may realize.  There are all sorts
> > of synchronization issues that it was able to avoid before since only one
> > thread could run at any time and there essentially was no preemption.  It
> > now also has to deal with effecient scheduling decisions in a M:N model
> > that it didn't have to worry about before.
>
> I would not recommend abandoning the idea, personally.  There is a
> huge -- and I mean *huge* -- amount of literature that likes the
> N:M model.
>
> There is also the fact that affinity and quantum are very hard to
> maintain on a system with a heterogeneous load.  In other words,
> 1:1 looks good if the only thing you are running is a single
> multithreaded proces, but looks a *lot* less good when you start
> running real-world code instead of fictitious benchmarks that
> try to make your threading look good (e.g. measuring only thread
> context switches, with no process context switch stall barriers,
> etc.).

Yes, I see what you're getting at.  M:N allows you to keep running until
you've exhausted your whole slice by selecting another thread.  You could
acomplish this in 1:1 by loaning your slice to the next available thread
that was bound to the same cpu and force a switch to that.  That's a neat
idea.  I'll have to look into this for ule.

>
> > I feel that this is an overwhelming amount of complexity.  Because of this
> > it will be buggy.  Sun claims that they still have open tickets on their
> > M:N while their new 1:1 implementation is totally bug free.  How long have
> > they been doing m:n?  I don't think that with our limited resources we're
> > going to be able to do better.
>
> You can't schedule resources.  They will work on what they want
> to, and let anything they don't like just sit there and "rot".
>
> The Sun claims are really specious, IMO.  They have it working,
> but how does it compare to, say multiple processes that are
> sharing descriptor tables, and not much else, in a work-to-do
> model?
>
> I can tell you from personal experience with such a model, that
> it *VASTLY* outperforms a 1:1 kernel threading model, even if you
> end up running multiple state-machine instances on multiple CPUs.
> We got more than a 120X increase in NetWare for UNIX, simply by
> changing the client dispatch streams MUX to dispatch to worker
> processes instead of threads, in LIFO instead of FIFO order,
> simply because it ensured that the process pages you cared about
> were more likely to be in core.

Yeah, the LIFO trick is widely used.  I believe apache does something of
this sort.  It's also discussed on the c10k problem page.  I'm not sure
why you got better perf out of processes than threads though.  This is
sort of confusing.

> 1:1 threading is useful for one thing, and one thing only: SMP
> scalability of single image processes.  And it's not the best at
> doing that.

It's also good at providing extra contexts to block on for IO
worker threads.  Furthermore, It's really good at being implemented
quickly, which is especially important considering that it's 2003 and we
don't have kernel supported threads...

> > Furthermore, m:n's basic advantage is less overhead from staying out of
> > the kernel.
>
> No, actually, it's the ability to fully utilize a quantum, and
> to not have to make a decision between one of your own threads
> and some other process, off the run queue, when making a decision
> in the scheduler about what to run next.

Yeah, I just remembered this bit.  See my answer above.  I think I'll do
this trick in ULE.

> If you favor the threads in your own process, then you potentially
> starve other processes.
>
> If you favor neither, and treat them like processes, you get none
> of the supposed context switch benefits that were supposedly going
> to result from using threads instead of processes in the first place.
>
>
> > First, if your application has more threads than cpus it is written
> > incorrectly.
>
> This depends on what those threads are doing.  If they are all doing
> the same work, then yes, you are right.  If they are doing different
> jobs, then you are wrong; even if most of them are doing the same job,
> and a few of them are doing different jobs, you are still wrong, since
> job-loading is unlikely to be the same between threads.
>
>
> > For people who are doing thread pools instead of event driven IO
> > models they will encounter the same overhead with M:N as 1:1.
>
> This is actually false.  In 1:1, your thread competes with all
> other processes, in order to be the next at the top of the run
> queue.  Statitically, you are doing more TLB flushes and shootdowns,
> and more L1 and L2 cache chootdowns, than you would otherwise.

This is the same argument about using your whole slice eh?

> Solving this problem without intentional scheduling has been
> proben to be N-P incomplete: it is not a problem which is
> solvable in polonomyial time.

eh? Which problem is NP?

>
> > I'm not sure what applications are entirely compute and have more threads
> > than cpus.  These are the only ones which really theoretically benefit.  I
> > don't think our threading model should be designed to optimize poorly
> > thought out applications.
>
> By that argument, threads should not be supported at all... 8-) 8-).

I meant to say 'entirely compute bound'.  If you just want CPU and no IO
then you probably only want as many threads as processors.  This is the
most effecient arrangement.  I'm not arguing against threads although I do
think they are often abused.

>
> > This means that the constraints are different from when
> > this architecture started to come about many (10 or so?) years ago.
> > Trying to optimize context switches between threads just doesn't make
> > sense when you do so much work per slice.
>
> 5/6 years ago, depending on who you ask.
>
> But by your same arguments, CPU clock multipliers have grown
> to the point that memory bus and I/O bus stalls are so
> expensive that SMP makes no sense.

I migh agree with you there.

>
> > Then if you look at the number of system calls and shenanigans a UTS must
> > do to make proper scheduling decisions it doesn't look like such an
> > advantage.
>
> I agree with this one; my original model avoided the problem
> entirely by making the POSIX blocking call behaviour a library
> on to of an sync kernel interface.  By doing this, kernel
> boundary crossings could be minimized automatically.
>
> The pthreads code as it has existed so far, also does a lot of
> unecessary kernel boundary crossings in order to handle signal
> masking.  In fact, you could establish an intermediate handler
> for all signals at the user threads scheduler level, and never
> have to worry about most of that crap.
>
> I think the kernel boundary crossing overhead, and the fact
> that, in doing so, you tend to relinquish a significant
> fraction of remaining quantum (by your own arguments) says
> that protection domain crossings are to be avoided at all costs.

Yes, I agree, and without serious tweaking our current M:N significantly
increases the number of system calls.

>
> > In short, even if it is marginally faster, it doesn't seem like it is
> > worth the effort and risk.  I don't want to discourage you from trying but
> > this is why I stopped working on KSE proper and pursued the 1:1 model.
>
> I'm glad you pursued it, even though I do not agree with your
> reasoning on the value of N:M vs. 1:1.  I view it as "life
> insurance" for the KSE code, which some people might be
> otherwise tempted to rip out over some arbitrary deadline.
>
> Thank you for your work here, and thank everyone else for
> their work, too.
>
> -- Terry
>

Thanks for the feedback.  It has been stimulating.  I still need to
consider multithreading implications of 1:1 for ULE.  This has given me a
bit more to work on there.

Cheers,
Jeff

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 06:31:22 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 40E9437B404
	for <arch@freebsd.org>; Wed, 26 Mar 2003 06:31:22 -0800 (PST)
Received: from fledge.watson.org (fledge.watson.org [204.156.12.50])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 0532743F75
	for <arch@freebsd.org>; Wed, 26 Mar 2003 06:31:21 -0800 (PST)
	(envelope-from robert@fledge.watson.org)
Received: from fledge.watson.org (fledge.pr.watson.org [192.0.2.3])
	by fledge.watson.org (8.12.8/8.12.8) with SMTP id h2QEVDjK017662;
	Wed, 26 Mar 2003 09:31:14 -0500 (EST)
	(envelope-from robert@fledge.watson.org)
Date: Wed, 26 Mar 2003 09:31:13 -0500 (EST)
From: Robert Watson <rwaston@freebsd.org>
X-Sender: robert@fledge.watson.org
To: Jeff Roberson <jroberson@chesapeake.net>
In-Reply-To: <20030325214028.K64602-100000@mail.chesapeake.net>
Message-ID: <Pine.NEB.3.96L.1030326091839.3931I-100000@fledge.watson.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-25.3 required=5.0
	tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REPLY_WITH_QUOTES,USER_AGENT_PINE
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
Subject: Re: 1:1 Threading implementation.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 14:31:23 -0000


On Tue, 25 Mar 2003, Jeff Roberson wrote:

> Thanks to the foundation provided by Julian, David Xu, Mini, Dan
> Eischen, and everyone else who has participated with KSE and libpthread
> development Mini and I have developed a 1:1 threading implementation. 
> This code works in parallel with KSE and does not break it in any way. 
> It actually helps bring M:N threading closer by testing out shared bits. 

My feeling is that this is an excellent strategy to get us productionable
kernel-supported threads for the upcoming 5.x release while permitting
continued R&D (and I think it is R&D) into the M:N threading
possibilities.  One nice thing about this construction is that the cost
was very low given the existing investment in KSE, yet the payoff is very
high.  And it will provide a nice migration path when KSE is
productionable for sites interested in doing that: thread-reliant
applications will no longer be explicitly linked against a non-native
threading package (linuxthreads), which is the status quo for large
threaded applications on FreeBSD right now.  So it seems to me that a
relatively straight-forward strategy gets things moving:

- Get review, testing, and commit this work in short order, and get the
  native threaded support in use.  This will improve support for
  applications like Apache2, MySQL, Open Office, Mozzila, etc, with an
  immediate impact on performance, interactiveity, and throughput for
  these systems, especially for disk I/O intensive activities.  Getting it
  in faster will dramatically increase the chances of fully productionable
  native threads for FreeBSD 5.1.

- We'll also be able to get services like threaded debugging, etc, up more
  easily with this model in the short term, as well as learn a lot more
  about their interactions with threads and what the desired semantics
  are.  This work should have a pay-off for M:N threads easily as well.

- Allow the libkse work to continue over the longer term, and make it
  easier to "plug and play" threading since large threaded apps can use
  either library trivially through library renaming.  The exposed API is
  presumably POSIX, and the ABI to the application should be identical. 
  Any test suites working at the pthreads layer should also immediately
  carry over.  And we've gained some expertise.  :-) 

I think one important thing this will address, and Terry has alluded to
it, is the perception that higher performance threading support is
stalled, and therefore standing in the way of other work.  We have
consumers today who desperately need improved threading support: and they
will benefit a lot from 1:1 in the short term.  They may well benefit more
from M:N in the long term, but I agree that I've had similar concerns
about the scope of the userspace work remaining to be done, especially
from my conversations with Jon Mini.  We may have underestimated this task
substantially; while it could be it falls out naturally, Terry's notion of
"an insurance policy" is far from a bad one.  And since this doesn't
impede KSE (and builds so nicely off of the substantial KSE investment),
the trade-off seems good.

Thanks (to you, but also to Julian, David, and everyone else who has
invested so much into KSE!)

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Network Associates Laboratories


From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 08:52:33 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 2FE5A37B404
	for <arch@freebsd.org>; Wed, 26 Mar 2003 08:52:33 -0800 (PST)
Received: from smtp-relay.omnis.com (smtp-relay.omnis.com [216.239.128.27])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 7936743FBF
	for <arch@freebsd.org>; Wed, 26 Mar 2003 08:52:32 -0800 (PST)
	(envelope-from wes@softweyr.com)
Received: from softweyr.homeunix.net (66-91-236-204.san.rr.com
	[66.91.236.204])	by smtp-relay.omnis.com (Postfix) with ESMTP
	id E824F431BF; Wed, 26 Mar 2003 08:52:29 -0800 (PST)
From: Wes Peters <wes@softweyr.com>
Organization: Softweyr
To: Poul-Henning Kamp <phk@phk.freebsd.dk>, arch@freebsd.org
Date: Wed, 26 Mar 2003 08:52:25 -0800
User-Agent: KMail/1.5
References: <5238.1048510775@critter.freebsd.dk>
In-Reply-To: <5238.1048510775@critter.freebsd.dk>
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200303260852.25978.wes@softweyr.com>
X-Spam-Status: No, hits=-26.1 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REFERENCES,REPLY_WITH_QUOTES,USER_AGENT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Subject: Re: moving GEOM around...
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 16:52:34 -0000

On Monday 24 March 2003 04:59, Poul-Henning Kamp wrote:
> A number of people have suggested that the directory layout of GEOM
> sources should be changed.  The main complaint seems to be that
> sys/geom contains both subdirectories (bde) and source files.
>
> I personally don't particularly care about that, and as a matter
> of fact wasn't even aware that was a rule, but if a significant
> number of people think this is wrong I'm willing to repo-copy things
> around and fix it, therefore this strawpoll:
>
> Option 1:  No change

I'll take door number 1, Mr. Kamp.

-- 

        Where am I, and what am I doing in this handbasket?

Wes Peters                                               wes@softweyr.com

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 08:58:40 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 3571437B404; Wed, 26 Mar 2003 08:58:40 -0800 (PST)
Received: from smtp-relay.omnis.com (smtp-relay.omnis.com [216.239.128.27])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 8179C43F93; Wed, 26 Mar 2003 08:58:39 -0800 (PST)
	(envelope-from wes@softweyr.com)
Received: from softweyr.homeunix.net (66-91-236-204.san.rr.com
	[66.91.236.204])	by smtp-relay.omnis.com (Postfix) with ESMTP
	id 7ADF6436D4; Wed, 26 Mar 2003 08:58:38 -0800 (PST)
From: Wes Peters <wes@softweyr.com>
Organization: Softweyr
To: John Baldwin <jhb@FreeBSD.org>
Date: Wed, 26 Mar 2003 08:58:37 -0800
User-Agent: KMail/1.5
References: <XFMail.20030325113450.jhb@FreeBSD.org>
In-Reply-To: <XFMail.20030325113450.jhb@FreeBSD.org>
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200303260858.37039.wes@softweyr.com>
X-Spam-Status: No, hits=-26.1 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REFERENCES,REPLY_WITH_QUOTES,USER_AGENT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: Poul-Henning Kamp <phk@phk.freebsd.dk>
cc: freebsd-arch@freebsd.org
Subject: Re: Patch to protect process from pageout killing
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 16:58:42 -0000

On Tuesday 25 March 2003 08:34, John Baldwin wrote:
> On 25-Mar-2003 Wes Peters wrote:
> > On Monday 24 March 2003 08:36, Poul-Henning Kamp wrote:
> >> Also, doesn't this result in the flag being inerited with fork() and
> >> thereby negating the effect you are seeking for squid ?
> >
> > I looked through all the places in kern_fork.c where p2->p_flag gets
> > set and didn't see anything that looked like it would inherit
> > P_PROTECTED from p1->p_flag.  Did I miss something?  I'm obviously a
> > bit of a neophyte in this part of the kernel.
>
> rlimit's are inherited.  However, due to a "feature" bug in your patch,
> the P_PROTECTED flag doesn't get turned on when the rlimit is inherited
> in fork1().

feature bug?  If you mean the fact that the setting for P_PROTECTED isn't 
stored in the rlimit, that was intentional.  rlimits are inherited and I 
specifically didn't want that behavior, similar to p_cpulimit.  I still 
agree resource limits are not an ideal interface to use for this, I'll 
look further.

-- 

        Where am I, and what am I doing in this handbasket?

Wes Peters                                               wes@softweyr.com

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 09:13:32 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 5632B37B408
	for <freebsd-arch@freebsd.org>; Wed, 26 Mar 2003 09:13:32 -0800 (PST)
Received: from mail.speakeasy.net (mail14.speakeasy.net [216.254.0.214])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 6870643F85
	for <freebsd-arch@freebsd.org>; Wed, 26 Mar 2003 09:13:31 -0800 (PST)
	(envelope-from jhb@FreeBSD.org)
Received: (qmail 24290 invoked from network); 26 Mar 2003 17:13:35 -0000
Received: from unknown (HELO server.baldwin.cx) ([216.27.160.63])
	(envelope-sender <jhb@FreeBSD.org>)encrypted SMTP
	for <freebsd-arch@freebsd.org>; 26 Mar 2003 17:13:35 -0000
Received: from laptop.baldwin.cx (gw1.twc.weather.com [216.133.140.1])
	by server.baldwin.cx (8.12.8/8.12.8) with ESMTP id h2QHDPOv099395;
	Wed, 26 Mar 2003 12:13:26 -0500 (EST)
	(envelope-from jhb@FreeBSD.org)
Message-ID: <XFMail.20030326121325.jhb@FreeBSD.org>
X-Mailer: XFMail 1.5.4 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <200303260858.37039.wes@softweyr.com>
Date: Wed, 26 Mar 2003 12:13:25 -0500 (EST)
From: John Baldwin <jhb@FreeBSD.org>
To: Wes Peters <wes@softweyr.com>
X-Spam-Status: No, hits=-19.5 required=5.0
	tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REPLY_WITH_QUOTES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: Poul-Henning Kamp <phk@phk.freebsd.dk>
cc: freebsd-arch@freebsd.org
Subject: Re: Patch to protect process from pageout killing
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 17:13:33 -0000


On 26-Mar-2003 Wes Peters wrote:
> On Tuesday 25 March 2003 08:34, John Baldwin wrote:
>> On 25-Mar-2003 Wes Peters wrote:
>> > On Monday 24 March 2003 08:36, Poul-Henning Kamp wrote:
>> >> Also, doesn't this result in the flag being inerited with fork() and
>> >> thereby negating the effect you are seeking for squid ?
>> >
>> > I looked through all the places in kern_fork.c where p2->p_flag gets
>> > set and didn't see anything that looked like it would inherit
>> > P_PROTECTED from p1->p_flag.  Did I miss something?  I'm obviously a
>> > bit of a neophyte in this part of the kernel.
>>
>> rlimit's are inherited.  However, due to a "feature" bug in your patch,
>> the P_PROTECTED flag doesn't get turned on when the rlimit is inherited
>> in fork1().
> 
> feature bug?  If you mean the fact that the setting for P_PROTECTED isn't 
> stored in the rlimit, that was intentional.  rlimits are inherited and I 
> specifically didn't want that behavior, similar to p_cpulimit.  I still 
> agree resource limits are not an ideal interface to use for this, I'll 
> look further.

I mean that you should be setting P_PROTECTED in fork() based on the
inherited rlimit's since otherwise the value of the rlimit is out of
sync with the P_PROTECTED flag.  Hence a bug.  However, since non-
inheritance is the desired behavior, it is also a feature, hence
"feature" bug.

-- 

John Baldwin <jhb@FreeBSD.org>  <><  http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 09:20:12 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 35C0937B404; Wed, 26 Mar 2003 09:20:12 -0800 (PST)
Received: from magic.adaptec.com (magic-mail.adaptec.com [208.236.45.100])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 3210C43F93; Wed, 26 Mar 2003 09:20:11 -0800 (PST)
	(envelope-from scott_long@btc.adaptec.com)
Received: from redfish.adaptec.com (redfish.adaptec.com [162.62.50.11])
	by magic.adaptec.com (8.11.6/8.11.6) with ESMTP id h2QHJIl02754;
	Wed, 26 Mar 2003 09:19:18 -0800
Received: from btc.btc.adaptec.com (btc.btc.adaptec.com [10.100.0.52])
	by redfish.adaptec.com (8.8.8+Sun/8.8.8) with ESMTP id JAA00985;
	Wed, 26 Mar 2003 09:20:04 -0800 (PST)
Received: from btc.adaptec.com (hollin [10.100.253.56])
	by btc.btc.adaptec.com (8.8.8+Sun/8.8.8) with ESMTP id KAA08644;
	Wed, 26 Mar 2003 10:20:00 -0700 (MST)
Message-ID: <3E81E142.3040907@btc.adaptec.com>
Date: Wed, 26 Mar 2003 10:20:02 -0700
From: Scott Long <scott_long@btc.adaptec.com>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.2.1) Gecko/20030206
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Robert Watson <rwaston@freebsd.org>
References: <Pine.NEB.3.96L.1030326091839.3931I-100000@fledge.watson.org>
In-Reply-To: <Pine.NEB.3.96L.1030326091839.3931I-100000@fledge.watson.org>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, hits=-31.9 required=5.0
	tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES,
	      REPLY_WITH_QUOTES,USER_AGENT_MOZILLA_UA
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
Subject: Re: 1:1 Threading implementation.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 17:20:13 -0000

Robert Watson wrote:
> On Tue, 25 Mar 2003, Jeff Roberson wrote:
> 
> 
>>Thanks to the foundation provided by Julian, David Xu, Mini, Dan
>>Eischen, and everyone else who has participated with KSE and libpthread
>>development Mini and I have developed a 1:1 threading implementation. 
>>This code works in parallel with KSE and does not break it in any way. 
>>It actually helps bring M:N threading closer by testing out shared bits. 
> 
> 
> My feeling is that this is an excellent strategy to get us productionable
> kernel-supported threads for the upcoming 5.x release while permitting
> continued R&D (and I think it is R&D) into the M:N threading
> possibilities.  One nice thing about this construction is that the cost
> was very low given the existing investment in KSE, yet the payoff is very
> high.  And it will provide a nice migration path when KSE is
> productionable for sites interested in doing that: thread-reliant
> applications will no longer be explicitly linked against a non-native
> threading package (linuxthreads), which is the status quo for large
> threaded applications on FreeBSD right now.  So it seems to me that a
> relatively straight-forward strategy gets things moving:
>  [...]

I'd like to add a big 'Me too' here also.  1:1 gives us an excellent
milestone towards having KSE work for 5-STABLE.  The June 30 deadline
for KSE has been quickly approaching, and this work achieves all of
the minimum objectives that we were aiming for by that date.  I see
this as a win-win for everyone; application threading is vastly
improved, the existing KSE work gets real-world
testing/exposure/validation, and the M:N work can now procede without
any of the pressure and stress that it had before.

In the spirit that FreeBSD is as much for research as it is for
production, it's important to remember that M:N should be kept around
and a research project through the RELENG_5 branch and into 6-CURRENT.
Once it is stable and proven, we can look at backporting it into
5-STABLE.

Overall, I'm incredibly pleased by this work!  This is a major milestone
for 5-STABLE, and one that will make it a worthwhile branch.

Scott


From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 10:52:25 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 3AC9737B404
	for <arch@freebsd.org>; Wed, 26 Mar 2003 10:52:25 -0800 (PST)
Received: from net1.gendyn.com (gate1.gendyn.com [204.60.171.22])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 784D943F93
	for <arch@freebsd.org>; Wed, 26 Mar 2003 10:52:24 -0800 (PST)
	(envelope-from eischen@vigrid.com)
Received: from [153.11.11.3] (helo=ebnext01)
	by net1.gendyn.com with esmtp (Exim 2.12 #1)
	id 18yG0T-0000yx-00
	for arch@FreeBSD.org; Wed, 26 Mar 2003 13:52:13 -0500
Received: from clcrtr.gdeb.com ([153.11.109.11])
	by ebnext01  with SMTP id h2QIqAt8022990;
	Wed, 26 Mar 2003 13:52:10 -0500
Received: from vigrid.com (gpz.clc.gdeb.com [192.168.3.12])
	by clcrtr.gdeb.com (8.11.4/8.11.4) with ESMTP id h2O32Bq03378;
	Sun, 23 Mar 2003 22:02:22 -0500 (EST)
	(envelope-from eischen@vigrid.com)
Sender: eghk@clcrtr.gdeb.com
Message-ID: <3E81F6BB.BFFE3F33@vigrid.com>
Date: Wed, 26 Mar 2003 13:51:39 -0500
From: Daniel Eischen <eischen@vigrid.com>
X-Mailer: Mozilla 4.78 [en] (X11; U; SunOS 5.9 sun4u)
X-Accept-Language: en
MIME-Version: 1.0
To: arch@FreeBSD.org
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, hits=0.0 required=5.0
	tests=none
	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: kse@elischer.org
Subject: Not providing static libraries (libkse/libpthread)
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 18:52:27 -0000

Is there a good reason for providing static libraries for
libpthread/libkse?  I'd like to not support them to get
rid of some hacks to make sure certain symbols are present
in the static library case.

-- 
Dan Eischen

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 11:06:22 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 8D25F37B404
	for <arch@FreeBSD.org>; Wed, 26 Mar 2003 11:06:22 -0800 (PST)
Received: from gw.nectar.cc (gw.nectar.cc [208.42.49.153])
	by mx1.FreeBSD.org (Postfix) with ESMTP id F19F643F3F
	for <arch@FreeBSD.org>; Wed, 26 Mar 2003 11:06:21 -0800 (PST)
	(envelope-from nectar@celabo.org)
Received: from madman.celabo.org (madman.celabo.org [10.0.1.111])
	by gw.nectar.cc (Postfix) with ESMTP
	id 7214451; Wed, 26 Mar 2003 13:06:21 -0600 (CST)
Received: by madman.celabo.org (Postfix, from userid 1001)
	id 5A75278C43; Wed, 26 Mar 2003 13:06:21 -0600 (CST)
Date: Wed, 26 Mar 2003 13:06:21 -0600
From: "Jacques A. Vidrine" <nectar@FreeBSD.org>
To: Daniel Eischen <eischen@vigrid.com>
Message-ID: <20030326190621.GB34946@madman.celabo.org>
References: <3E81F6BB.BFFE3F33@vigrid.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <3E81F6BB.BFFE3F33@vigrid.com>
X-Url: http://www.celabo.org/
User-Agent: Mutt/1.5.3i-ja.1
X-Spam-Status: No, hits=-30.5 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,REFERENCES,
	      REPLY_WITH_QUOTES,USER_AGENT_MUTT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@FreeBSD.org
cc: kse@elischer.org
Subject: Re: Not providing static libraries (libkse/libpthread)
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 19:06:23 -0000

On Wed, Mar 26, 2003 at 01:51:39PM -0500, Daniel Eischen wrote:
> Is there a good reason for providing static libraries for
> libpthread/libkse?  I'd like to not support them to get
> rid of some hacks to make sure certain symbols are present
> in the static library case.

That would make static linking threaded applications impossible, no?

While I wouldn't mind seeing the whole system move to being
dynamically linked, I sure don't feel well about deprecating static
linking completely.  (No threads for static binaries is very close to
`deprecating completely' to me.)

Cheers,
-- 
Jacques A. Vidrine <nectar@celabo.org>          http://www.celabo.org/
NTT/Verio SME          .     FreeBSD UNIX     .       Heimdal Kerberos
jvidrine@verio.net     .  nectar@FreeBSD.org  .          nectar@kth.se

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 11:11:01 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id EBBC137B404; Wed, 26 Mar 2003 11:11:00 -0800 (PST)
Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id F123F43FBD; Wed, 26 Mar 2003 11:10:59 -0800 (PST)
	(envelope-from eischen@pcnet1.pcnet.com)
Received: from pcnet1.pcnet.com (localhost [127.0.0.1])
	by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h2QJAqBg005457;
	Wed, 26 Mar 2003 14:10:52 -0500 (EST)
Received: from localhost (eischen@localhost)h2QJAqmK005454;
	Wed, 26 Mar 2003 14:10:52 -0500 (EST)
Date: Wed, 26 Mar 2003 14:10:52 -0500 (EST)
From: Daniel Eischen <eischen@pcnet1.pcnet.com>
To: "Jacques A. Vidrine" <nectar@FreeBSD.org>
In-Reply-To: <20030326190621.GB34946@madman.celabo.org>
Message-ID: <Pine.GSO.4.10.10303261408520.5144-100000@pcnet1.pcnet.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-25.3 required=5.0
	tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REPLY_WITH_QUOTES,USER_AGENT_PINE
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@FreeBSD.org
cc: kse@elischer.org
Subject: Re: Not providing static libraries (libkse/libpthread)
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 19:11:32 -0000

On Wed, 26 Mar 2003, Jacques A. Vidrine wrote:
> On Wed, Mar 26, 2003 at 01:51:39PM -0500, Daniel Eischen wrote:
> > Is there a good reason for providing static libraries for
> > libpthread/libkse?  I'd like to not support them to get
> > rid of some hacks to make sure certain symbols are present
> > in the static library case.
> 
> That would make static linking threaded applications impossible, no?

Correct.  Solaris does not provide static libthread/libpthread.

> While I wouldn't mind seeing the whole system move to being
> dynamically linked, I sure don't feel well about deprecating static
> linking completely.  (No threads for static binaries is very close to
> `deprecating completely' to me.)

Yup.  That's what I'm advocating.

-- 
Dan Eischen

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 11:36:00 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 7C65237B404
	for <arch@freebsd.org>; Wed, 26 Mar 2003 11:36:00 -0800 (PST)
Received: from ns1.xcllnt.net (209-128-86-226.bayarea.net [209.128.86.226])
	by mx1.FreeBSD.org (Postfix) with ESMTP id E269243F85
	for <arch@freebsd.org>; Wed, 26 Mar 2003 11:35:58 -0800 (PST)
	(envelope-from marcel@xcllnt.net)
Received: from dhcp01.pn.xcllnt.net (dhcp01.pn.xcllnt.net [192.168.4.201])
	by ns1.xcllnt.net (8.12.8/8.12.8) with ESMTP id h2QJZOKu025262;
	Wed, 26 Mar 2003 11:35:24 -0800 (PST)
	(envelope-from marcel@piii.pn.xcllnt.net)
Received: from dhcp01.pn.xcllnt.net (localhost [127.0.0.1])
	by dhcp01.pn.xcllnt.net (8.12.8/8.12.8) with ESMTP id h2QJZOBm011460;
	Wed, 26 Mar 2003 11:35:24 -0800 (PST)
	(envelope-from marcel@dhcp01.pn.xcllnt.net)
Received: (from marcel@localhost)
	by dhcp01.pn.xcllnt.net (8.12.8/8.12.8/Submit) id h2QJZO77011459;
	Wed, 26 Mar 2003 11:35:24 -0800 (PST)
Date: Wed, 26 Mar 2003 11:35:24 -0800
From: Marcel Moolenaar <marcel@xcllnt.net>
To: Daniel Eischen <eischen@vigrid.com>
Message-ID: <20030326193524.GA11320@dhcp01.pn.xcllnt.net>
References: <3E81F6BB.BFFE3F33@vigrid.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <3E81F6BB.BFFE3F33@vigrid.com>
User-Agent: Mutt/1.5.3i
X-Spam-Status: No, hits=-30.9 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,REFERENCES,
	      REPLY_WITH_QUOTES,USER_AGENT_MUTT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: kse@elischer.org
Subject: Re: Not providing static libraries (libkse/libpthread)
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 19:36:03 -0000

On Wed, Mar 26, 2003 at 01:51:39PM -0500, Daniel Eischen wrote:
> Is there a good reason for providing static libraries for
> libpthread/libkse?  I'd like to not support them to get
> rid of some hacks to make sure certain symbols are present
> in the static library case.

I the maintenance cost is low and the hacks are not in the way
of progress I think we should keep the static libraries. I think
we're throwing something away too carelessly otherwise.

For example, the access sequences generated by compilers for
variables that have the __thread attribute do really suck for
when code is to be generated for dynamic linking. The access
sequences in the static case are superior. The performance
gain is significant if one can build a complete multi-threaded
application.

-- 
 Marcel Moolenaar	  USPA: A-39004		 marcel@xcllnt.net

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 11:42:29 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 6679E37B404
	for <arch@freebsd.org>; Wed, 26 Mar 2003 11:42:29 -0800 (PST)
Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 8D05843F3F
	for <arch@freebsd.org>; Wed, 26 Mar 2003 11:42:28 -0800 (PST)
	(envelope-from eischen@pcnet1.pcnet.com)
Received: from pcnet1.pcnet.com (localhost [127.0.0.1])
	by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h2QJgNBg009862;
	Wed, 26 Mar 2003 14:42:23 -0500 (EST)
Received: from localhost (eischen@localhost)h2QJgM6o009859;
	Wed, 26 Mar 2003 14:42:22 -0500 (EST)
Date: Wed, 26 Mar 2003 14:42:22 -0500 (EST)
From: Daniel Eischen <eischen@pcnet1.pcnet.com>
To: Marcel Moolenaar <marcel@xcllnt.net>
In-Reply-To: <20030326193524.GA11320@dhcp01.pn.xcllnt.net>
Message-ID: <Pine.GSO.4.10.10303261441100.9412-100000@pcnet1.pcnet.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-25.3 required=5.0
	tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REPLY_WITH_QUOTES,USER_AGENT_PINE
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: kse@elischer.org
Subject: Re: Not providing static libraries (libkse/libpthread)
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 19:42:31 -0000

On Wed, 26 Mar 2003, Marcel Moolenaar wrote:

> On Wed, Mar 26, 2003 at 01:51:39PM -0500, Daniel Eischen wrote:
> > Is there a good reason for providing static libraries for
> > libpthread/libkse?  I'd like to not support them to get
> > rid of some hacks to make sure certain symbols are present
> > in the static library case.
> 
> I the maintenance cost is low and the hacks are not in the way
> of progress I think we should keep the static libraries. I think
> we're throwing something away too carelessly otherwise.
> 
> For example, the access sequences generated by compilers for
> variables that have the __thread attribute do really suck for
> when code is to be generated for dynamic linking. The access
> sequences in the static case are superior. The performance
> gain is significant if one can build a complete multi-threaded
> application.

Solaris and IRIX don't seem to provide static thread
libraries.  Does anyone know if Linux does?

-- 
Dan Eischen

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 11:48:39 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 1ACF337B404
	for <arch@FreeBSD.org>; Wed, 26 Mar 2003 11:48:39 -0800 (PST)
Received: from fledge.watson.org (fledge.watson.org [204.156.12.50])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 45A5A43FA3
	for <arch@FreeBSD.org>; Wed, 26 Mar 2003 11:48:36 -0800 (PST)
	(envelope-from robert@fledge.watson.org)
Received: from fledge.watson.org (fledge.pr.watson.org [192.0.2.3])
	by fledge.watson.org (8.12.8/8.12.8) with SMTP id h2QJmWjK024258
	for <arch@FreeBSD.org>; Wed, 26 Mar 2003 14:48:32 -0500 (EST)
	(envelope-from robert@fledge.watson.org)
Date: Wed, 26 Mar 2003 14:48:31 -0500 (EST)
From: Robert Watson <rwatson@FreeBSD.org>
X-Sender: robert@fledge.watson.org
To: arch@FreeBSD.org
Message-ID: <Pine.NEB.3.96L.1030326144501.18064X-100000@fledge.watson.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-17.0 required=5.0
	tests=AWL,PATCH_UNIFIED_DIFF,USER_AGENT_PINE
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Subject: M_NOWAIT failure handling -- not so very rosy picture
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 19:48:40 -0000


I'm running a diskless system with the attached patch; the results have
not been so very pleasing.  I'm collecting a set of panics and traces to
mail out to relevant developers, but it does give one pause.  A related
patch for the mbuf allocator would probably also give some interesting
results.  The patch is far from perfect, but has been enough to result in
some interesting scenarios.

# sysctl debug.malloc_failure_rate=10		# Fail one in ten

Some things to try that I've bumped into so far:

# sysctl -a > /dev/null

# mdconfig -a -s 5m -t malloc
# dd if=/dev/zero of=/dev/md0
# newfs /dev/md0

Both of these seem to be storage-related and I've e-mailed phk about them,
but I suspect there are a lot of others hanging around. 

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Network Associates Laboratories

Index: kern_malloc.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_malloc.c,v
retrieving revision 1.119
diff -u -r1.119 kern_malloc.c
--- kern_malloc.c	10 Mar 2003 20:24:54 -0000	1.119
+++ kern_malloc.c	26 Mar 2003 19:27:28 -0000
@@ -138,6 +138,20 @@
 /* time_uptime of last malloc(9) failure */
 static time_t t_malloc_fail;
 
+#ifdef MALLOC_MAKE_FAILURES
+/*
+ * Cause malloc failures ever (n) mallocs with M_NOWAIT.  If set to 0,
+ * don't cause failures.
+ */
+static int malloc_failure_rate;
+static int malloc_nowait_count;
+static int malloc_failure_count;
+SYSCTL_INT(_debug, OID_AUTO, malloc_failure_rate, CTLFLAG_RW,
+    &malloc_failure_rate, 0, "Every (n) mallocs with M_NOWAIT will fail");
+SYSCTL_INT(_debug, OID_AUTO, malloc_failure_count, CTLFLAG_RD,
+    &malloc_failure_count, 0, "Number of imposed malloc failures");
+#endif
+
 int
 malloc_last_fail(void)
 {
@@ -187,6 +201,15 @@
 #if 0
 	if (size == 0)
 		Debugger("zero size malloc");
+#endif
+#ifdef MALLOC_MAKE_FAILURES
+	if ((flags & M_NOWAIT) && (malloc_failure_rate != 0)) {
+		atomic_add_int(&malloc_nowait_count, 1);
+		if ((malloc_nowait_count % malloc_failure_rate) == 0) {
+			atomic_add_int(&malloc_failure_count, 1);
+			return (NULL);
+		}
+	}
 #endif
 	if (flags & M_WAITOK)
 		KASSERT(curthread->td_intr_nesting_level == 0,

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 11:51:11 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 69BE937B404
	for <arch@freebsd.org>; Wed, 26 Mar 2003 11:51:11 -0800 (PST)
Received: from dan.emsphone.com (dan.emsphone.com [199.67.51.101])
	by mx1.FreeBSD.org (Postfix) with ESMTP id B229C43F3F
	for <arch@freebsd.org>; Wed, 26 Mar 2003 11:51:10 -0800 (PST)
	(envelope-from dan@dan.emsphone.com)
Received: (from dan@localhost)
	by dan.emsphone.com (8.12.7/8.12.7) id h2QJp7ko021588;
	Wed, 26 Mar 2003 13:51:07 -0600 (CST)
	(envelope-from dan)
Date: Wed, 26 Mar 2003 13:51:07 -0600
From: Dan Nelson <dnelson@allantgroup.com>
To: Daniel Eischen <eischen@pcnet1.pcnet.com>
Message-ID: <20030326195107.GB31787@dan.emsphone.com>
References: <20030326193524.GA11320@dhcp01.pn.xcllnt.net>
	<Pine.GSO.4.10.10303261441100.9412-100000@pcnet1.pcnet.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <Pine.GSO.4.10.10303261441100.9412-100000@pcnet1.pcnet.com>
X-OS: FreeBSD 5.0-CURRENT
X-message-flag: Outlook Error
User-Agent: Mutt/1.5.4i
X-Spam-Status: No, hits=-26.8 required=5.0
	tests=AWL,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES,
	      REPLY_WITH_QUOTES,USER_AGENT_MUTT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: kse@elischer.org
cc: Marcel Moolenaar <marcel@xcllnt.net>
Subject: Re: Not providing static libraries (libkse/libpthread)
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 19:51:12 -0000

In the last episode (Mar 26), Daniel Eischen said:
> On Wed, 26 Mar 2003, Marcel Moolenaar wrote:
> > For example, the access sequences generated by compilers for
> > variables that have the __thread attribute do really suck for when
> > code is to be generated for dynamic linking. The access sequences
> > in the static case are superior. The performance gain is
> > significant if one can build a complete multi-threaded application.
> 
> Solaris and IRIX don't seem to provide static thread libraries.  Does
> anyone know if Linux does?

Debian provides static versions:
-rw-r--r--    1 root     root   81959 Feb 25 07:46 /lib/libpthread-0.10.so
-rw-r--r--    1 root     root   97286 Feb 25 07:47 /usr/lib/libpthread.a

As does Redhat 7.3:
-rwxr-xr-x    1 root     root       105945 Oct 10 09:51 /lib/libpthread-0.9.so*
-rw-r--r--    1 root     root       118146 Oct 10 09:51 /usr/lib/libpthread.a

-- 
	Dan Nelson
	dnelson@allantgroup.com

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 11:51:25 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id E115237B409
	for <arch@freebsd.org>; Wed, 26 Mar 2003 11:51:25 -0800 (PST)
Received: from ns1.xcllnt.net (209-128-86-226.bayarea.net [209.128.86.226])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 924C943FCB
	for <arch@freebsd.org>; Wed, 26 Mar 2003 11:51:23 -0800 (PST)
	(envelope-from marcel@xcllnt.net)
Received: from dhcp01.pn.xcllnt.net (dhcp01.pn.xcllnt.net [192.168.4.201])
	by ns1.xcllnt.net (8.12.8/8.12.8) with ESMTP id h2QJopKu025379;
	Wed, 26 Mar 2003 11:50:51 -0800 (PST)
	(envelope-from marcel@piii.pn.xcllnt.net)
Received: from dhcp01.pn.xcllnt.net (localhost [127.0.0.1])
	by dhcp01.pn.xcllnt.net (8.12.8/8.12.8) with ESMTP id h2QJopBm011499;
	Wed, 26 Mar 2003 11:50:51 -0800 (PST)
	(envelope-from marcel@dhcp01.pn.xcllnt.net)
Received: (from marcel@localhost)
	by dhcp01.pn.xcllnt.net (8.12.8/8.12.8/Submit) id h2QJopmH011498;
	Wed, 26 Mar 2003 11:50:51 -0800 (PST)
Date: Wed, 26 Mar 2003 11:50:51 -0800
From: Marcel Moolenaar <marcel@xcllnt.net>
To: Daniel Eischen <eischen@pcnet1.pcnet.com>
Message-ID: <20030326195051.GB11320@dhcp01.pn.xcllnt.net>
References: <20030326193524.GA11320@dhcp01.pn.xcllnt.net>
	<Pine.GSO.4.10.10303261441100.9412-100000@pcnet1.pcnet.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <Pine.GSO.4.10.10303261441100.9412-100000@pcnet1.pcnet.com>
User-Agent: Mutt/1.5.3i
X-Spam-Status: No, hits=-30.6 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,REFERENCES,
	      REPLY_WITH_QUOTES,USER_AGENT_MUTT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: kse@elischer.org
Subject: Re: Not providing static libraries (libkse/libpthread)
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 19:51:28 -0000

On Wed, Mar 26, 2003 at 02:42:22PM -0500, Daniel Eischen wrote:
> 
> Solaris and IRIX don't seem to provide static thread
> libraries.  Does anyone know if Linux does?

That's because they have abandoned static libraries completely,
if I'm not mistaken. Since we still link against archive libraries
is general, our decision to drop an archive threads library can
not really be based on that example alone.

-- 
 Marcel Moolenaar	  USPA: A-39004		 marcel@xcllnt.net

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 12:06:34 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 01FD337B411
	for <arch@freebsd.org>; Wed, 26 Mar 2003 12:06:34 -0800 (PST)
Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4])
	by mx1.FreeBSD.org (Postfix) with ESMTP id F1B6143F75
	for <arch@freebsd.org>; Wed, 26 Mar 2003 12:05:18 -0800 (PST)
	(envelope-from eischen@pcnet1.pcnet.com)
Received: from pcnet1.pcnet.com (localhost [127.0.0.1])
	by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h2QK5CBg013418;
	Wed, 26 Mar 2003 15:05:12 -0500 (EST)
Received: from localhost (eischen@localhost)h2QK5BMJ013415;
	Wed, 26 Mar 2003 15:05:11 -0500 (EST)
Date: Wed, 26 Mar 2003 15:05:11 -0500 (EST)
From: Daniel Eischen <eischen@pcnet1.pcnet.com>
To: Marcel Moolenaar <marcel@xcllnt.net>
In-Reply-To: <20030326195051.GB11320@dhcp01.pn.xcllnt.net>
Message-ID: <Pine.GSO.4.10.10303261457140.12205-100000@pcnet1.pcnet.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-25.3 required=5.0
	tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REPLY_WITH_QUOTES,USER_AGENT_PINE
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: kse@elischer.org
Subject: Re: Not providing static libraries (libkse/libpthread)
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 20:06:37 -0000

On Wed, 26 Mar 2003, Marcel Moolenaar wrote:

> On Wed, Mar 26, 2003 at 02:42:22PM -0500, Daniel Eischen wrote:
> > 
> > Solaris and IRIX don't seem to provide static thread
> > libraries.  Does anyone know if Linux does?
> 
> That's because they have abandoned static libraries completely,
> if I'm not mistaken. Since we still link against archive libraries
> is general, our decision to drop an archive threads library can
> not really be based on that example alone.

I don't think that's the case with Solaris.  As of Solaris 9,
there are 40-50 static libraries in /usr/lib:

  gpz [65] $ uname -a
  SunOS gpz 5.9 Generic sun4u sparc SUNW,Ultra-80
  gpz [64] $ ls /usr/lib/lib*.a
  /usr/lib/lib300.a        /usr/lib/libcurses.a     /usr/lib/libnls.a
  /usr/lib/lib300s.a       /usr/lib/libelf.a        /usr/lib/libnsl.a
  /usr/lib/lib4014.a       /usr/lib/libform.a       /usr/lib/libpanel.a
  /usr/lib/lib450.a        /usr/lib/libgen.a        /usr/lib/libpkg.a
  /usr/lib/libTL.a         /usr/lib/libgenIO.a      /usr/lib/libplot.a
  /usr/lib/libadm.a        /usr/lib/libintl.a       /usr/lib/librac.a
  /usr/lib/libadt_jni.a    /usr/lib/libl.a          /usr/lib/librpcsvc.a
  /usr/lib/libbsdmalloc.a  /usr/lib/libldfeature.a  /usr/lib/libsec.a
  /usr/lib/libbsm.a        /usr/lib/libm.a          /usr/lib/libsocket.a
  /usr/lib/libc.a          /usr/lib/libmail.a       /usr/lib/libtermcap.a
  /usr/lib/libc2.a         /usr/lib/libmalloc.a     /usr/lib/libtermlib.a
  /usr/lib/libc2stubs.a    /usr/lib/libmapmalloc.a  /usr/lib/libvolmgt.a
  /usr/lib/libcmd.a        /usr/lib/libmenu.a       /usr/lib/libvt0.a
  /usr/lib/libcrypt.a      /usr/lib/libmp.a         /usr/lib/libw.a
  /usr/lib/libcrypt_i.a    /usr/lib/libnisdb.a      /usr/lib/liby.a


  gpz [68] $ ls /usr/lib/lib*thread*
  /usr/lib/libpthread.so     /usr/lib/libthread.so     /usr/lib/libthread_db.so
  /usr/lib/libpthread.so.1   /usr/lib/libthread.so.1   /usr/lib/libthread_db.so.1

IRIX also doesn't seem to provide static thread libraries.

Just because Solaris and IRIX doesn't mean we shouldn't;
I'm just using those as examples.

-- 
Dan Eischen

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 12:30:13 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id CF1DC37B404
	for <arch@freebsd.org>; Wed, 26 Mar 2003 12:30:13 -0800 (PST)
Received: from harmony.village.org (rover.bsdimp.com [204.144.255.66])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 15C8543F85
	for <arch@freebsd.org>; Wed, 26 Mar 2003 12:30:13 -0800 (PST)
	(envelope-from imp@harmony.village.org)
Received: from harmony.village.org (localhost [127.0.0.1])
	by harmony.village.org (8.12.8/8.12.3) with ESMTP id h2QKU6A7089578;
	Wed, 26 Mar 2003 13:30:06 -0700 (MST)
	(envelope-from imp@harmony.village.org)
Message-Id: <200303262030.h2QKU6A7089578@harmony.village.org>
To: Daniel Eischen <eischen@vigrid.com>
In-reply-to: Your message of "Wed, 26 Mar 2003 13:51:39 EST."
		<3E81F6BB.BFFE3F33@vigrid.com> 
References: <3E81F6BB.BFFE3F33@vigrid.com>  
Date: Wed, 26 Mar 2003 13:30:06 -0700
From: Warner Losh <imp@harmony.village.org>
X-Spam-Status: No, hits=-9.9 required=5.0
	tests=IN_REP_TO,REFERENCES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: kse@elischer.org
Subject: Re: Not providing static libraries (libkse/libpthread) 
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 20:30:14 -0000

In message <3E81F6BB.BFFE3F33@vigrid.com> Daniel Eischen writes:
: Is there a good reason for providing static libraries for
: libpthread/libkse?  I'd like to not support them to get
: rid of some hacks to make sure certain symbols are present
: in the static library case.

That would be a big hassle for the company I work for.  We have many
static binaries that are threaded and providing a dynamic one has a
performance impact of a few percent.  While we have done dynamic
linking in the past, and have the infrastructure to do so in the
future in our build process, this may cause us problems in the future
if we need to deploy a static binary (which tends to be safer to do
once a long period of time has passed between the generation of the
system and the deployment of the updated binary).

How gross are the hacks?

Warner

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 12:30:57 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 658C637B404; Wed, 26 Mar 2003 12:30:57 -0800 (PST)
Received: from sccrmhc02.attbi.com (sccrmhc02.attbi.com [204.127.202.62])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 6B84543F93; Wed, 26 Mar 2003 12:30:56 -0800 (PST)
	(envelope-from julian@elischer.org)
Received: from interjet.elischer.org
	(12-232-168-4.client.attbi.com[12.232.168.4])
	by sccrmhc02.attbi.com (sccrmhc02) with ESMTP
	id <2003032620305400200jbp34e>; Wed, 26 Mar 2003 20:30:55 +0000
Received: from localhost (localhost.elischer.org [127.0.0.1])
	by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id MAA52593;
	Wed, 26 Mar 2003 12:30:54 -0800 (PST)
Date: Wed, 26 Mar 2003 12:30:52 -0800 (PST)
From: Julian Elischer <julian@elischer.org>
To: Jeff Roberson <jroberson@chesapeake.net>
In-Reply-To: <20030326031245.O64602-100000@mail.chesapeake.net>
Message-ID: <Pine.BSF.4.21.0303261154100.52134-100000@InterJet.elischer.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-24.9 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      QUOTE_TWICE_1,RCVD_IN_UNCONFIRMED_DSBL,REPLY_WITH_QUOTES,
	      USER_AGENT_PINE
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: kse@elischer.org
Subject: Re: 1:1 Threading implementation.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 20:30:59 -0000


On Wed, 26 Mar 2003, Jeff Roberson wrote:
> I don't understand?  There are relatively minor changes to the kernel to
> support this.  Since nice is a property of the process, it makes sense
> that there is only one ksegrp per process.  I'm starting to think that the
> ksegrp was overkill in general.
> 

Instead of making a new KSE for each thread (and thereby blowing out the
code that expects that NKSE <= NCPU per KSEGRP,
allocate each KSE in a new KSEGRP.  The overhead is not that much and
you will keep NKSE/KSEGRP <= NCPU. You will also be able to support 
system scope threads with differnet priorities, which is a 
Posix requirement. It is the equivalent of adding teh "NEWKSEGRP" 
flag  to each thread creation call, and should have no other
ramifications.  It will also mean that you can se this system if people
add a scheduler that doesn't have KSEs (as discussed previously)
(by durectly scheduling threads).

> > Specifically since My plan is to make the "KSE' structure go away..
> > (by which I mean it is only going to be visible within the particular
> > thread_scheduler that uses it and that externally
> > the only structures visible would be :
> > proc, ksegrp(subproc?) thread and upcall.
> 
> For M:N I really think this should be proc, thread, and upcall.
> For 1:1 I only need proc and thread.

For your definition, define "1 thread" as:
"A thread with an attached KSE and KSEGRP"
instead of:
"A thread with an attached KSE"

The logic will be very similar but you will get better functionality
by being able to give different threads different priorities etc.
The KSEGRP structure is small. You will not lose much by doing this..

This is how a 1:1 scheme was envisioned and the items in the different
substructures were distributed to work best in this way..

> 
> > The KSE would be allocated only by a call into the scheduler and is part
> > of the "scheduler specific private data".
> >
> > i.e. on creation of a new process, shced_newproc() is called
> > and a KSE is added in there is the scheduler in question wants to use
> > KSEs. If it doesn't, no KSE would be added, but it's still possible that
> 
> Yes, I think we need more sched hooks here as well.  Having only
> sched_fork() makes things sort of gross.  We'll have to hook this all up
> later.

I'll try get it hooked up "sooner rather than later".
I think you can make 1:1 threads in the current system by doing:

kse_create(mbox, NEWGROUP); where the mbox points to the function you
want to run and a new stack. The function just runs as normal, not
knowing that it is atually a UTS thread. Since it never yields to
another thread (by KSE terms) it never does any upcalls an voila.. 1:1
threads. (I am sugesting that we don't need a new syscall to do this,
or, at most a new entrypoint which ends up calling much of the same
code.)


Ok but htis breaks things for M:N threads as in M:N threads, teh mask
would be stored "per process" (or at most per group) and the mask is the
"logical OR" of all the masks for the threads in the group/process.
Having a mask per thread and not having one for the bigger unit
means that the masks for the threads must be updated regularly 
(maybe at every kernel entry) to be the OR of the masks for ALL THE USER
THREADS, which means that the UTS must do this explicitly.
I'm not thrilled by all the extra work this is going to make for M:N
threads. (Well at least this is my preliminary reading of it.)

> 
> > -       if (p1->p_flag & P_THREADED) {
> > +       if (p1->p_flag & P_THREADED || p1->p_numthreads > 1) {
> >
> > If you are running threads, please set the P_THREADED flag.
> > if you wnat do differentiate between upcalling threads and 1:1
> > threads, please use some auxhilliary flag.
> 
> I'd rather not have a flag.  The > 1 check is used only in places where we
> have to suspend multiple threads or go to single threading etc.  Processes
> in the 1:1 threading model aren't so special as they are with KSE.  They
> don't need to be treated specially except when we're trying to funnel them
> down etc.

Ok, well we'll see with time how it works out and if it is ok, that;s
fine.. If it needs work we can do it then.. this will do for now.

> 
> > You should be creating a new KSEGRP (subproc) per thread.
> > I think you will find that if you do, things will fall out easier
> > and you won't break the next KSE changes.
> 
> I don't understand what I may break?

You are allocating a thread and a KSE..  KSEs may go away (from being
visible to you). If you are referencing them then things will break.

> 
> I don't think M:N is the way to go.  After looking things over and
> considering where it is theoretically faster I do not think it is a
> worthwhile pursuit.

I agree with yo an may ways, and I think that given teh choice I'd run 
KSEs in "P:Q" mode.. (where we don't multiplex any sleeping threads
and have effectively one kernel thread per sleeping thread".

However M:N threads has one advantage. That is where peopel use the
programming model that makes a thread for every object in a program.

This scheme can lead to tens of thousands of small threads in userland.
effectively you do NOT want those to all be kernel threads. Tere are
languages and libraries that promote such programming models.
effectively each object in teh program is an independent intelligent
entity with its own stack and such.. I would like to be able to support
this.


> 
> First off, it is many months away from being even beta quality.  I think
> the UTS is far more complicated than you may realize.  There are all sorts
> of synchronization issues that it was able to avoid before since only one
> thread could run at any time and there essentially was no preemption.  It
> now also has to deal with effecient scheduling decisions in a M:N model
> that it didn't have to worry about before.

I'm not sure that teh issues there are as bad as you think.

> 
> Aside from that, there are numerous problems with the kernel not being
> able to identify individual threads of execution.  Debugging, scheduling,
> profiling, ktrace are all more difficult in a m:n environment.  I think it
> is going to contribute to less effecient scheduling decisions over all.  I
> have already wrestled with this in ULE.

You are right about some parts of this, but it IS possible to do these
things.

> 
> I feel that this is an overwhelming amount of complexity.  Because of this
> it will be buggy.  Sun claims that they still have open tickets on their
> M:N while their new 1:1 implementation is totally bug free.  How long have
> they been doing m:n?  I don't think that with our limited resources we're
> going to be able to do better.

I think that the complexity of the KSE M:N model is a lot less that what
sun did. 

> 
> Furthermore, m:n's basic advantage is less overhead from staying out of
> the kernel.  Also, less per thread resources.  I think this is bogus for a
> couple of reasons.


It is bogus for a particular class of threaded applications and 
true for a particular class of threaded apps.

> 
> First, if your application has more threads than cpus it is written
> incorrectly.

Not neccesarily. that's just one way of looking at threads. Active
component threaded programs use threads as a programming model
(see above) and it is a perfectly valid way of writing a program.
Remember.. "Ours is not to specify how a programmer writes, but to 
allow the programmer to have a s wide a choice as possible about what
he wants to do."

> For people who are doing thread pools instead of event
> driven IO models they will encounter the same overhead with M:N as 1:1.

True.  That is one model of threading..B
In an IO bound app all waiting threads will have kernel contexts so in
effect it approaches 1:1

> I'm not sure what applications are entirely compute and have more threads
> than cpus.  These are the only ones which really theoretically benefit.  I
> don't think our threading model should be designed to optimize poorly
> thought out applications.

As I said. there are people who like this method of programming.
I don't want to have to say "we only support model A of thread
programming, and if you want model B, it'll really suck."
(This is not saying we shouldn;t have a 1:1 library available..
it's a good idea).

 
> Furthermore, the amount of work done per slice has been growing with
> processor speeds.  Slice time is adjusted for user experience and so it
> remains constant.  This means that the constraints are different from when
> this architecture started to come about many (10 or so?) years ago.
> Trying to optimize context switches between threads just doesn't make
> sense when you do so much work per slice.

This is a good argumant, but it still doesn;t make sence to have 10,000
threads all in the kernel.

> 
> Then if you look at the number of system calls and shenanigans a UTS must
> do to make proper scheduling decisions it doesn't look like such an
> advantage.  I feel that the overhead of all the layers comes close to the
> savings from doing some of it without entering the kernel.

So far it's not doing that much..


From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 12:31:27 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id D3CD737B404
	for <arch@freebsd.org>; Wed, 26 Mar 2003 12:31:27 -0800 (PST)
Received: from ns1.xcllnt.net (209-128-86-226.bayarea.net [209.128.86.226])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 0E4C543F93
	for <arch@freebsd.org>; Wed, 26 Mar 2003 12:31:27 -0800 (PST)
	(envelope-from marcel@xcllnt.net)
Received: from dhcp01.pn.xcllnt.net (dhcp01.pn.xcllnt.net [192.168.4.201])
	by ns1.xcllnt.net (8.12.8/8.12.8) with ESMTP id h2QKUsKu025573;
	Wed, 26 Mar 2003 12:30:54 -0800 (PST)
	(envelope-from marcel@piii.pn.xcllnt.net)
Received: from dhcp01.pn.xcllnt.net (localhost [127.0.0.1])
	by dhcp01.pn.xcllnt.net (8.12.8/8.12.8) with ESMTP id h2QKUsBm011609;
	Wed, 26 Mar 2003 12:30:54 -0800 (PST)
	(envelope-from marcel@dhcp01.pn.xcllnt.net)
Received: (from marcel@localhost)
	by dhcp01.pn.xcllnt.net (8.12.8/8.12.8/Submit) id h2QKUsbL011608;
	Wed, 26 Mar 2003 12:30:54 -0800 (PST)
Date: Wed, 26 Mar 2003 12:30:54 -0800
From: Marcel Moolenaar <marcel@xcllnt.net>
To: Daniel Eischen <eischen@pcnet1.pcnet.com>
Message-ID: <20030326203054.GC11320@dhcp01.pn.xcllnt.net>
References: <20030326195051.GB11320@dhcp01.pn.xcllnt.net>
	<Pine.GSO.4.10.10303261457140.12205-100000@pcnet1.pcnet.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <Pine.GSO.4.10.10303261457140.12205-100000@pcnet1.pcnet.com>
User-Agent: Mutt/1.5.3i
X-Spam-Status: No, hits=-30.4 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,REFERENCES,
	      REPLY_WITH_QUOTES,USER_AGENT_MUTT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: kse@elischer.org
Subject: Re: Not providing static libraries (libkse/libpthread)
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 20:31:28 -0000

On Wed, Mar 26, 2003 at 03:05:11PM -0500, Daniel Eischen wrote:
> 
> Just because Solaris and IRIX doesn't mean we shouldn't;
> I'm just using those as examples.

My point really is that if you have good reasons (good reasons for
us) to drop the archive threads library then you should go for it.
Precedence is a good way to make your case, but what applies in
those cases may not apply to us, so what may have been good reasons
for them may not be good reasons for us. Thus, you have to know
(roughly) why they have dropped the archive library if you want to
use them as examples. Just stating that it isn't there may just as
well mean that it hasn't been installed (or bought), not that they
don't have it.

I know HP doesn't have it, but they dropped archive libraries
completely. And as far as I know they followed Sun's example
(as they so often do). Old archive libraries may still be
provided for backward compatibility though...

-- 
 Marcel Moolenaar	  USPA: A-39004		 marcel@xcllnt.net

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 12:55:42 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 98E4C37B404
	for <arch@freebsd.org>; Wed, 26 Mar 2003 12:55:42 -0800 (PST)
Received: from bluejay.mail.pas.earthlink.net (bluejay.mail.pas.earthlink.net
	[207.217.120.218])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 656E143F75
	for <arch@freebsd.org>; Wed, 26 Mar 2003 12:55:41 -0800 (PST)
	(envelope-from tlambert2@mindspring.com)
Received: from pool0166.cvx22-bradley.dialup.earthlink.net ([209.179.198.166]
	helo=mindspring.com)
	by bluejay.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128)
	(Exim 3.33 #1)	id 18yHvf-00054D-00; Wed, 26 Mar 2003 12:55:24 -0800
Message-ID: <3E821365.6B036B0D@mindspring.com>
Date: Wed, 26 Mar 2003 12:53:57 -0800
From: Terry Lambert <tlambert2@mindspring.com>
X-Mailer: Mozilla 4.79 [en] (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Jeff Roberson <jroberson@chesapeake.net>
References: <20030326053115.T64602-100000@mail.chesapeake.net>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a4cde4d4298c38105b3802a607afd9c067a2d4e88014a4647c350badd9bab72f9c350badd9bab72f9c
X-Spam-Status: No, hits=-21.4 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,QUOTED_EMAIL_TEXT,QUOTE_TWICE_1,
	      RCVD_IN_OSIRUSOFT_COM,REFERENCES,REPLY_WITH_QUOTES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: kse@elischer.org
cc: Julian Elischer <julian@elischer.org>
Subject: Re: 1:1 Threading implementation.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 20:55:46 -0000

Jeff Roberson wrote:
> On Wed, 26 Mar 2003, Terry Lambert wrote:
> > Jeff Roberson wrote:
> > > Well, I wasn't doing userland stuff until three days ago.  I think mini
> > > has just been very busy with work.  I suspect that you're going to need
> > > to start doing userland work or find someone to do it if you want to get
> > > it done soon.
> >
> > In theory, the library API will be identical to the pthreads
> > standard, and will require no changes to programs written to
> > that standard.  Most threaded programs these days are written
> > to the standard.  Some threaded programs make invalid assumptions
> > about rescheduling following an involuntary context switch, or
> > ability to make particular blocking calls.
> 
> I'm not sure what API compatibility has to do with anything?

The API is the only thing that matters about userland work,
apart from side effects of assumptions which end up visible
to users of the API, due to it being non-reflexive, of course.

In other words, your suspicion about the userland work is
incorrect.


> > The first will still be a problem (e.g. Netscape's Java/JavaScript
> > GIF rendering engine is probably still serializing requests due to
> > non-thread reentrancy).
> >
> > The second should not be an issue, either with your implementation
> > of the 1:1, or Jon Mini's implemenetation of the N:M model.
> 
> I'm not sure I know what you're talking about.  Blocking calls are either
> handled by an upcall in M:N or by having independent contexts in 1:1.

The current FreeBSD user space pthreads implementation causes
versions of Netscape that do not intentionally serialize GIF
renderings in Java/JavaScript UI's, and which contain multiple
images, to fail catastrophically, if the mouse was moved over
the GIF during loading.

This was true for some implementations of Slashdot, and it was
also true for the InterJet UI, as of version 3.x of WhistleWare
(based on FreeBSD 3.x), until specific code was included in the
UI to delay GIF request processing so as to serialize requests.

The specific problem, which I suspected from symptoms, and later
confirmed by decompiling the Netscape code in question was that
the rendering engine was non thread-reentrant, and was making an
assumption about scheduler behaviour that was not warranted by
anything other than a kernel threading implementation that would
guarantee resumption of the previously preempted thread.  This
assumption was true on Windows, true on Linux, true on Solaris,
but false on FreeBSD and false on MacOS 9.  And, in fact, we
saw UI crashes on FreeBSD and Mac OS 9, which did not occur on
other platforms.

The problem is in the POSIX interface not enforcing against
people making unwarranted assumptions in their code which uses
the POSIX interface.

Therefore, it is possible that some bugs in vendor code will
be revealed by differences in implementation.  The Netscape
4.x GIF renderer is one such piece of code.


For the record, blocking calls can be handled in the 1:1
case through upcalls, as well.  It's an implementation
detail that's irrelevent to the API that gets exposed.


> > > > Wouldn't it have been easier to have one KSEGRP+KSE+thread per user
> > > > thread? Having one ksegrp and many KSEs requires changing the kernel
> > > > code where doing it the other way you could do it without making any
> > > > changes.
> > >
> > > I don't understand?  There are relatively minor changes to the kernel to
> > > support this.  Since nice is a property of the process, it makes sense
> > > that there is only one ksegrp per process.  I'm starting to think that the
> > > ksegrp was overkill in general.
> >
> > The KSEGRP is, effectively, a virtual processor interface, and
> > was/is intended for use by the scheduler to ensure CPU affinity
> > for individual threads, and CPU negaffinity for multiple threads
> > within a process.  In other words, according to the published
> > design documents, it's a scheduler artifact.
> 
> This is the KSE, not the KSE group.  The KSE Group was intended to allow
> multiple groups of threads with different scheduling algorithms or
> different base priorities (nice).

This is *one* of a number of *important* effects of the KSEGRP;
the other effects are subtle, but are no less important.


> > Personally, I've never seen the need for virtual processors, but
> > then I've always advocated "intentional start/intentional migration"
> > for the scheduler model (one of my arguments for a per CPU migration
> > queue, and a push-model, rather than a pull-model for redistribution
> > of an unbalanced load).
> 
> The push model suffers from as much as a one tick latency in any
> migration.  In many cases probably more than that.

You've made this statement before, and then never defended it.

I think you mean that it has a latency of .5 of the average of
*quantum used*, +/- .5, in the worst case scenario.  That happens
when you push a process to another CPU, and it doesn't notice
until the next context switch (which is *rarely* a full quantum,
and is *usually* a fraction of a quantum).

This is because you are incorrectly pushing the process at the
head of the run queue.  This will, indeed, introduce the latency
you suggest (a smaller latency than you imply, BTW, and probably
ignorable, in fact).

HOWEVER.  The *correct* implementation will push the *second from
the head*, and then schedule the head to run.  This will ensure
that the latency is no more than 1 full quantum *for a thread
that would not run for a maximum of 1 full quantum.

In addition, since it is being pushed to the least loaded CPU,
and load is a measurement of number of items pending on the
ready-to-run queue, I would argue that, in fact, the push model
results in *significantly reduced* latency, on average, compared
to the pull model.

In addition, it eliminates all scheduler lock contention in
the common case, which is the non-migration case.


BTW, with a *pull* model, which requires an average of 1.5
lock contentions in order to accomplish a context switch for
an individual CPU in a 2 CPU system, on a 3GHz processor with
a 433MHz front size bus, this equates to ~7 clock cycles worth
of stall barrier per lock contention, or 10.5 clocks per
scheduler lock acquisition *by its own CPU*, even if you
decide not to migrate *anything*.

That *assumes* that all locks are allocated so as to fall on
cache line boundaries, which FreeBSD *FAILS* to do.


> The overhead from locking queues is far out weighed by the
> cases where your cpu is sitting idle due to an unbalanced load.

I suggest your numbers are in error.  Please examine the papers
on this topic in the book:

        Scheduling and Load Balancing in Parallel and Distributed
                Systems
        Behrooz A. Shirazi
        IEEE Computer Society
        ISBN: 0818665874

It is a compilation of IEEE papers on the topic, and contains
dozens of papers with statistics that refute your claims for
shared memory multiprocessors.

And most of these papers assume the clock multiplier is very
small, if it exists at all.  On modern systems, the amount of
overhead in their statistics for locking should be multiplied
considerably, since the clock multiplier is much higher than
it was.


> The latency is one tick because each cpu would have to poll the load of
> other cpus at some interval to discover that it is out of balance.  Or it
> could check if it had more than one process on the run queue, which seems
> a bit silly.  Regardless, you're probably only going to get to make this
> decisions once a tick which means the other cpu(s) can sit idle for at
> least that long.

I have one question: how did you get into this unbalanced
load situation, where you have 100 processes ready to run on
one CPU, and 0 processes ready to run on the remaining 3 CPUs?

I argue that this situation might in fact be an initial state,
but the steady state over time would be to evenly distribute
work, over time.

In other words, your example refers to what's called a "flash
crowd" case -- roughly equivalent to a fork-bomb.  And I have
already stated, at least for the thread creation case, that I
support "intentional start", where the CPU you pick to put an
initial new thread on, is based on the load.

So I do not understand how this situation could arise, other
than in laboratory conditions.


> Consider a buildworld -j8.  You have many processes rapidly stoping and
> starting.  Without a pull a cpu that was very loaded could suddenly end up
> with no running processes and have to idle until the other gave it work.
> This imbalance is likely to go back in forth, I have observed this
> personally when writing ULE.

The rapidity of the start/stop is irrelevent.  The instantaneous
load at the time of the next start *is* relevent.

I would argue that steady-state performance is more important;
further, I would suggest that, in using "make -j#", that you
select "#" to be a factor of 3 or more larger than the number
of available CPUs, to ensure correct hysteresis.

If this still doesn't fix your problem, I would suggest that
the duration of your quantum (lbolt value) is too large,
compared to how long processing actually occurs, before it
hits a blocking sleep call.


> I think you need both push and pull.  The pull satisfies the case where
> you have short lived but rapidly reappearing processes.  The push
> solves more long term load imbalance issues.  If you have, for example,
> many apache processes that are very busy.  no cpu will go idle, so pull is
> ineffective, but they may still be imbalanced.  This is still missing from
> ULE.

I don't think you can implement pull without locking your own
queue in order to access it.  I *know* you can implement push
without locking your own queue, *ever*, and then only deal with
the locking of a per CPU auxillary queue when you decide you
have to migrate a process.  I argue that this should occur only
*rarely*: it is not the common case.

Further, I don't think you can implement both push an pull in
the same implementation, reasonably.

The problem comes down to whether or not you engage in the
examination of another CPUs scheduling queue.  If you do this,
you have to lock, and you end up stalling both CPUs in order
to do this.

This is a factor of 2 multiplation, minimally, on the stall,
and is probably a heck of a lot more, in FreeBSD, since all
the other CPUs are doing the same thing, and you have L1 and
L2 cache flushes and TLB shootdowns, etc., as a result.


> 
> > In a scheduler model where a sheduler *pulls* work, either from
> > another CPU ready-to-run queue, or from a single ready-to-run
> > queue that is global to the system (in either case, requiring
> > locks in the scheduler path, potentially highly contended locks),
> > the idea of a KSEGRP/"virtual processor" is necessary for globally
> > migratable and contendable "bookkeeping" objects.
> 
> They should only be contended when cpus have nothing to do.  A worthwhile
> tradeoff I'd say.

Define "nothing to do".  The cached lock structure gets zapped in
all other processes which have it read-caches, as soon as it's
written by any CPU to acquire the lock.  Minimally, it's reread,
as necessary, from the L2 cache (the last operation is a write to
release the lock).  Worst case, it's main memory, and your stall
goes up by a factor of 4.

It's clear to me that shared memoy SMP systems with large clock
multipliers *must* pretend that they are distinct NUMA CPUs, as
much as possible, in order to avoid stall barriers.

BTW: the pull model does not work for NUMA systems, since the
memory you are attempting to examine is non-local, and a
distributed cache coherency and messaging protocol must be used
to get the data -- if it's available at all.  So FreeBSD SMP is
screwed from ever running on 64 processor SPARC boxes (for
example), if it uses the pull model.

The push model, on the other hand, can message the process into
the queue of the target CPU, using the built-in hardware messaging
mechanism (a cooperative transfer of the image has to happen as a
result of the message, but that particular overhead is largely
avoidable, using state synchronization via swap, and latency can
be further reduced through checkpointing).


> > So in the current scheduler implementations, KSEGRP is necessary;
> > in the 1:1 model, it's necessary, if only to ensure negaffinity
> > (4 CPU system, process with 4 threads, ensure each thread gets its
> > own CPU, and does not migrate away from it).
> 
> You're talking about the KSE again.  I think CPU affinity has little to do
> with the M:N or 1:1 choice except that it is much more difficult to
> achieve CPU affinity when you have to make a multitiered scheduling
> decision.  To get real affinity in M:N you need kse to cpu affinity and
> thread to kse affinity.  You also the need userland thread to kernel
> thread affinity, or at least user land thread to KSE affinity.

What do you think is on the scheduler queue or the wait queue,
if it's not a KSE?  There's no such thing as a thread, distinct
from the context in which it exists.


> > You could also take this idea much further.  Specifically, SVR4
> > flags system calls as "non-blocking", "blocking", and "potentially
> > blocking".  By doing this, they can lazy-bind context creation for
> > blocking operations on "blocking" and "potentially blocking" calls,
> > and avoid it altogether on "non-blocking" and sometimes avoid it on
> > "potentially blocking" calls.
> 
> KSE already does better than this by only creating a new context when you
> actually block.  The upcall mechanism specifically addresses that need.
> This is seperate from what we were discussing above which is allowing the
> scheduler to have a chance to initialize data when a new context is
> created.

The point is that there is "low hanging fruit".

By knowing up front that there is no chance of blocking, you
can play "fast and loose".

It seems to me from watching the -CURRENT code, that people
can't decide if they are grabbing locks to protect data
objects, or locks to protect code paths.  This resolves a
lot of the redundant locking that happens by giving only a
single rule of thumb, and a place where it can be ignored.


> > This can result in a significant overhead savings, if the kernel
> > implementation evolves, but the user space implementation remains
> > fixed.
> >
> > It's good to decouple these things from each other (IMO).
> 
> Which things?

The idea of kernel entrancy, and the continued need for a
context which can be put on a sleep queue vs. put on a
scheduler queue.  That's not distinct in the current
implementation.  In fact, the same list element pointer
in the same structure is used to link both lists.


> > Everyone does this.  Novell did it back in 1993.  Sun's turnstiles
> > are based on the tradeoff between spinning and waiting, and how
> > many times you have to do that before it's worth crossing the
> > protection domain, and blocking.
> 
> I think you mean sun's adaptive mutexes.  The turnstile is just the
> queue that you block on if I'm remembering correctly.  The blocking queue
> I used for umtx is a similar context where the queue migrates among the
> blocking threads.

Yes, adaptive mutexes, sorry.


> > When we did this in 1993 (Novell's implementation was primarily
> > by Dave Hefner, who now works for Microsoft, I believe), we ended
> Any relation to hugh?

He hates that.  8-).


> > > > My only comment is that since mini is supposed to be doing the
> > > > M:N library, isn't this a bit of a distraction?
> > >
> > > I'll let him comment on this.
> >
> > I'll stick my nose in: I think it's a good idea, since TPTB have
> > recently made noises on a couple of FreeBSD lists about "rapidly
> > approaching deadlines for the KSE work".
> >
> > Consider it insurance on your investment, people.
> 
> Yes, it isn't necessarily a KSE replacement.

But maybe it is, and will be for 6 months, or a year, if it uses
the same kernel mechanisms for its implementation.  That's why
Julian's comments about the kernel changes are important.

Note: I'm not saying they aren't actually necessary, only that
they merit discussion.  So far, the justifications you've offered
all revolve around your percieved irrelevancy of KSEGRP seperate
from process, as a container object.

This is true in ULE, as you've implemented it so far, but it's
probably not true, overall.


> > There is also the fact that affinity and quantum are very hard to
> > maintain on a system with a heterogeneous load.  In other words,
> > 1:1 looks good if the only thing you are running is a single
> > multithreaded proces, but looks a *lot* less good when you start
> > running real-world code instead of fictitious benchmarks that
> > try to make your threading look good (e.g. measuring only thread
> > context switches, with no process context switch stall barriers,
> > etc.).
> 
> Yes, I see what you're getting at.  M:N allows you to keep running until
> you've exhausted your whole slice by selecting another thread.  You could
> acomplish this in 1:1 by loaning your slice to the next available thread
> that was bound to the same cpu and force a switch to that.  That's a neat
> idea.  I'll have to look into this for ule.

It's hard to do correctly in the kernel, because the scheduler
that's making the decision has to either support a variable
quantum granularity (I've seen it implemented that way before,
but it's ugly), or it has to try and make "fairness" decisions
that it's not in a position to make.

For example, a thread calls and gives up it's quantum, and then
other threads in the same process run, because you're not out
of quantum, and then the first threads wait condition is
satisfied: who do you schedule first?  The answer has to be a
PTHREAD_SCOPE_PROCESS prioritization policy.  8-(.


> > I can tell you from personal experience with such a model, that
> > it *VASTLY* outperforms a 1:1 kernel threading model, even if you
> > end up running multiple state-machine instances on multiple CPUs.
> > We got more than a 120X increase in NetWare for UNIX, simply by
> > changing the client dispatch streams MUX to dispatch to worker
> > processes instead of threads, in LIFO instead of FIFO order,
> > simply because it ensured that the process pages you cared about
> > were more likely to be in core.
> 
> Yeah, the LIFO trick is widely used.  I believe apache does something of
> this sort.  It's also discussed on the c10k problem page.  I'm not sure
> why you got better perf out of processes than threads though.  This is
> sort of confusing.

I could avoid competing with other processes in the system
for scheduler quantum, and overall scheduler usage, and
system time, as a result, were reduced.

This was partially a result of the "quantum lending" I spoke
of; it was actually called "It's my damn quantum!" in the
presentation we made.  8-).  The idea is that if the system
gives me a quantum to use... it's my damn quantum!  And I
should not have to sacrifice it, merely because I have a
single context out of many that wants to make a call that
would block.  By using this approach, if you are running
heterogeneous processes, using 1/16th of your quantum doesn't
result in you paying a complete context switch overhead for
having all your threads compete with, say, "cron", running
once a second -- if you lose, you pay a full context switch
overhead.

The kernel boundary crossing is also very expensive in SVR4;
FreeBSD has reduced this somewhat, but it's still pretty far
behind Linux, in this regard, so it's not as cheap to switch
threads in kernel space as in user space.  It's not under
Linux, either, but they only every benchmark homogenous threads
in a single application on a relatively quiescent system.  There
are lies, damn lies, and statistics... then, there's benchmarks.


> > 1:1 threading is useful for one thing, and one thing only: SMP
> > scalability of single image processes.  And it's not the best at
> > doing that.
> 
> It's also good at providing extra contexts to block on for IO
> worker threads.

So's AIO, and it works more efficiently.  So does kqueue, for
that matter.


> Furthermore, It's really good at being implemented quickly,
> which is especially important considering that it's 2003 and we
> don't have kernel supported threads...

OK, can't aregue with that one.  It's one of the reasons I
liked that you did your implementation in the first place.

8-).


> > > Furthermore, m:n's basic advantage is less overhead from staying out of
> > > the kernel.
> >
> > No, actually, it's the ability to fully utilize a quantum, and
> > to not have to make a decision between one of your own threads
> > and some other process, off the run queue, when making a decision
> > in the scheduler about what to run next.
> 
> Yeah, I just remembered this bit.  See my answer above.  I think I'll do
> this trick in ULE.

Good luck... it's very hard to do in a kernel scheduler, without
overly complicating things, I'm afraid.


> > > For people who are doing thread pools instead of event driven IO
> > > models they will encounter the same overhead with M:N as 1:1.
> >
> > This is actually false.  In 1:1, your thread competes with all
> > other processes, in order to be the next at the top of the run
> > queue.  Statitically, you are doing more TLB flushes and shootdowns,
> > and more L1 and L2 cache chootdowns, than you would otherwise.
> 
> This is the same argument about using your whole slice eh?

It's the inverse.  It's what gives the lie to most "benchmarks",
and why, if you are running a web server with CGIs, you get
much more terrible performance than your threads people said
youw would get.  8-).


> > Solving this problem without intentional scheduling has been
> > proben to be N-P incomplete: it is not a problem which is
> > solvable in polonomyial time.
> 
> eh? Which problem is NP?

Solving the "Who do I run next to balance saving context switches
vs. fairness?", if you treat each voluntary context switch as a
restart of the timer until the next involuntary context switch.

Even lending is hard, once you get into the timer code and see
the evil things it does to get the lbolt clock, and the timer
optimizations on system call exit.  8-(.  But at least it's
not NP incomplete.  8-).


> > > I'm not sure what applications are entirely compute and have more threads
> > > than cpus.  These are the only ones which really theoretically benefit.  I
> > > don't think our threading model should be designed to optimize poorly
> > > thought out applications.
> >
> > By that argument, threads should not be supported at all... 8-) 8-).
> 
> I meant to say 'entirely compute bound'.  If you just want CPU and no IO
> then you probably only want as many threads as processors.  This is the
> most effecient arrangement.  I'm not arguing against threads although I do
> think they are often abused.

If the intent is optimization, the answer is never threads; that
was my point.  We would be teaching people to build finite state
automata, instead, and managing their own contexts.  I would
even argue that the code you get was better, since it would ensure
all your per session state never ended up in globals.  8-) 8-).


> > But by your same arguments, CPU clock multipliers have grown
> > to the point that memory bus and I/O bus stalls are so
> > expensive that SMP makes no sense.
> 
> I migh agree with you there.

Yeah, they've pissed me off, ever since my 486DX-50 (*not* DX/2-50!).
8-).


> > > Then if you look at the number of system calls and shenanigans a UTS must
> > > do to make proper scheduling decisions it doesn't look like such an
> > > advantage.
[ ... ]
> > I think the kernel boundary crossing overhead, and the fact
> > that, in doing so, you tend to relinquish a significant
> > fraction of remaining quantum (by your own arguments) says
> > that protection domain crossings are to be avoided at all costs.
> 
> Yes, I agree, and without serious tweaking our current M:N significantly
> increases the number of system calls.

Yes.  The signal masking is particular heinous.  I don't know
what to do about it.  8-(.

My gut reaction is "BSD signals"; before all this POSIX crap
turned BSD into SVR3, interrupted system calls restarted by
default.  There's a nice threads package from ~1988 that used
this fact, called "sigsched"; it's in the comp.sources.unix
archives.  Doesn't work any more, unless you call siginterrupt()
and then avoid POSIX signal interfaces.  8-(.

> > I'm glad you pursued it, even though I do not agree with your
> > reasoning on the value of N:M vs. 1:1.  I view it as "life
> > insurance" for the KSE code, which some people might be
> > otherwise tempted to rip out over some arbitrary deadline.
> >
> > Thank you for your work here, and thank everyone else for
> > their work, too.
> 
> Thanks for the feedback.  It has been stimulating.  I still need to
> consider multithreading implications of 1:1 for ULE.  This has given me a
> bit more to work on there.

I wish you had been at the original SMP meetings with Jason Evans,
Matt Dillon, and the 50+ other folks who showed up each time; it
would be a lot easier if everyone had the same context.  8-(.

In the quantum lending, be sure that you look carefully at the
involuntary context switch timer, and when it gets reset.  It's
scary in there.  8-).

-- Terry

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 12:58:13 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 8E2B537B404
	for <arch@freebsd.org>; Wed, 26 Mar 2003 12:58:13 -0800 (PST)
Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4])
	by mx1.FreeBSD.org (Postfix) with ESMTP id B0EFF43F75
	for <arch@freebsd.org>; Wed, 26 Mar 2003 12:58:12 -0800 (PST)
	(envelope-from eischen@pcnet1.pcnet.com)
Received: from pcnet1.pcnet.com (localhost [127.0.0.1])
	by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h2QKw4Bg023185;
	Wed, 26 Mar 2003 15:58:04 -0500 (EST)
Received: from localhost (eischen@localhost)h2QKw3Ck023180;
	Wed, 26 Mar 2003 15:58:03 -0500 (EST)
Date: Wed, 26 Mar 2003 15:58:03 -0500 (EST)
From: Daniel Eischen <eischen@pcnet1.pcnet.com>
To: Julian Elischer <julian@elischer.org>
In-Reply-To: <Pine.BSF.4.21.0303261154100.52134-100000@InterJet.elischer.org>
Message-ID: <Pine.GSO.4.10.10303261537570.19728-100000@pcnet1.pcnet.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-25.6 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      QUOTE_TWICE_1,REPLY_WITH_QUOTES,USER_AGENT_PINE
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: kse@elischer.org
cc: arch@freebsd.org
Subject: Re: 1:1 Threading implementation.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 20:58:15 -0000

On Wed, 26 Mar 2003, Julian Elischer wrote:
> 
> On Wed, 26 Mar 2003, Jeff Roberson wrote:
> > >
> > > i.e. on creation of a new process, shced_newproc() is called
> > > and a KSE is added in there is the scheduler in question wants to use
> > > KSEs. If it doesn't, no KSE would be added, but it's still possible that
> > 
> > Yes, I think we need more sched hooks here as well.  Having only
> > sched_fork() makes things sort of gross.  We'll have to hook this all up
> > later.
> 
> I'll try get it hooked up "sooner rather than later".
> I think you can make 1:1 threads in the current system by doing:
> 
> kse_create(mbox, NEWGROUP); where the mbox points to the function you
> want to run and a new stack. The function just runs as normal, not
> knowing that it is atually a UTS thread. Since it never yields to
> another thread (by KSE terms) it never does any upcalls an voila.. 1:1
> threads. (I am sugesting that we don't need a new syscall to do this,
> or, at most a new entrypoint which ends up calling much of the same
> code.)

Right.  And if you translate this into the M:N library,
you just create your threads with PTHREAD_SCOPE_SYSTEM.

One of my unvoiced thoughts was that we could add a flag
or two to the KSE mailbox so that a scope system thread
doesn't need a separate stack.  Once one of these KSEs
(thread actually) blocks in the kernel, it stays there,
BUT, it can still awake from kse_thr_interrupt, kse_release,
etc, just that instead of an upcall it just returns
normally from those calls.  In this way, scope system
threads can be very low overhead and not need to enter
the UTS scheduler, yet they can still coexist with
scope process threads.

> Ok but htis breaks things for M:N threads as in M:N threads, teh mask
> would be stored "per process" (or at most per group) and the mask is the
> "logical OR" of all the masks for the threads in the group/process.
> Having a mask per thread and not having one for the bigger unit
> means that the masks for the threads must be updated regularly 
> (maybe at every kernel entry) to be the OR of the masks for ALL THE USER
> THREADS, which means that the UTS must do this explicitly.
> I'm not thrilled by all the extra work this is going to make for M:N
> threads. (Well at least this is my preliminary reading of it.)

No, please don't make the UTS deal with this, if that's the
case.

> > 
> > First off, it is many months away from being even beta quality.  I think
> > the UTS is far more complicated than you may realize.  There are all sorts
> > of synchronization issues that it was able to avoid before since only one
> > thread could run at any time and there essentially was no preemption.  It
> > now also has to deal with effecient scheduling decisions in a M:N model
> > that it didn't have to worry about before.
> 
> I'm not sure that teh issues there are as bad as you think.

I don't think it is as bad as that either.  The complexity
is on par with that of libc_r.

> > 
> > Then if you look at the number of system calls and shenanigans a UTS must
> > do to make proper scheduling decisions it doesn't look like such an
> > advantage.  I feel that the overhead of all the layers comes close to the
> > savings from doing some of it without entering the kernel.
> 
> So far it's not doing that much..

Yeah, I don't understand the above statement either.  It's lower
overhead than libc_r.  The only system calls it should be making
is to kse_release() when it has no more work to do (no runnable
threads) or possibly to kse_thr_wakeup() if it has to dispatch
signals to threads blocked in the kernel.  Time comes from the
mailbox so we don't even need to get the time of day.  The interfaces
were designed so that we _wouldn't_ have much syscall overhead.

-- 
Dan Eischen


From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 13:04:50 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id B2E1037B404
	for <arch@freebsd.org>; Wed, 26 Mar 2003 13:04:50 -0800 (PST)
Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 083A143F75
	for <arch@freebsd.org>; Wed, 26 Mar 2003 13:04:50 -0800 (PST)
	(envelope-from eischen@pcnet1.pcnet.com)
Received: from pcnet1.pcnet.com (localhost [127.0.0.1])
	by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h2QL4hBg024111;
	Wed, 26 Mar 2003 16:04:43 -0500 (EST)
Received: from localhost (eischen@localhost)h2QL4gpQ024108;
	Wed, 26 Mar 2003 16:04:42 -0500 (EST)
Date: Wed, 26 Mar 2003 16:04:42 -0500 (EST)
From: Daniel Eischen <eischen@pcnet1.pcnet.com>
To: Warner Losh <imp@harmony.village.org>
In-Reply-To: <200303262030.h2QKU6A7089578@harmony.village.org>
Message-ID: <Pine.GSO.4.10.10303261558200.19728-100000@pcnet1.pcnet.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-25.4 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REPLY_WITH_QUOTES,USER_AGENT_PINE
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: kse@elischer.org
Subject: Re: Not providing static libraries (libkse/libpthread) 
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 21:04:51 -0000

On Wed, 26 Mar 2003, Warner Losh wrote:

> In message <3E81F6BB.BFFE3F33@vigrid.com> Daniel Eischen writes:
> : Is there a good reason for providing static libraries for
> : libpthread/libkse?  I'd like to not support them to get
> : rid of some hacks to make sure certain symbols are present
> : in the static library case.
> 
> That would be a big hassle for the company I work for.  We have many
> static binaries that are threaded and providing a dynamic one has a
> performance impact of a few percent.  While we have done dynamic
> linking in the past, and have the infrastructure to do so in the
> future in our build process, this may cause us problems in the future
> if we need to deploy a static binary (which tends to be safer to do
> once a long period of time has passed between the generation of the
> system and the deployment of the updated binary).
> 
> How gross are the hacks?

See libc_r/uthread/uthread_init.c (references[] and libgcc_references[]).
Also, in a lot of functions, there are:

	if (_thread_initial == NULL)
		_thread_init();

I'd like to be able to get rid of these eventually and perhaps have
some magical way of getting it called automatically when the library
is loaded.  If it was possible, I'm not sure that it would work in
both static and shared.

-- 
Dan Eischen

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 13:44:54 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id D370237B404
	for <arch@freebsd.org>; Wed, 26 Mar 2003 13:44:54 -0800 (PST)
Received: from ns1.xcllnt.net (209-128-86-226.bayarea.net [209.128.86.226])
	by mx1.FreeBSD.org (Postfix) with ESMTP id E026A43F85
	for <arch@freebsd.org>; Wed, 26 Mar 2003 13:44:53 -0800 (PST)
	(envelope-from marcel@xcllnt.net)
Received: from dhcp01.pn.xcllnt.net (dhcp01.pn.xcllnt.net [192.168.4.201])
	by ns1.xcllnt.net (8.12.8/8.12.8) with ESMTP id h2QLiLKu025908;
	Wed, 26 Mar 2003 13:44:22 -0800 (PST)
	(envelope-from marcel@piii.pn.xcllnt.net)
Received: from dhcp01.pn.xcllnt.net (localhost [127.0.0.1])
	by dhcp01.pn.xcllnt.net (8.12.8/8.12.8) with ESMTP id h2QLiLBm011898;
	Wed, 26 Mar 2003 13:44:21 -0800 (PST)
	(envelope-from marcel@dhcp01.pn.xcllnt.net)
Received: (from marcel@localhost)
	by dhcp01.pn.xcllnt.net (8.12.8/8.12.8/Submit) id h2QLiLxM011897;
	Wed, 26 Mar 2003 13:44:21 -0800 (PST)
Date: Wed, 26 Mar 2003 13:44:21 -0800
From: Marcel Moolenaar <marcel@xcllnt.net>
To: Daniel Eischen <eischen@pcnet1.pcnet.com>
Message-ID: <20030326214421.GF11320@dhcp01.pn.xcllnt.net>
References: <200303262030.h2QKU6A7089578@harmony.village.org>
	<Pine.GSO.4.10.10303261558200.19728-100000@pcnet1.pcnet.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <Pine.GSO.4.10.10303261558200.19728-100000@pcnet1.pcnet.com>
User-Agent: Mutt/1.5.3i
X-Spam-Status: No, hits=-31.9 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REFERENCES,REPLY_WITH_QUOTES,USER_AGENT_MUTT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: kse@elischer.org
cc: Warner Losh <imp@harmony.village.org>
Subject: Re: Not providing static libraries (libkse/libpthread)
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 21:44:56 -0000

On Wed, Mar 26, 2003 at 04:04:42PM -0500, Daniel Eischen wrote:
> 
> Also, in a lot of functions, there are:
> 
> 	if (_thread_initial == NULL)
> 		_thread_init();
> 
> I'd like to be able to get rid of these eventually and perhaps have
> some magical way of getting it called automatically when the library
> is loaded.

You may be able to piggyback on the C++ static object initialization
by utilizing _init() and _fini(). I don't think archive is different
from shared in that respect for C (ie they both don't have what _init()
provides and have _fini() in terms for atexit()). But it works in both
cases if you add some C++ related magic (See also the .init and .fini
ELF sections).

-- 
 Marcel Moolenaar	  USPA: A-39004		 marcel@xcllnt.net

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 13:48:23 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 940E537B404
	for <arch@freebsd.org>; Wed, 26 Mar 2003 13:48:23 -0800 (PST)
Received: from mail1.qc.uunet.ca (mail1.qc.uunet.ca [198.168.54.16])
	by mx1.FreeBSD.org (Postfix) with ESMTP id B771943F3F
	for <arch@freebsd.org>; Wed, 26 Mar 2003 13:48:22 -0800 (PST)
	(envelope-from anarcat@espresso-com.com)
Received: from xtanbul.studio.espresso-com.com ([216.94.147.57])
	by mail1.qc.uunet.ca (8.12.8/8.12.8) with ESMTP id h2QLm9GS019177;
	Wed, 26 Mar 2003 16:48:10 -0500
Received: from anarcat by xtanbul.studio.espresso-com.com with local (Exim
	3.36 #1 (Debian))
	id 18yIkj-0001cu-00; Wed, 26 Mar 2003 16:48:09 -0500
Date: Wed, 26 Mar 2003 16:48:09 -0500
From: The Anarcat <anarcat@anarcat.ath.cx>
To: Dan Nelson <dnelson@allantgroup.com>
Message-ID: <20030326214809.GE488@xtanbul>
References: <20030326193524.GA11320@dhcp01.pn.xcllnt.net>
	<Pine.GSO.4.10.10303261441100.9412-100000@pcnet1.pcnet.com>
	<20030326195107.GB31787@dan.emsphone.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20030326195107.GB31787@dan.emsphone.com>
User-Agent: Mutt/1.5.3i
Sender: The Anarcat <anarcat@xtanbul.FreeBSD.ORG>
X-Spam-Status: No, hits=-32.5 required=5.0
	tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES,
	      REPLY_WITH_QUOTES,USER_AGENT_MUTT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: kse@elischer.org
cc: Daniel Eischen <eischen@pcnet1.pcnet.com>
cc: Marcel Moolenaar <marcel@xcllnt.net>
Subject: Re: Not providing static libraries (libkse/libpthread)
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 21:48:24 -0000

On mer mar 26, 2003 at 01:51:07 -0600, Dan Nelson wrote:
> In the last episode (Mar 26), Daniel Eischen said:
> > On Wed, 26 Mar 2003, Marcel Moolenaar wrote:
> > > For example, the access sequences generated by compilers for
> > > variables that have the __thread attribute do really suck for when
> > > code is to be generated for dynamic linking. The access sequences
> > > in the static case are superior. The performance gain is
> > > significant if one can build a complete multi-threaded application.
> > 
> > Solaris and IRIX don't seem to provide static thread libraries.  Does
> > anyone know if Linux does?
> 
> Debian provides static versions:
> -rw-r--r--    1 root     root   81959 Feb 25 07:46 /lib/libpthread-0.10.so
> -rw-r--r--    1 root     root   97286 Feb 25 07:47 /usr/lib/libpthread.a

Note that libpthread.a is provided by the libc6-dev package and does
not need to be installed by default, IIRC.

anarcat@xtanbul[/usr/lib]% dpkg-query -S libpthread.a 
libc6-dev: /usr/lib/libpthread.a
anarcat@xtanbul[/usr/lib]% 

Also, this package is not required by most applications. Only when you
install build tools, does the static lib gets installed.

I like the idea of splitting a port's library between static and
shared packages. Most end-users that don't need to compile anything
don't need static libraries. -dev packages also contain the header
files. I'd like to see the same in our ports system.

A.

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 14:02:18 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 0AF5037B404
	for <arch@freebsd.org>; Wed, 26 Mar 2003 14:02:18 -0800 (PST)
Received: from harmony.village.org (rover.bsdimp.com [204.144.255.66])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 4224B43F93
	for <arch@freebsd.org>; Wed, 26 Mar 2003 14:02:17 -0800 (PST)
	(envelope-from imp@bsdimp.com)
Received: from localhost (warner@rover2.village.org [10.0.0.1])
	by harmony.village.org (8.12.8/8.12.3) with ESMTP id h2QM2BA7090076;
	Wed, 26 Mar 2003 15:02:11 -0700 (MST)
	(envelope-from imp@bsdimp.com)
Date: Wed, 26 Mar 2003 15:01:52 -0700 (MST)
Message-Id: <20030326.150152.125002089.imp@bsdimp.com>
To: arch@freebsd.org, kse@elischer.org
From: "M. Warner Losh" <imp@bsdimp.com>
In-Reply-To: <20030326214421.GF11320@dhcp01.pn.xcllnt.net>
References: <200303262030.h2QKU6A7089578@harmony.village.org>
	<Pine.GSO.4.10.10303261558200.19728-100000@pcnet1.pcnet.com>
	<20030326214421.GF11320@dhcp01.pn.xcllnt.net>
X-Mailer: Mew version 2.1 on Emacs 21.2 / Mule 5.0 (SAKAKI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, hits=-9.9 required=5.0
	tests=AWL,IN_REP_TO,REFERENCES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Subject: Re: Not providing static libraries (libkse/libpthread)
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 22:02:20 -0000

In message: <20030326214421.GF11320@dhcp01.pn.xcllnt.net>
            Marcel Moolenaar <marcel@xcllnt.net> writes:
: On Wed, Mar 26, 2003 at 04:04:42PM -0500, Daniel Eischen wrote:
: > 
: > Also, in a lot of functions, there are:
: > 
: > 	if (_thread_initial == NULL)
: > 		_thread_init();
: > 
: > I'd like to be able to get rid of these eventually and perhaps have
: > some magical way of getting it called automatically when the library
: > is loaded.
: 
: You may be able to piggyback on the C++ static object initialization
: by utilizing _init() and _fini(). I don't think archive is different
: from shared in that respect for C (ie they both don't have what _init()
: provides and have _fini() in terms for atexit()). But it works in both
: cases if you add some C++ related magic (See also the .init and .fini
: ELF sections).

Yes.  I was going to make that same point.  C++ static object init
always happens, static or dynamic.  And has since FreeBSD has
supported ELF...

Warner

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 14:22:01 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id CAD4937B404
	for <arch@freebsd.org>; Wed, 26 Mar 2003 14:22:01 -0800 (PST)
Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 29C0143FB1
	for <arch@freebsd.org>; Wed, 26 Mar 2003 14:22:01 -0800 (PST)
	(envelope-from eischen@pcnet1.pcnet.com)
Received: from pcnet1.pcnet.com (localhost [127.0.0.1])
	by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h2QMLoBg004177;
	Wed, 26 Mar 2003 17:21:50 -0500 (EST)
Received: from localhost (eischen@localhost)h2QMLo1n004174;
	Wed, 26 Mar 2003 17:21:50 -0500 (EST)
Date: Wed, 26 Mar 2003 17:21:50 -0500 (EST)
From: Daniel Eischen <eischen@pcnet1.pcnet.com>
To: "M. Warner Losh" <imp@bsdimp.com>
In-Reply-To: <20030326.150152.125002089.imp@bsdimp.com>
Message-ID: <Pine.GSO.4.10.10303261718210.3735-100000@pcnet1.pcnet.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-25.4 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REPLY_WITH_QUOTES,USER_AGENT_PINE
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: kse@elischer.org
Subject: Re: Not providing static libraries (libkse/libpthread)
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Mar 2003 22:22:02 -0000

On Wed, 26 Mar 2003, M. Warner Losh wrote:

> In message: <20030326214421.GF11320@dhcp01.pn.xcllnt.net>
>             Marcel Moolenaar <marcel@xcllnt.net> writes:
> : On Wed, Mar 26, 2003 at 04:04:42PM -0500, Daniel Eischen wrote:
> : > 
> : > Also, in a lot of functions, there are:
> : > 
> : > 	if (_thread_initial == NULL)
> : > 		_thread_init();
> : > 
> : > I'd like to be able to get rid of these eventually and perhaps have
> : > some magical way of getting it called automatically when the library
> : > is loaded.
> : 
> : You may be able to piggyback on the C++ static object initialization
> : by utilizing _init() and _fini(). I don't think archive is different
> : from shared in that respect for C (ie they both don't have what _init()
> : provides and have _fini() in terms for atexit()). But it works in both
> : cases if you add some C++ related magic (See also the .init and .fini
> : ELF sections).
> 
> Yes.  I was going to make that same point.  C++ static object init
> always happens, static or dynamic.  And has since FreeBSD has
> supported ELF...

OK, since there seems to be some objections, I'll withdraw
the proposition.  Other reasons may develop later on, but
I'll shelve the idea for now.

Thanks for everyone's input :)

-- 
Dan Eischen

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 17:04:18 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id BEA1237B404; Wed, 26 Mar 2003 17:04:18 -0800 (PST)
Received: from mail01.stbernard.com (mail01.stbernard.com [64.154.93.162])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id D173F43F93; Wed, 26 Mar 2003 17:04:17 -0800 (PST)
	(envelope-from wes@softweyr.com)
Received: from salty.rapid.stbernard.com ([192.168.4.61]) by
	mail01.stbernard.com with Microsoft SMTPSVC(5.0.2195.5329);
	Wed, 26 Mar 2003 17:04:17 -0800
From: Wes Peters <wes@softweyr.com>
Organization: Softweyr.com
To: John Baldwin <jhb@FreeBSD.org>
Date: Wed, 26 Mar 2003 17:04:17 -0800
User-Agent: KMail/1.5
References: <XFMail.20030326121325.jhb@FreeBSD.org>
In-Reply-To: <XFMail.20030326121325.jhb@FreeBSD.org>
X-Habeas-SWE-1: winter into spring
X-Habeas-SWE-2: brightly anticipated
X-Habeas-SWE-3: like Habeas SWE (tm)
X-Habeas-SWE-4: Copyright 2002 Habeas (tm)
X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this
X-Habeas-SWE-6: email in exchange for a license for this Habeas
X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant
X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this
X-Habeas-SWE-9: mark in spam to <http://www.habeas.com/report/>.   
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200303261704.17095.wes@softweyr.com>
X-OriginalArrivalTime: 27 Mar 2003 01:04:17.0489 (UTC)
	FILETIME=[CA861C10:01C2F3FC]
X-Spam-Status: No, hits=-25.4 required=5.0
	tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      RCVD_IN_UNCONFIRMED_DSBL,REFERENCES,REPLY_WITH_QUOTES,
	      USER_AGENT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: Poul-Henning Kamp <phk@phk.freebsd.dk>
cc: freebsd-arch@freebsd.org
Subject: Re: Patch to protect process from pageout killing
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Mar 2003 01:04:19 -0000

On Wednesday 26 March 2003 09:13, John Baldwin wrote:
> On 26-Mar-2003 Wes Peters wrote:
> > On Tuesday 25 March 2003 08:34, John Baldwin wrote:
> >> On 25-Mar-2003 Wes Peters wrote:
> >> > On Monday 24 March 2003 08:36, Poul-Henning Kamp wrote:
> >> >> Also, doesn't this result in the flag being inerited with
> >> >> fork() and thereby negating the effect you are seeking for
> >> >> squid ?
> >> >
> >> > I looked through all the places in kern_fork.c where p2->p_flag
> >> > gets set and didn't see anything that looked like it would
> >> > inherit P_PROTECTED from p1->p_flag.  Did I miss something?  I'm
> >> > obviously a bit of a neophyte in this part of the kernel.
> >>
> >> rlimit's are inherited.  However, due to a "feature" bug in your
> >> patch, the P_PROTECTED flag doesn't get turned on when the rlimit
> >> is inherited in fork1().
> >
> > feature bug?  If you mean the fact that the setting for P_PROTECTED
> > isn't stored in the rlimit, that was intentional.  rlimits are
> > inherited and I specifically didn't want that behavior, similar to
> > p_cpulimit.  I still agree resource limits are not an ideal
> > interface to use for this, I'll look further.
>
> I mean that you should be setting P_PROTECTED in fork() based on the
> inherited rlimit's since otherwise the value of the rlimit is out of
> sync with the P_PROTECTED flag.  Hence a bug.  However, since non-
> inheritance is the desired behavior, it is also a feature, hence
> "feature" bug.

Ah, actually it would be best to explicitly clear the RLIMIT_PROTECT in 
the rlimit, except the RLIMIT_PROTECT isn't stored in the rlimit.  Eww, 
that was not good.  Problem is, there isn't a generic syscall for 
munging proc items.  As I said, it was a less-than-optimal syscall to 
abuse, I'll go back to pondering madvise(2) or mprotect(2) which almost 
sort of make sense.

-- 
         "Where am I, and what am I doing in this handbasket?"

Wes Peters                                              wes@softweyr.com


From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 17:25:17 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 9DB6E37B404
	for <arch@freebsd.org>; Wed, 26 Mar 2003 17:25:17 -0800 (PST)
Received: from exchhz01.viatech.com.cn (ip-167-164-97-218.anlai.com
	[218.97.164.167])
	by mx1.FreeBSD.org (Postfix) with ESMTP id C07ED43F75
	for <arch@freebsd.org>; Wed, 26 Mar 2003 17:25:14 -0800 (PST)
	(envelope-from davidxu@freebsd.org)
Received: from davidw2k (ip-240-1-168-192.rev.dyxnet.com [192.168.1.240]) by
	exchhz01.viatech.com.cn with SMTP (Microsoft Exchange Internet Mail Service
	Version 5.5.2650.21)	id HLDQN88H; Thu, 27 Mar 2003 09:11:31 +0800
Message-ID: <006a01c2f3ff$e57cb300$f001a8c0@davidw2k>
From: "David Xu" <davidxu@freebsd.org>
To: "Jeff Roberson" <jroberson@chesapeake.net>
Date: Thu, 27 Mar 2003 09:26:31 +0800
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4807.1700
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4910.0300
X-Spam-Status: No, hits=-6.6 required=5.0
	tests=AWL,QUOTED_EMAIL_TEXT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
Subject: Re: 1:1 threading.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Mar 2003 01:25:20 -0000

After reading your 1:1 threading code, I think you needn't
hack current KSE code to build your own 1:1 threading code.
Our code allow you to do this, actully, it's my earlier
idea to let 1:1 be implemented in our M:N code base, but never
had told this to julian or others.
if you want to create a thread, you can always call kse_create
syscall(the name should be changed to another, for example
upcall_create), the newly created upcall will be scheduled a
kernel thread on it and return to userland thread stack and
thread function, if you want to implement 1:1, you can always
set kse_mailbox.km_curthread to NULL, this ensures that userland
stack always has fixed association with kernel thread stack,
for me, 1:1 only means userland stack has fixed association with
kernel thread stack, no more. however, code in kse_create should be
adjusted to allow NUPCALLS > NCPUS, this allows 1:1 mode to be
implemented. thr_exit can be implemented to use kse_exit, and maybe
a wrapper to for kse_exit.
For thr_kill, I think we may add an API to allow a kernel
thread be identified by using kse_mailbox pointer or something
similar.
By implementing 1:1 code in current M:N code base, benifit is very
clear, a ksegrp protects time quantum and threads priority, and if
you want to implement a system scope thread(I know pthread has this
requirement), just call kse_create with newgroup parameters is 1,
you will immediately get a system scope thread.
Yes, you may think that KSE progress is slow, but I'd like to think
harder before pushing some not well thinked code into kernel.
At least, when I am thinking about M:N, I am also thinking about
1:1, I guess some people like 1:1, and others like M:N, so more choise =
is good.

David Xu

----- Original Message -----=20
From: "Jeff Roberson" <jroberson@chesapeake.net>
To: <kse@elischer.org>
Sent: Wednesday, March 26, 2003 11:00 AM
Subject: 1:1 threading.


> I just sent a mail to arch@ about a parallel effort that you all may =
be
> interested in.  Please follow up there.
>=20
> Thanks,
> Jeff

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 18:00:30 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id D5BA237B404
	for <arch@FreeBSD.org>; Wed, 26 Mar 2003 18:00:30 -0800 (PST)
Received: from exchhz01.viatech.com.cn (ip-167-164-97-218.anlai.com
	[218.97.164.167])
	by mx1.FreeBSD.org (Postfix) with ESMTP id A86D643F3F
	for <arch@FreeBSD.org>; Wed, 26 Mar 2003 18:00:29 -0800 (PST)
	(envelope-from davidxu@freebsd.org)
Received: from davidw2k (ip-240-1-168-192.rev.dyxnet.com [192.168.1.240]) by
	exchhz01.viatech.com.cn with SMTP (Microsoft Exchange Internet Mail Service
	Version 5.5.2650.21)	id HLDQN9BF; Thu, 27 Mar 2003 09:46:42 +0800
Message-ID: <001d01c2f404$cffb3ab0$f001a8c0@davidw2k>
From: "David Xu" <davidxu@freebsd.org>
To: "Daniel Eischen" <eischen@vigrid.com>, <arch@FreeBSD.org>
References: <3E81F6BB.BFFE3F33@vigrid.com>
Date: Thu, 27 Mar 2003 10:01:42 +0800
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4807.1700
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4910.0300
X-Spam-Status: No, hits=-8.8 required=5.0
	tests=AWL,QUOTED_EMAIL_TEXT,REFERENCES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: kse@elischer.org
Subject: Re: Not providing static libraries (libkse/libpthread)
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Mar 2003 02:00:31 -0000

I'd like to see everything is dynamically linked and threaded
:-).

David Xu

----- Original Message -----=20
From: "Daniel Eischen" <eischen@vigrid.com>
To: <arch@FreeBSD.org>
Cc: <kse@elischer.org>
Sent: Thursday, March 27, 2003 2:51 AM
Subject: Not providing static libraries (libkse/libpthread)


> Is there a good reason for providing static libraries for
> libpthread/libkse?  I'd like to not support them to get
> rid of some hacks to make sure certain symbols are present
> in the static library case.
>=20
> --=20
> Dan Eischen
> _______________________________________________
> freebsd-arch@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-arch
> To unsubscribe, send any mail to =
"freebsd-arch-unsubscribe@freebsd.org"

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 23:17:27 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id A76E037B404; Wed, 26 Mar 2003 23:17:27 -0800 (PST)
Received: from mail.chesapeake.net (chesapeake.net [205.130.220.14])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 9193B43FA3; Wed, 26 Mar 2003 23:17:26 -0800 (PST)
	(envelope-from jroberson@chesapeake.net)
Received: from localhost (jroberson@localhost)
	by mail.chesapeake.net (8.11.6/8.11.6) with ESMTP id h2R7HPl97392;
	Thu, 27 Mar 2003 02:17:25 -0500 (EST)
	(envelope-from jroberson@chesapeake.net)
Date: Thu, 27 Mar 2003 02:17:25 -0500 (EST)
From: Jeff Roberson <jroberson@chesapeake.net>
To: David Xu <davidxu@freebsd.org>
In-Reply-To: <006a01c2f3ff$e57cb300$f001a8c0@davidw2k>
Message-ID: <20030327020402.T64602-100000@mail.chesapeake.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-17.0 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REPLY_WITH_QUOTES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
Subject: Re: 1:1 threading.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Mar 2003 07:17:29 -0000


On Thu, 27 Mar 2003, David Xu wrote:

> After reading your 1:1 threading code, I think you needn't
> hack current KSE code to build your own 1:1 threading code.
> Our code allow you to do this, actully, it's my earlier
> idea to let 1:1 be implemented in our M:N code base, but never
> had told this to julian or others.

It was actually done outside of KSE on purpose.  It keeps the API simpler
and cleaner.  It keeps the implementation cleaner.  It keeps it out of the
majority of the KSE code paths aside from thread_suspend and related
code.

I wanted something small and stable that built on top of KSE provided
primitives but did not actually use the KSE apis.  This makes it easier
for KSE to continue growing and changing while the 1:1 code remains
simple.  It also removes some of the cost associated with doing KSE.

> if you want to create a thread, you can always call kse_create
> syscall(the name should be changed to another, for example
> upcall_create), the newly created upcall will be scheduled a
> kernel thread on it and return to userland thread stack and
> thread function, if you want to implement 1:1, you can always
> set kse_mailbox.km_curthread to NULL, this ensures that userland
> stack always has fixed association with kernel thread stack,
> for me, 1:1 only means userland stack has fixed association with
> kernel thread stack, no more. however, code in kse_create should be
> adjusted to allow NUPCALLS > NCPUS, this allows 1:1 mode to be
> implemented. thr_exit can be implemented to use kse_exit, and maybe
> a wrapper to for kse_exit.
> For thr_kill, I think we may add an API to allow a kernel
> thread be identified by using kse_mailbox pointer or something
> similar.

I intend to keep thr_kill as is since it is the most simple and direct way
to acomplish the POSIX semantics.

> By implementing 1:1 code in current M:N code base, benifit is very
> clear, a ksegrp protects time quantum and threads priority, and if
> you want to implement a system scope thread(I know pthread has this
> requirement), just call kse_create with newgroup parameters is 1,
> you will immediately get a system scope thread.

I have considered using the ksegrp for this purpose but I view that as an
advanced feature not required to get threading off the ground.  I'm trying
to take reasonable steps that provide functionality all along the way.

> Yes, you may think that KSE progress is slow, but I'd like to think

KSE progress has been far too slow.  Many people are migrating away from
freebsd for other platforms due to our lack of threading.  This project
has been underway for a significant amount of time.  I did the 1:1
threading because I view this as the most reasonable way to get good
threading in a short period of time.

> harder before pushing some not well thinked code into kernel.

It is quite well thought out.  Given the track record of the KSE project I
actually take some offense to the suggestion that my code is not well
thought out in comparison.

> At least, when I am thinking about M:N, I am also thinking about
> 1:1, I guess some people like 1:1, and others like M:N, so more choise is good.
>
> David Xu

Yes, I think they both have their place.  I think we can support both in
the tree as well.  Hopefully the 1:1 code will address most users needs
until M:N is production ready.

Cheers,
Jeff

> ----- Original Message -----
> From: "Jeff Roberson" <jroberson@chesapeake.net>
> To: <kse@elischer.org>
> Sent: Wednesday, March 26, 2003 11:00 AM
> Subject: 1:1 threading.
>
>
> > I just sent a mail to arch@ about a parallel effort that you all may be
> > interested in.  Please follow up there.
> >
> > Thanks,
> > Jeff
>

From owner-freebsd-arch@FreeBSD.ORG  Wed Mar 26 23:43:55 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 7756937B404
	for <arch@freebsd.org>; Wed, 26 Mar 2003 23:43:55 -0800 (PST)
Received: from cirb503493.alcatel.com.au (c18609.belrs1.nsw.optusnet.com.au
	[210.49.80.204])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 37BBA43F85
	for <arch@freebsd.org>; Wed, 26 Mar 2003 23:43:54 -0800 (PST)
	(envelope-from peterjeremy@optushome.com.au)
Received: from cirb503493.alcatel.com.au (localhost.alcatel.com.au
	[127.0.0.1])h2R7hEM2019016;	Thu, 27 Mar 2003 18:43:15 +1100 (EST)
	(envelope-from jeremyp@cirb503493.alcatel.com.au)
Received: (from jeremyp@localhost)
	by cirb503493.alcatel.com.au (8.12.8/8.12.8/Submit) id h2R7hBXb019015;
	Thu, 27 Mar 2003 18:43:11 +1100 (EST)
Date: Thu, 27 Mar 2003 18:43:11 +1100
From: Peter Jeremy <peterjeremy@optushome.com.au>
To: Jeff Roberson <jroberson@chesapeake.net>
Message-ID: <20030327074311.GB18940@cirb503493.alcatel.com.au>
References: <20030326031245.O64602-100000@mail.chesapeake.net>
	<Pine.BSF.4.21.0303261154100.52134-100000@InterJet.elischer.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <Pine.BSF.4.21.0303261154100.52134-100000@InterJet.elischer.org>
User-Agent: Mutt/1.4.1i
X-Spam-Status: No, hits=-30.9 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REFERENCES,REPLY_WITH_QUOTES,USER_AGENT_MUTT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: kse@elischer.org
cc: Julian Elischer <julian@elischer.org>
Subject: Re: 1:1 Threading implementation.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Mar 2003 07:43:58 -0000

On Wed, Mar 26, 2003 at 12:30:52PM -0800, Julian Elischer wrote:
>On Wed, 26 Mar 2003, Jeff Roberson wrote:
>> First, if your application has more threads than cpus it is written
>> incorrectly.
>
>Not neccesarily. that's just one way of looking at threads. Active
>component threaded programs use threads as a programming model
>(see above) and it is a perfectly valid way of writing a program.

I'd go so far as to say that the only case where relating real CPUs
and threads matters is for compute-bound processes where the only
purpose of threading is to get >100% CPU.

If you consider an arbitrary server/daemon process, there are a
limited number of basic mechanisms you can use to handle more than
one client:
1) One (single-threaded) process per client (eg telnetd, sshd)
2) One process with one thread per client (possibly per direction)
3) One process explicitly using select()[*] to support multiple clients.

Each approach has its own advantages and disadvantages and each
approach requires different support code to handle new clients and
switching between clients.

Obviously, you can combine the approaches but this means you have the
support infrastructure for both basic mechanisms as well as additional
code to decide which mechanism to use.  Apache is a combination of 1
and 3 - but needs a process dedicated to distributing incoming requests.

In general, if you're going to go the effort of threading your server,
why go to the additional effort of adding a select() handler in each
thread?  The big advantage of 1 and 2 is that the core is very simple:
	while (!eof(input)) {
		read input
		do some processing
		write output
	}
whereas the core of 3 requires building and testing FD sets and
making sure that you only block in the select().  This generally
makes the code far less clear.  You can also potentially reduce
the overall throughput because there are multiple scheduling layers.

[*] For "select()", read "select() or poll() or kqueue()"

Peter

From owner-freebsd-arch@FreeBSD.ORG  Thu Mar 27 00:35:49 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id E695C37B401; Thu, 27 Mar 2003 00:35:49 -0800 (PST)
Received: from heron.mail.pas.earthlink.net (heron.mail.pas.earthlink.net
	[207.217.120.189])	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 42B4C43FBF; Thu, 27 Mar 2003 00:35:49 -0800 (PST)
	(envelope-from tlambert2@mindspring.com)
Received: from pool0033.cvx22-bradley.dialup.earthlink.net ([209.179.198.33]
	helo=mindspring.com)
	by heron.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128)
	(Exim 3.33 #1)	id 18ySrR-0006tb-00; Thu, 27 Mar 2003 00:35:46 -0800
Message-ID: <3E82B795.DDB0C6A4@mindspring.com>
Date: Thu, 27 Mar 2003 00:34:29 -0800
From: Terry Lambert <tlambert2@mindspring.com>
X-Mailer: Mozilla 4.79 [en] (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Jeff Roberson <jroberson@chesapeake.net>
References: <20030327020402.T64602-100000@mail.chesapeake.net>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a48c4399f109637f76819f1b3d16d0a6c2a8438e0f32a48e08350badd9bab72f9c350badd9bab72f9c
X-Spam-Status: No, hits=-21.4 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,QUOTED_EMAIL_TEXT,QUOTE_TWICE_1,
	      RCVD_IN_OSIRUSOFT_COM,REFERENCES,REPLY_WITH_QUOTES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
Subject: Re: 1:1 threading.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Mar 2003 08:36:01 -0000

Jeff Roberson wrote:
> On Thu, 27 Mar 2003, David Xu wrote:
> > After reading your 1:1 threading code, I think you needn't
> > hack current KSE code to build your own 1:1 threading code.
> > Our code allow you to do this, actully, it's my earlier
> > idea to let 1:1 be implemented in our M:N code base, but never
> > had told this to julian or others.
> 
> It was actually done outside of KSE on purpose.  It keeps the API simpler
> and cleaner.  It keeps the implementation cleaner.  It keeps it out of the
> majority of the KSE code paths aside from thread_suspend and related
> code.
> 
> I wanted something small and stable that built on top of KSE provided
> primitives but did not actually use the KSE apis.  This makes it easier
> for KSE to continue growing and changing while the 1:1 code remains
> simple.  It also removes some of the cost associated with doing KSE.

This isn't really a legitimate argument.

Specifically, if the primitives are incapable of supporting your
model, then the primitives need to be changed.

The main problem that needs to be overcome, in most cases, is that
historical designs preclude future work.

In this case, your code represents "future work" unanticipated by
the previous design.

Note that I do not necessarily take this position myself; I think
that your work could have proceedded with the KSEGRP per KSE
approach, as Julian and David have suggested, rather than the
single KSEGRP per process approach you chose.


> I intend to keep thr_kill as is since it is the most simple and direct way
> to acomplish the POSIX semantics.

"POSIX semantics" should always be considered a secondary
consideration, since the real intent is to allow *all* semantics
as a construction of the available semantics.

This is incredibly difficult, I know, but it's why smart people
win over stupid people, or average people, when it comes to
design issues.

We need to take the "best of breed" forward, and exclude the
rest (a genetic algorithm, but the best we can approximate at
this time).


> I have considered using the ksegrp for this purpose but I view that as an
> advanced feature not required to get threading off the ground.  I'm trying
> to take reasonable steps that provide functionality all along the way.

This is expediency.  Expediency really has no place in Open Source
design, since it doesn't really consider the consumers at all, it
considers (or is supposed to consider) only the problem space we
are talking about itself.  That really changes at the whim of
public opinion.


> > Yes, you may think that KSE progress is slow, but I'd like to think
> 
> KSE progress has been far too slow.

Yes.  This has to do with inefficiencies of mapping volunteerism
to what is considered (by some people) as "the right way".

There's really no approach to resolving this (right now) other
than "let the best implementation win".

If I'm only willing to work on what I consider "more ideal" to
mapping the problem space, then your volunteers are not willing
to be "managed" into specific implementation details.

--

Hopefully, in the future, this will change: we all want to live
in an ideal world.  At that point, it comes down to games theory,
in terms of communicating goals, and information theory, in terms
of communicating with other people about the desirability of
specific goals.

It's easy to cast this in terms of "war games", "mutual security
games", or "present politics of public opinon".  All you need is
a meta-perspective on the problems.

Personally, I'm happy to see forward progress without conflict;
in the present tense, this means that the KSE work gets used,
even if the use is not in line with the eventual design goals.

If this takes a 1:1 implementation to keep it from being diked
out, I don't care: it's a "drunkards walk" toward the final goal,
and so people should shut up about it, since, no matter how you
look at it, there's a net positive value to the work.

As such, I would like the KSEGRP per thread, instead of the
KSEGRP per process code to go forward.

-- Terry

From owner-freebsd-arch@FreeBSD.ORG  Thu Mar 27 01:17:09 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 8C94637B438
	for <arch@freebsd.org>; Thu, 27 Mar 2003 01:17:06 -0800 (PST)
Received: from mail.nsu.ru (mx.nsu.ru [193.124.215.71])
	by mx1.FreeBSD.org (Postfix) with ESMTP id EB90A440E7
	for <arch@freebsd.org>; Thu, 27 Mar 2003 01:05:20 -0800 (PST)
	(envelope-from fjoe@iclub.nsu.ru)
Received: from drweb by mail.nsu.ru with drweb-scanned (Exim 3.20 #1)
	id 18yTJY-0008Up-00; Thu, 27 Mar 2003 15:04:48 +0600
Received: from iclub.nsu.ru ([193.124.215.97] ident=root)
	by mail.nsu.ru with esmtp (Exim 3.20 #1)
	id 18yTJW-0008SZ-00; Thu, 27 Mar 2003 15:04:46 +0600
Received: from iclub.nsu.ru (fjoe@localhost [127.0.0.1])
	by iclub.nsu.ru (8.12.8/8.12.8) with ESMTP id h2R93Hj1009140;
	Thu, 27 Mar 2003 15:03:17 +0600 (NS)
	(envelope-from fjoe@iclub.nsu.ru)
Received: (from fjoe@localhost)
	by iclub.nsu.ru (8.12.8/8.12.8/Submit) id h2R93Eeo009136;
	Thu, 27 Mar 2003 15:03:15 +0600 (NS)
Date: Thu, 27 Mar 2003 15:03:14 +0600
From: Max Khon <fjoe@iclub.nsu.ru>
To: Terry Lambert <tlambert2@mindspring.com>
Message-ID: <20030327150313.A8897@iclub.nsu.ru>
References: <20030327020402.T64602-100000@mail.chesapeake.net>
	<3E82B795.DDB0C6A4@mindspring.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <3E82B795.DDB0C6A4@mindspring.com>;
	from tlambert2@mindspring.com on Thu, Mar 27, 2003 at 12:34:29AM -0800
X-Envelope-To: tlambert2@mindspring.com,
 jroberson@chesapeake.net,
 arch@freebsd.org
X-Spam-Status: No, hits=-33.1 required=5.0
	tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      QUOTE_TWICE_1,REFERENCES,REPLY_WITH_QUOTES,USER_AGENT_MUTT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
Subject: Re: 1:1 threading.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Mar 2003 09:17:22 -0000

hi, there!

On Thu, Mar 27, 2003 at 12:34:29AM -0800, Terry Lambert wrote:

> > > After reading your 1:1 threading code, I think you needn't
> > > hack current KSE code to build your own 1:1 threading code.
> > > Our code allow you to do this, actully, it's my earlier
> > > idea to let 1:1 be implemented in our M:N code base, but never
> > > had told this to julian or others.
> > 
> > It was actually done outside of KSE on purpose.  It keeps the API simpler
> > and cleaner.  It keeps the implementation cleaner.  It keeps it out of the
> > majority of the KSE code paths aside from thread_suspend and related
> > code.
> > 
> > I wanted something small and stable that built on top of KSE provided
> > primitives but did not actually use the KSE apis.  This makes it easier
> > for KSE to continue growing and changing while the 1:1 code remains
> > simple.  It also removes some of the cost associated with doing KSE.
> 
> This isn't really a legitimate argument.

Seconded. do you have numbers that clearly show that using Julian's approach
leads to serious performance penalty? Using KSE APIs is not that difficult
as far as I understand, so why we need to introduce more hacks?

/fjoe

From owner-freebsd-arch@FreeBSD.ORG  Thu Mar 27 02:24:17 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id BD5CB37B404
	for <arch@freebsd.org>; Thu, 27 Mar 2003 02:24:17 -0800 (PST)
Received: from mail.chesapeake.net (chesapeake.net [205.130.220.14])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 0586B43FB1
	for <arch@freebsd.org>; Thu, 27 Mar 2003 02:24:17 -0800 (PST)
	(envelope-from jroberson@chesapeake.net)
Received: from localhost (jroberson@localhost)
	by mail.chesapeake.net (8.11.6/8.11.6) with ESMTP id h2RAOGZ78047
	for <arch@freebsd.org>; Thu, 27 Mar 2003 05:24:16 -0500 (EST)
	(envelope-from jroberson@chesapeake.net)
Date: Thu, 27 Mar 2003 05:24:16 -0500 (EST)
From: Jeff Roberson <jroberson@chesapeake.net>
To: arch@freebsd.org
Message-ID: <20030327052055.X64602-100000@mail.chesapeake.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-7.8 required=5.0
	tests=AWL
	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Subject: Threading code review please.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Mar 2003 10:24:18 -0000

I'm going to reply to the threads on 1:1 vs M:N tomorrow.  I'd like to
request that people actually read the patch and give me feedback on the
code and not the approach.

I have no outstanding behavior problems with mozilla.  It actually runs
much faster now with libthr in place of libc_r.  On pages with LOTS of
images it scrolls much smoother.  I suspect its the amount of io waits.

Anyway, since this is coming together so well I'd like to get it commited
soon so people can start giving me bug reports.  I have another full days
worth of work to clear up the issues that I know of.  I'll probably post
the library source tomorrow.

Cheers,
Jeff

From owner-freebsd-arch@FreeBSD.ORG  Thu Mar 27 05:09:13 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 027A537B401
	for <arch@freebsd.org>; Thu, 27 Mar 2003 05:09:13 -0800 (PST)
Received: from mail.tcoip.com.br (erato.tco.net.br [200.220.254.10])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 5540943FCB
	for <arch@freebsd.org>; Thu, 27 Mar 2003 05:09:10 -0800 (PST)
	(envelope-from dcs@tcoip.com.br)
Received: from tcoip.com.br ([10.0.2.6])
	by mail.tcoip.com.br (8.11.6/8.11.6) with ESMTP id h2RD93911387;
	Thu, 27 Mar 2003 10:09:03 -0300
Message-ID: <3E82F7EE.6080802@tcoip.com.br>
Date: Thu, 27 Mar 2003 10:09:02 -0300
From: "Daniel C. Sobral" <dcs@tcoip.com.br>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3b) Gecko/20030326
X-Accept-Language: en-us, en, pt-br, ja
MIME-Version: 1.0
To: Jeff Roberson <jroberson@chesapeake.net>
References: <20030327052055.X64602-100000@mail.chesapeake.net>
In-Reply-To: <20030327052055.X64602-100000@mail.chesapeake.net>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, hits=-31.9 required=5.0
	tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES,
	      REPLY_WITH_QUOTES,USER_AGENT_MOZILLA_UA
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
Subject: Re: Threading code review please.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Mar 2003 13:09:14 -0000

Jeff Roberson wrote:
> I'm going to reply to the threads on 1:1 vs M:N tomorrow.  I'd like to
> request that people actually read the patch and give me feedback on the
> code and not the approach.
> 
> I have no outstanding behavior problems with mozilla.  It actually runs
> much faster now with libthr in place of libc_r.  On pages with LOTS of
> images it scrolls much smoother.  I suspect its the amount of io waits.

This is an SMP system you are talking about?

-- 
Daniel C. Sobral                   (8-DCS)
Gerencia de Operacoes
Divisao de Comunicacao de Dados
Coordenacao de Seguranca
TCO
Fones: 55-61-313-7654/Cel: 55-61-9618-0904
E-mail: Daniel.Capo@tco.net.br
         Daniel.Sobral@tcoip.com.br
         dcs@tcoip.com.br

Outros:
	dcs@newsguy.com
	dcs@freebsd.org
	capo@notorious.bsdconspiracy.net

"I'd love to go out with you, but the last time I went out, I never
came back."

From owner-freebsd-arch@FreeBSD.ORG  Thu Mar 27 05:13:42 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 02FF737B401
	for <arch@freebsd.org>; Thu, 27 Mar 2003 05:13:42 -0800 (PST)
Received: from mail.tcoip.com.br (erato.tco.net.br [200.220.254.10])
	by mx1.FreeBSD.org (Postfix) with ESMTP id DC0F843FA3
	for <arch@freebsd.org>; Thu, 27 Mar 2003 05:13:38 -0800 (PST)
	(envelope-from dcs@tcoip.com.br)
Received: from tcoip.com.br ([10.0.2.6])
	by mail.tcoip.com.br (8.11.6/8.11.6) with ESMTP id h2RD7i911343;
	Thu, 27 Mar 2003 10:07:44 -0300
Message-ID: <3E82F7A0.2020604@tcoip.com.br>
Date: Thu, 27 Mar 2003 10:07:44 -0300
From: "Daniel C. Sobral" <dcs@tcoip.com.br>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3b) Gecko/20030326
X-Accept-Language: en-us, en, pt-br, ja
MIME-Version: 1.0
To: Max Khon <fjoe@iclub.nsu.ru>
References: <20030327020402.T64602-100000@mail.chesapeake.net>
	<3E82B795.DDB0C6A4@mindspring.com> <20030327150313.A8897@iclub.nsu.ru>
In-Reply-To: <20030327150313.A8897@iclub.nsu.ru>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, hits=-31.9 required=5.0
	tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES,
	      REPLY_WITH_QUOTES,USER_AGENT_MOZILLA_UA
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
Subject: Re: 1:1 threading.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Mar 2003 13:13:43 -0000

Max Khon wrote:
> hi, there!
> 
> On Thu, Mar 27, 2003 at 12:34:29AM -0800, Terry Lambert wrote:
> 
> 
>>>>After reading your 1:1 threading code, I think you needn't
>>>>hack current KSE code to build your own 1:1 threading code.
>>>>Our code allow you to do this, actully, it's my earlier
>>>>idea to let 1:1 be implemented in our M:N code base, but never
>>>>had told this to julian or others.
>>>
>>>It was actually done outside of KSE on purpose.  It keeps the API simpler
>>>and cleaner.  It keeps the implementation cleaner.  It keeps it out of the
>>>majority of the KSE code paths aside from thread_suspend and related
>>>code.
>>>
>>>I wanted something small and stable that built on top of KSE provided
>>>primitives but did not actually use the KSE apis.  This makes it easier
>>>for KSE to continue growing and changing while the 1:1 code remains
>>>simple.  It also removes some of the cost associated with doing KSE.
>>
>>This isn't really a legitimate argument.
> 
> 
> Seconded. do you have numbers that clearly show that using Julian's approach
> leads to serious performance penalty? Using KSE APIs is not that difficult
> as far as I understand, so why we need to introduce more hacks?

As much as I'd prefer the 1:1 threading to use as much of the KSE code 
as possible, Jeff's decision wasn't related to performance issues.

What Jeff wanted to do is to _avoid_ using as much of the KSE API as 
possible so his code wouldn't get in the way of that API, with two 
obvious benefits:

1) Changes to that API (and there have been some in the past) won't 
affect his 1:1 threading code and, thus, won't upset real applications 
using that threading.

2) His 1:1 threading code won't slow down further KSE development nor 
influence any changes to the KSE API.

The reason I personally prefer otherwise is so that (1) above won't be 
true. Ie, any bugs or performance issues introduced in the KSE code 
*will* affect real applications, so that they can be detected and fixed.

-- 
Daniel C. Sobral                   (8-DCS)
Gerencia de Operacoes
Divisao de Comunicacao de Dados
Coordenacao de Seguranca
TCO
Fones: 55-61-313-7654/Cel: 55-61-9618-0904
E-mail: Daniel.Capo@tco.net.br
         Daniel.Sobral@tcoip.com.br
         dcs@tcoip.com.br

Outros:
	dcs@newsguy.com
	dcs@freebsd.org
	capo@notorious.bsdconspiracy.net

A lady stockholder quite hetera
Decided her fortune to bettera:
	On the floor, quite unclad,
	She successively had
Merrill Lynch, Pierce, Fenner, et cetera...

From owner-freebsd-arch@FreeBSD.ORG  Thu Mar 27 06:05:34 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id E08A137B401
	for <arch@freebsd.org>; Thu, 27 Mar 2003 06:05:34 -0800 (PST)
Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 3595443FBD
	for <arch@freebsd.org>; Thu, 27 Mar 2003 06:05:34 -0800 (PST)
	(envelope-from eischen@pcnet1.pcnet.com)
Received: from pcnet1.pcnet.com (localhost [127.0.0.1])
	by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h2RE5VBg004917;
	Thu, 27 Mar 2003 09:05:31 -0500 (EST)
Received: from localhost (eischen@localhost)h2RE5UFB004914;
	Thu, 27 Mar 2003 09:05:30 -0500 (EST)
Date: Thu, 27 Mar 2003 09:05:30 -0500 (EST)
From: Daniel Eischen <eischen@pcnet1.pcnet.com>
To: Jeff Roberson <jroberson@chesapeake.net>
In-Reply-To: <20030327052055.X64602-100000@mail.chesapeake.net>
Message-ID: <Pine.GSO.4.10.10303270904390.2224-100000@pcnet1.pcnet.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-25.4 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REPLY_WITH_QUOTES,USER_AGENT_PINE
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
Subject: Re: Threading code review please.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Mar 2003 14:05:35 -0000

On Thu, 27 Mar 2003, Jeff Roberson wrote:

> I'm going to reply to the threads on 1:1 vs M:N tomorrow.  I'd like to
> request that people actually read the patch and give me feedback on the
> code and not the approach.

As was said by others, I think you can do what you want with the
existing APIs.  I don't see a need for adding more.

-- 
Dan Eischen

From owner-freebsd-arch@FreeBSD.ORG  Thu Mar 27 07:04:00 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 36D6137B401
	for <arch@freebsd.org>; Thu, 27 Mar 2003 07:04:00 -0800 (PST)
Received: from fledge.watson.org (fledge.watson.org [204.156.12.50])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 68F1F43FAF
	for <arch@freebsd.org>; Thu, 27 Mar 2003 07:03:59 -0800 (PST)
	(envelope-from robert@fledge.watson.org)
Received: from fledge.watson.org (fledge.pr.watson.org [192.0.2.3])
	by fledge.watson.org (8.12.8/8.12.8) with SMTP id h2RF3ojK066871;
	Thu, 27 Mar 2003 10:03:50 -0500 (EST)
	(envelope-from robert@fledge.watson.org)
Date: Thu, 27 Mar 2003 10:03:50 -0500 (EST)
From: Robert Watson <rwatson@freebsd.org>
X-Sender: robert@fledge.watson.org
To: "Daniel C. Sobral" <dcs@tcoip.com.br>
In-Reply-To: <3E82F7EE.6080802@tcoip.com.br>
Message-ID: <Pine.NEB.3.96L.1030327100013.37107G-100000@fledge.watson.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-23.5 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REPLY_WITH_QUOTES,USER_AGENT_PINE
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
Subject: Re: Threading code review please.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Mar 2003 15:04:01 -0000


On Thu, 27 Mar 2003, Daniel C. Sobral wrote:

> Jeff Roberson wrote:
> > I'm going to reply to the threads on 1:1 vs M:N tomorrow.  I'd like to
> > request that people actually read the patch and give me feedback on the
> > code and not the approach.
> > 
> > I have no outstanding behavior problems with mozilla.  It actually runs
> > much faster now with libthr in place of libc_r.  On pages with LOTS of
> > images it scrolls much smoother.  I suspect its the amount of io waits.
> 
> This is an SMP system you are talking about? 

Both 1:1 and M:N threading will improve performance of interactive
applications if they spend any moderate amount of time I/O bound.  I've
noticed substantial performance differences between instances of
openoffice linked for libc_r and openoffice linked for linuxthreads --
serializing I/O operations substantially impacts throughput and
interactivty due to latency.  Try running the Linux-linked mozilla, the
FreeBSD libc_r mozilla, and the FreeBSD linuxthreads mozilla and see how
they compare. 

Robert N M Watson             FreeBSD Core Team, TrustedBSD Projects
robert@fledge.watson.org      Network Associates Laboratories


From owner-freebsd-arch@FreeBSD.ORG  Thu Mar 27 07:30:09 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id F258437B401
	for <arch@freebsd.org>; Thu, 27 Mar 2003 07:30:08 -0800 (PST)
Received: from magic.adaptec.com (magic-mail.adaptec.com [208.236.45.100])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 4B4CC43FDD
	for <arch@freebsd.org>; Thu, 27 Mar 2003 07:30:08 -0800 (PST)
	(envelope-from scott_long@btc.adaptec.com)
Received: from redfish.adaptec.com (redfish.adaptec.com [162.62.50.11])
	by magic.adaptec.com (8.11.6/8.11.6) with ESMTP id h2RFSKl28011;
	Thu, 27 Mar 2003 07:28:20 -0800
Received: from btc.btc.adaptec.com (btc.btc.adaptec.com [10.100.0.52])
	by redfish.adaptec.com (8.8.8+Sun/8.8.8) with ESMTP id HAA01187;
	Thu, 27 Mar 2003 07:28:59 -0800 (PST)
Received: from btc.adaptec.com (hollin [10.100.253.56])
	by btc.btc.adaptec.com (8.8.8+Sun/8.8.8) with ESMTP id IAA09159;
	Thu, 27 Mar 2003 08:28:51 -0700 (MST)
Message-ID: <3E8318B3.2020801@btc.adaptec.com>
Date: Thu, 27 Mar 2003 08:28:51 -0700
From: Scott Long <scott_long@btc.adaptec.com>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.2.1) Gecko/20030206
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: "Daniel C. Sobral" <dcs@tcoip.com.br>
References: <20030327020402.T64602-100000@mail.chesapeake.net>
	<3E82B795.DDB0C6A4@mindspring.com> <20030327150313.A8897@iclub.nsu.ru>
	<3E82F7A0.2020604@tcoip.com.br>
In-Reply-To: <3E82F7A0.2020604@tcoip.com.br>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, hits=-31.9 required=5.0
	tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES,
	      REPLY_WITH_QUOTES,USER_AGENT_MOZILLA_UA
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
Subject: Re: 1:1 threading.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Mar 2003 15:30:10 -0000

Daniel C. Sobral wrote:
> Max Khon wrote:
> 
>> hi, there!
>>
>> On Thu, Mar 27, 2003 at 12:34:29AM -0800, Terry Lambert wrote:
>>
>>
>>>>> After reading your 1:1 threading code, I think you needn't
>>>>> hack current KSE code to build your own 1:1 threading code.
>>>>> Our code allow you to do this, actully, it's my earlier
>>>>> idea to let 1:1 be implemented in our M:N code base, but never
>>>>> had told this to julian or others.
>>>>
>>>>
>>>> It was actually done outside of KSE on purpose.  It keeps the API 
>>>> simpler
>>>> and cleaner.  It keeps the implementation cleaner.  It keeps it out 
>>>> of the
>>>> majority of the KSE code paths aside from thread_suspend and related
>>>> code.
>>>>
>>>> I wanted something small and stable that built on top of KSE provided
>>>> primitives but did not actually use the KSE apis.  This makes it easier
>>>> for KSE to continue growing and changing while the 1:1 code remains
>>>> simple.  It also removes some of the cost associated with doing KSE.
>>>
>>>
>>> This isn't really a legitimate argument.
>>
>>
>>
>> Seconded. do you have numbers that clearly show that using Julian's 
>> approach
>> leads to serious performance penalty? Using KSE APIs is not that 
>> difficult
>> as far as I understand, so why we need to introduce more hacks?
> 
> 
> As much as I'd prefer the 1:1 threading to use as much of the KSE code 
> as possible, Jeff's decision wasn't related to performance issues.
> 
> What Jeff wanted to do is to _avoid_ using as much of the KSE API as 
> possible so his code wouldn't get in the way of that API, with two 
> obvious benefits:
> 
> 1) Changes to that API (and there have been some in the past) won't 
> affect his 1:1 threading code and, thus, won't upset real applications 
> using that threading.
> 
> 2) His 1:1 threading code won't slow down further KSE development nor 
> influence any changes to the KSE API.
> 
> The reason I personally prefer otherwise is so that (1) above won't be 
> true. Ie, any bugs or performance issues introduced in the KSE code 
> *will* affect real applications, so that they can be detected and fixed.
> 

Once 5-STABLE happens, users of 5.x can no longer be guinea pigs for KSE
development.  By keeping the 1:1 and M:N API's separate, KSE can
progress in 6-CURRENT until it is proven while still allowing MFC's to
5-STABLE to happen without too much pain.  Later on down the road when
KSE matures, or when we decide that 1:1 should really just be a special
case of M:N, we can look at addressing the above concerns and possibly
MFC'ing the results back to 5-STABLE.  But for now we need to allow for
5-STABLE to actually be usable and maintainable.

Scott

From owner-freebsd-arch@FreeBSD.ORG  Thu Mar 27 08:12:15 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 390C137B404
	for <arch@freebsd.org>; Thu, 27 Mar 2003 08:12:15 -0800 (PST)
Received: from h132-197-179-27.gte.com (h132-197-179-27.gte.com
	[132.197.179.27])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 0F2AF43F93
	for <arch@freebsd.org>; Thu, 27 Mar 2003 08:12:14 -0800 (PST)
	(envelope-from ak03@gte.com)
Received: from kanpc.gte.com (ak03@localhost [127.0.0.1])
	h2RGCCAi036101;	Thu, 27 Mar 2003 11:12:12 -0500 (EST)
	(envelope-from ak03@kanpc.gte.com)
Received: (from ak03@localhost)
	by kanpc.gte.com (8.12.8/8.12.8/Submit) id h2RGCCXq036100;
	Thu, 27 Mar 2003 11:12:12 -0500 (EST)
Date: Thu, 27 Mar 2003 11:12:12 -0500
From: Alexander Kabaev <ak03@gte.com>
To: Max Khon <fjoe@iclub.nsu.ru>
Message-Id: <20030327111212.13029dbf.ak03@gte.com>
In-Reply-To: <20030327150313.A8897@iclub.nsu.ru>
References: <20030327020402.T64602-100000@mail.chesapeake.net>
	<3E82B795.DDB0C6A4@mindspring.com>
	<20030327150313.A8897@iclub.nsu.ru>
Organization: Verizon Data Services
X-Mailer: Sylpheed version 0.8.11claws42 (GTK+ 1.2.10;
	i386-portbld-freebsd5.0)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, hits=-25.4 required=5.0
	tests=EMAIL_ATTRIBUTION,FROM_ENDS_IN_NUMS,IN_REP_TO,
	      QUOTED_EMAIL_TEXT,REFERENCES,REPLY_WITH_QUOTES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
Subject: Re: 1:1 threading.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Mar 2003 16:12:16 -0000

On Thu, 27 Mar 2003 15:03:14 +0600
Max Khon <fjoe@iclub.nsu.ru> wrote:

> Seconded. do you have numbers that clearly show that using Julian's
> approach leads to serious performance penalty? Using KSE APIs is not
> that difficult as far as I understand, so why we need to introduce
> more hacks?
> 

Disagreed. Using KSE APIs _is_ difficult. I think one of the ideas
behind 1:1 libth is to keep the code as simple as practical and
entangling it too strongly with KSE contradicts with that goal. I
certainly hope to see M:N threading project to come to completion in the
future, but keep in mind that the architecture this complex will
certainly take quite some time to mature and having a reliable fallback
option is good. If anything it will provide KSE people with something to
compare their implementation with.

-- 
Alexander Kabaev

From owner-freebsd-arch@FreeBSD.ORG  Thu Mar 27 08:46:43 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id D2A7537B401
	for <arch@freebsd.org>; Thu, 27 Mar 2003 08:46:43 -0800 (PST)
Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 32CA543F75
	for <arch@freebsd.org>; Thu, 27 Mar 2003 08:46:43 -0800 (PST)
	(envelope-from eischen@pcnet1.pcnet.com)
Received: from pcnet1.pcnet.com (localhost [127.0.0.1])
	by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h2RGkcBg027581;
	Thu, 27 Mar 2003 11:46:38 -0500 (EST)
Received: from localhost (eischen@localhost)h2RGkcDZ027578;
	Thu, 27 Mar 2003 11:46:38 -0500 (EST)
Date: Thu, 27 Mar 2003 11:46:38 -0500 (EST)
From: Daniel Eischen <eischen@pcnet1.pcnet.com>
To: Scott Long <scott_long@btc.adaptec.com>
In-Reply-To: <3E8318B3.2020801@btc.adaptec.com>
Message-ID: <Pine.GSO.4.10.10303271132290.25558-100000@pcnet1.pcnet.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-25.3 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REPLY_WITH_QUOTES,USER_AGENT_PINE
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
Subject: Re: 1:1 threading.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Mar 2003 16:46:44 -0000

On Thu, 27 Mar 2003, Scott Long wrote:
> Once 5-STABLE happens, users of 5.x can no longer be guinea pigs for KSE
> development.  By keeping the 1:1 and M:N API's separate, KSE can
> progress in 6-CURRENT until it is proven while still allowing MFC's to
> 5-STABLE to happen without too much pain.

That's kind of silly; we have other ways to keep API/ABI
compatability and have used this for all other syscalls.
The KSE and thread mailboxes even have version numbers
in them.

> Later on down the road when
> KSE matures, or when we decide that 1:1 should really just be a special
> case of M:N, we can look at addressing the above concerns and possibly
> MFC'ing the results back to 5-STABLE.  But for now we need to allow for
> 5-STABLE to actually be usable and maintainable.

The libthr implementation of 1:1 is not what most consider
1:1 -- you don't get a separate quantum and priority for
each thread.  As such, this library is really no different
than libkse.  The only real difference is that the UTS
chooses the next thread to run instead of the kernel.
If you're going to add a bunch of code to both userland
(in libthr) and the kernel just to get a working threading
library, it seems much easier to just fix libkse so that
it works for the single KSE/KSEG case.

-- 
Dan Eischen

From owner-freebsd-arch@FreeBSD.ORG  Thu Mar 27 08:59:10 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 646AA37B401
	for <arch@freebsd.org>; Thu, 27 Mar 2003 08:59:10 -0800 (PST)
Received: from boromir.vpop.net (dns1.vpop.net [207.178.248.2])
	by mx1.FreeBSD.org (Postfix) with ESMTP id E057843F93
	for <arch@freebsd.org>; Thu, 27 Mar 2003 08:59:09 -0800 (PST)
	(envelope-from mreimer@vpop.net)
Received: from vpop.net (bilbo.vpop.net [65.103.33.41])
	by boromir.vpop.net (Postfix) with ESMTP id 164083A6394
	for <arch@freebsd.org>; Thu, 27 Mar 2003 08:59:08 -0800 (PST)
Message-ID: <3E832E39.7040306@vpop.net>
Date: Thu, 27 Mar 2003 11:00:41 -0600
From: Matthew Reimer <mreimer@vpop.net>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3b) Gecko/20030220
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: arch@freebsd.org
References: <lists.freebsd.arch.3E82F7EE.6080802@tcoip.com.br>
	<lists.freebsd.arch.Pine.NEB.3.96L.1030327100013.37107G-100000@fledge.watson.org>
In-Reply-To: <lists.freebsd.arch.Pine.NEB.3.96L.1030327100013.37107G-100000@fledge.watson.org>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, hits=-31.9 required=5.0
	tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,REFERENCES,
	      REPLY_WITH_QUOTES,USER_AGENT_MOZILLA_UA
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Subject: Re: Threading code review please.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Mar 2003 16:59:15 -0000

Robert Watson wrote:
> 
> Both 1:1 and M:N threading will improve performance of interactive
> applications if they spend any moderate amount of time I/O bound.  I've
> noticed substantial performance differences between instances of
> openoffice linked for libc_r and openoffice linked for linuxthreads --
> serializing I/O operations substantially impacts throughput and
> interactivty due to latency.  Try running the Linux-linked mozilla, the
> FreeBSD libc_r mozilla, and the FreeBSD linuxthreads mozilla and see how
> they compare. 

Where can one find a FreeBSD linuxthreads mozilla?

Matt

From owner-freebsd-arch@FreeBSD.ORG  Thu Mar 27 10:05:40 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 3545837B401
	for <arch@freebsd.org>; Thu, 27 Mar 2003 10:05:40 -0800 (PST)
Received: from mail.tcoip.com.br (erato.tco.net.br [200.220.254.10])
	by mx1.FreeBSD.org (Postfix) with ESMTP id B3EF643FBD
	for <arch@freebsd.org>; Thu, 27 Mar 2003 10:05:29 -0800 (PST)
	(envelope-from dcs@tcoip.com.br)
Received: from tcoip.com.br ([10.0.2.6])
	by mail.tcoip.com.br (8.11.6/8.11.6) with ESMTP id h2RI5H919221
	for <arch@freebsd.org>; Thu, 27 Mar 2003 15:05:17 -0300
Message-ID: <3E833D5D.10200@tcoip.com.br>
Date: Thu, 27 Mar 2003 15:05:17 -0300
From: "Daniel C. Sobral" <dcs@tcoip.com.br>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3b) Gecko/20030326
X-Accept-Language: en-us, en, pt-br, ja
MIME-Version: 1.0
To: arch@freebsd.org
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, hits=-18.9 required=5.0
	tests=AWL,USER_AGENT_MOZILLA_UA
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Subject: 1-1 threading -- it seems to me...
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Mar 2003 18:05:42 -0000

Well, the RE seems to be firmly behind a production-level 1:1 threading 
implementation for 5.x, and leaving M:N development to 6.x, to be merged 
when and if things look promising.

Jeff has developed a 1:1 threading which avoids using much of existing 
KSE API.

KSE people would prefer to see a solution with much more integration.

Gentlemen, it seems to me this is a classic case of coding speaking 
louder than words.

-- 
Daniel C. Sobral                   (8-DCS)
Gerencia de Operacoes
Divisao de Comunicacao de Dados
Coordenacao de Seguranca
TCO
Fones: 55-61-313-7654/Cel: 55-61-9618-0904
E-mail: Daniel.Capo@tco.net.br
         Daniel.Sobral@tcoip.com.br
         dcs@tcoip.com.br

Outros:
	dcs@newsguy.com
	dcs@freebsd.org
	capo@notorious.bsdconspiracy.net

ARMADILLO:
	To provide weapons to a Spanish pickle.

From owner-freebsd-arch@FreeBSD.ORG  Thu Mar 27 11:42:11 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id EAB7E37B401
	for <arch@freebsd.org>; Thu, 27 Mar 2003 11:42:11 -0800 (PST)
Received: from rootlabs.com (root.org [67.118.192.226])
	by mx1.FreeBSD.org (Postfix) with SMTP id 1AFF743FBF
	for <arch@freebsd.org>; Thu, 27 Mar 2003 11:42:11 -0800 (PST)
	(envelope-from nate@rootlabs.com)
Received: (qmail 30059 invoked by uid 1000); 27 Mar 2003 19:42:11 -0000
Date: Thu, 27 Mar 2003 11:42:11 -0800 (PST)
From: Nate Lawson <nate@root.org>
To: arch@freebsd.org, current@freebsd.org
Message-ID: <Pine.BSF.4.21.0303271138130.30056-100000@root.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-13.5 required=5.0
	tests=AWL,USER_AGENT_PINE
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Subject: 5.x locking plan
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Mar 2003 19:42:13 -0000

My curiousity has overcome my fear of the bikeshed so I'll ask the
question that has been bugging me for a while.  Why haven't we gone
through the tree and created a lock for each spl and then converted every
spl call into the appropriate mtx_lock call?  At that point, we can mark
large sections of the tree giant-free and then make the locking data-based
(instead of code-based) one section at a time.  This is the approach
Solaris took.

-Nate

From owner-freebsd-arch@FreeBSD.ORG  Thu Mar 27 11:51:08 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 07EF137B401
	for <arch@freebsd.org>; Thu, 27 Mar 2003 11:51:08 -0800 (PST)
Received: from mail.chesapeake.net (chesapeake.net [205.130.220.14])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 2BF6043F75
	for <arch@freebsd.org>; Thu, 27 Mar 2003 11:51:07 -0800 (PST)
	(envelope-from jroberson@chesapeake.net)
Received: from localhost (jroberson@localhost)
	by mail.chesapeake.net (8.11.6/8.11.6) with ESMTP id h2RJovA89523;
	Thu, 27 Mar 2003 14:50:57 -0500 (EST)
	(envelope-from jroberson@chesapeake.net)
Date: Thu, 27 Mar 2003 14:50:56 -0500 (EST)
From: Jeff Roberson <jroberson@chesapeake.net>
To: Daniel Eischen <eischen@pcnet1.pcnet.com>
In-Reply-To: <Pine.GSO.4.10.10303271132290.25558-100000@pcnet1.pcnet.com>
Message-ID: <20030327143259.I64602-100000@mail.chesapeake.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-17.2 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      QUOTE_TWICE_1,REPLY_WITH_QUOTES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: Scott Long <scott_long@btc.adaptec.com>
Subject: Re: 1:1 threading.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Mar 2003 19:51:09 -0000

On Thu, 27 Mar 2003, Daniel Eischen wrote:

> On Thu, 27 Mar 2003, Scott Long wrote:
> > Once 5-STABLE happens, users of 5.x can no longer be guinea pigs for KSE
> > development.  By keeping the 1:1 and M:N API's separate, KSE can
> > progress in 6-CURRENT until it is proven while still allowing MFC's to
> > 5-STABLE to happen without too much pain.
>
> That's kind of silly; we have other ways to keep API/ABI
> compatability and have used this for all other syscalls.
> The KSE and thread mailboxes even have version numbers
> in them.

Which means they are likely to change.  I do not want to develop on
unstable APIs and unstable kernel code.  kern_thr.c is 254 lines.  I think
we can handle a little duplication.  I'm not sure why the objection is so
strong.

>
> > Later on down the road when
> > KSE matures, or when we decide that 1:1 should really just be a special
> > case of M:N, we can look at addressing the above concerns and possibly
> > MFC'ing the results back to 5-STABLE.  But for now we need to allow for
> > 5-STABLE to actually be usable and maintainable.
>
> The libthr implementation of 1:1 is not what most consider
> 1:1 -- you don't get a separate quantum and priority for
> each thread.  As such, this library is really no different
> than libkse.  The only real difference is that the UTS
> chooses the next thread to run instead of the kernel.
> If you're going to add a bunch of code to both userland
> (in libthr) and the kernel just to get a working threading
> library, it seems much easier to just fix libkse so that
> it works for the single KSE/KSEG case.

It didn't seem much easier to me.

This whole argument about kseg/kse/thread vs kse/thread can be solved very
easily by allocating a ksegrp in kern_thr.c  I estimate that would add
another 10 lines of code.

The ksegrp argument is questionable anyway.  In both ULE and 4bds each KSE
gets its own quantum.  The KSEGRP holds the static priority and the
dynamic user priority which is calculated based on the behavior of the
whole process.  This causes all threads in the process to be penalized for
using cpu at the same rate as a single threaded process using an
equivalent amount of cpu would be.

The effects are less because each thread/kse is given as big of a quantum
as each full process would.  I'm not sure if this is a bug or a feature.

In my opnion the ksegrp is not totally hashed out. I think you may forget
that I have done a fair amount of work on schedulers in freebsd and I do
understand the ramification of the decision that I made.  I do not think
this at all important to have correct prior to having real users using
real threads.

Cheers,
Jeff

From owner-freebsd-arch@FreeBSD.ORG  Thu Mar 27 12:09:53 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 4AD1537B401
	for <arch@freebsd.org>; Thu, 27 Mar 2003 12:09:53 -0800 (PST)
Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 98B8843FA3
	for <arch@freebsd.org>; Thu, 27 Mar 2003 12:09:52 -0800 (PST)
	(envelope-from eischen@pcnet1.pcnet.com)
Received: from pcnet1.pcnet.com (localhost [127.0.0.1])
	by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h2RK9nBg026712;
	Thu, 27 Mar 2003 15:09:49 -0500 (EST)
Received: from localhost (eischen@localhost)h2RK9m7h026709;
	Thu, 27 Mar 2003 15:09:48 -0500 (EST)
Date: Thu, 27 Mar 2003 15:09:48 -0500 (EST)
From: Daniel Eischen <eischen@pcnet1.pcnet.com>
To: Jeff Roberson <jroberson@chesapeake.net>
In-Reply-To: <20030327143259.I64602-100000@mail.chesapeake.net>
Message-ID: <Pine.GSO.4.10.10303271456430.24745-100000@pcnet1.pcnet.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-25.6 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      QUOTE_TWICE_1,REPLY_WITH_QUOTES,USER_AGENT_PINE
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: Scott Long <scott_long@btc.adaptec.com>
Subject: Re: 1:1 threading.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Mar 2003 20:09:54 -0000

On Thu, 27 Mar 2003, Jeff Roberson wrote:

> On Thu, 27 Mar 2003, Daniel Eischen wrote:
> 
> > On Thu, 27 Mar 2003, Scott Long wrote:
> > > Once 5-STABLE happens, users of 5.x can no longer be guinea pigs for KSE
> > > development.  By keeping the 1:1 and M:N API's separate, KSE can
> > > progress in 6-CURRENT until it is proven while still allowing MFC's to
> > > 5-STABLE to happen without too much pain.
> >
> > That's kind of silly; we have other ways to keep API/ABI
> > compatability and have used this for all other syscalls.
> > The KSE and thread mailboxes even have version numbers
> > in them.
> 
> Which means they are likely to change.  I do not want to develop on
> unstable APIs and unstable kernel code.  kern_thr.c is 254 lines.  I think
> we can handle a little duplication.  I'm not sure why the objection is so
> strong.

I don't see kse_create() changing since it takes a
mailbox pointer as an argument and you can theoretically
hang anything off the [versioned] mailbox.

> > > Later on down the road when
> > > KSE matures, or when we decide that 1:1 should really just be a special
> > > case of M:N, we can look at addressing the above concerns and possibly
> > > MFC'ing the results back to 5-STABLE.  But for now we need to allow for
> > > 5-STABLE to actually be usable and maintainable.
> >
> > The libthr implementation of 1:1 is not what most consider
> > 1:1 -- you don't get a separate quantum and priority for
> > each thread.  As such, this library is really no different
> > than libkse.  The only real difference is that the UTS
> > chooses the next thread to run instead of the kernel.
> > If you're going to add a bunch of code to both userland
> > (in libthr) and the kernel just to get a working threading
> > library, it seems much easier to just fix libkse so that
> > it works for the single KSE/KSEG case.
> 
> It didn't seem much easier to me.

For the single KSE/KSEG case it's almost there.  There
are just a couple of issues involving signals and some
bugs.  It's basically libc_r with the UTS swapped out for
a KSE-one.

I haven't spent any time on it because I wanted to come
at it from a different angle; rewriting it with KSE/KSEGs
in mind instead of just porting it.

> This whole argument about kseg/kse/thread vs kse/thread can be solved very
> easily by allocating a ksegrp in kern_thr.c  I estimate that would add
> another 10 lines of code.
> 
> The ksegrp argument is questionable anyway.  In both ULE and 4bds each KSE
> gets its own quantum.  The KSEGRP holds the static priority and the
> dynamic user priority which is calculated based on the behavior of the
> whole process.  This causes all threads in the process to be penalized for
> using cpu at the same rate as a single threaded process using an
> equivalent amount of cpu would be.

That wasn't my understanding of how KSE's were suppose to work.
The orignal idea was that the quantum and priorities were suppose
to be in the KSE Group.  Yes, two KSEs could get scheduled
simulataneously on different CPUs and consume 2 quantums, but
the KSE Group would get charged for both causing them to run
less often.  Or something like that.  In effect, over time
2 KSEs in a group would get no more processor time than a
non-KSEd process (all other things being equal).

I originally argued that it didn't make sense to have both
a KSE group and a KSE; that they could be one and the same.
I lost the argument :-)

-- 
Dan Eischen

From owner-freebsd-arch@FreeBSD.ORG  Thu Mar 27 12:21:35 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 4CC6337B401; Thu, 27 Mar 2003 12:21:35 -0800 (PST)
Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id B522E43F93; Thu, 27 Mar 2003 12:21:34 -0800 (PST)
	(envelope-from dillon@apollo.backplane.com)
Received: from apollo.backplane.com (localhost [127.0.0.1])
	by apollo.backplane.com (8.12.8/8.12.6) with ESMTP id h2RKLY31049841;
	Thu, 27 Mar 2003 12:21:34 -0800 (PST)
	(envelope-from dillon@apollo.backplane.com)
Received: (from dillon@localhost)
	by apollo.backplane.com (8.12.8/8.12.6/Submit) id h2RKLYo8049840;
	Thu, 27 Mar 2003 12:21:34 -0800 (PST)
Date: Thu, 27 Mar 2003 12:21:34 -0800 (PST)
From: Matthew Dillon <dillon@apollo.backplane.com>
Message-Id: <200303272021.h2RKLYo8049840@apollo.backplane.com>
To: Nate Lawson <nate@root.org>
References: <Pine.BSF.4.21.0303271138130.30056-100000@root.org>
X-Spam-Status: No, hits=-7.1 required=5.0
	tests=AWL,REFERENCES
	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: current@freebsd.org
Subject: Re: 5.x locking plan
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Mar 2003 20:21:37 -0000


:My curiousity has overcome my fear of the bikeshed so I'll ask the
:question that has been bugging me for a while.  Why haven't we gone
:through the tree and created a lock for each spl and then converted every
:spl call into the appropriate mtx_lock call?  At that point, we can mark
:large sections of the tree giant-free and then make the locking data-based
:(instead of code-based) one section at a time.  This is the approach
:Solaris took.
:
:-Nate

    The problem is that SPLs are per-thread masks, and different sets of
    bits can be added or removed from the master mask in any order and at
    any time.  There is no direct translation to a mutex (which cannot
    be obtained in random order, is not per-thread, and may result in 
    preemption or a context switch).

    Most of the code locked under Giant assumes the single-threading of
    kernel threads regardless of the SPL.  This 'inherent' single threading
    is one the reasons why the original code was so efficient.
    Since preemption can occur now under many new circumstances, including 
    when 'normal' (non-spin) mutexes are used to replace prior uses of SPLs
    (which could not cause thread level preemption)...  well, it basically
    means there is no easy way to remove Giant short of going through every
    bit of code and fixing it one subsystem at a time.

    Giant itself is a special case.  It is not a normal mutex.  Instead, the
    kernel very carefully saves and restores the state of Giant on a 
    per-thread basis so programs don't 'need to know' whether Giant is being
    held or not and so Giant can be held in combination with another mutex
    without violating the basic 'only one mutex can be held when going to
    sleep' rule.

					-Matt
					Matthew Dillon 
					<dillon@backplane.com>

From owner-freebsd-arch@FreeBSD.ORG  Thu Mar 27 14:49:56 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 8A00237B401
	for <arch@freebsd.org>; Thu, 27 Mar 2003 14:49:56 -0800 (PST)
Received: from duke.cs.duke.edu (duke.cs.duke.edu [152.3.140.1])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 4FD5843F75
	for <arch@freebsd.org>; Thu, 27 Mar 2003 14:49:55 -0800 (PST)
	(envelope-from gallatin@cs.duke.edu)
Received: from grasshopper.cs.duke.edu (grasshopper.cs.duke.edu
	[152.3.145.30])
	by duke.cs.duke.edu (8.12.8/8.12.8) with ESMTP id h2RMnsRv011567
	(version=TLSv1/SSLv3 cipher=EDH-RSA-DES-CBC3-SHA bits=168 verify=NO);
	Thu, 27 Mar 2003 17:49:54 -0500 (EST)
Received: (from gallatin@localhost)
	by grasshopper.cs.duke.edu (8.11.6/8.9.1) id h2RMnnY18612;
	Thu, 27 Mar 2003 17:49:49 -0500 (EST)
	(envelope-from gallatin@cs.duke.edu)
From: Andrew Gallatin <gallatin@cs.duke.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <16003.32780.950519.931661@grasshopper.cs.duke.edu>
Date: Thu, 27 Mar 2003 17:49:48 -0500 (EST)
To: Daniel Eischen <eischen@pcnet1.pcnet.com>
In-Reply-To: <Pine.GSO.4.10.10303271456430.24745-100000@pcnet1.pcnet.com>
References: <20030327143259.I64602-100000@mail.chesapeake.net>
	<Pine.GSO.4.10.10303271456430.24745-100000@pcnet1.pcnet.com>
X-Mailer: VM 6.75 under 21.1 (patch 12) "Channel Islands" XEmacs Lucid
X-Spam-Status: No, hits=-22.9 required=5.0
	tests=EMAIL_ATTRIBUTION,IN_REP_TO,REFERENCES,REPLY_WITH_QUOTES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
Subject: Re: 1:1 threading.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
Reply-To: arch@freebsd.org
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Mar 2003 22:49:57 -0000


Daniel Eischen writes:
 > On Thu, 27 Mar 2003, Jeff Roberson wrote:
 > 
 > > On Thu, 27 Mar 2003, Daniel Eischen wrote:
 > > 
 > > Which means they are likely to change.  I do not want to develop on
 > > unstable APIs and unstable kernel code.  kern_thr.c is 254 lines.  I think
 > > we can handle a little duplication.  I'm not sure why the objection is so
 > > strong.
 > 
 > I don't see kse_create() changing since it takes a
 > mailbox pointer as an argument and you can theoretically
 > hang anything off the [versioned] mailbox.

According to the 5-stable roadmap at 
	  http://www.freebsd.org/doc/en/articles/5-roadmap/major-issues.html

   KSE kernel and userland components must be functionality complete
   by June 2003 in order to be included in the RELENG_5 branch. For
   security and stability reasons, if KSE cannot be finished in time
   then, by default, all KSE-specific syscalls should be modified to
   return ENOSYS and all other KSE-specific interfaces disabled.

By not depending on KSE infastructure, the 1:1 can still be available
in 5.1 in exactly the same fore regardless of whether or not KSE makes
the June deadline or not.

Drew

From owner-freebsd-arch@FreeBSD.ORG  Thu Mar 27 15:19:46 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 3B92E37B401
	for <arch@freebsd.org>; Thu, 27 Mar 2003 15:19:46 -0800 (PST)
Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 7D97E4400E
	for <arch@freebsd.org>; Thu, 27 Mar 2003 15:19:45 -0800 (PST)
	(envelope-from eischen@pcnet1.pcnet.com)
Received: from pcnet1.pcnet.com (localhost [127.0.0.1])
	by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h2RNJiBg024884
	for <arch@freebsd.org>; Thu, 27 Mar 2003 18:19:44 -0500 (EST)
Received: from localhost (eischen@localhost)h2RNJiha024881
	for <arch@freebsd.org>; Thu, 27 Mar 2003 18:19:44 -0500 (EST)
Date: Thu, 27 Mar 2003 18:19:44 -0500 (EST)
From: Daniel Eischen <eischen@pcnet1.pcnet.com>
To: arch@freebsd.org
In-Reply-To: <16003.32780.950519.931661@grasshopper.cs.duke.edu>
Message-ID: <Pine.GSO.4.10.10303271818110.24399-100000@pcnet1.pcnet.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-25.7 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      QUOTE_TWICE_1,REPLY_WITH_QUOTES,USER_AGENT_PINE
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Subject: Re: 1:1 threading.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Mar 2003 23:19:47 -0000

On Thu, 27 Mar 2003, Andrew Gallatin wrote:

> 
> Daniel Eischen writes:
>  > On Thu, 27 Mar 2003, Jeff Roberson wrote:
>  > 
>  > > On Thu, 27 Mar 2003, Daniel Eischen wrote:
>  > > 
>  > > Which means they are likely to change.  I do not want to develop on
>  > > unstable APIs and unstable kernel code.  kern_thr.c is 254 lines.  I think
>  > > we can handle a little duplication.  I'm not sure why the objection is so
>  > > strong.
>  > 
>  > I don't see kse_create() changing since it takes a
>  > mailbox pointer as an argument and you can theoretically
>  > hang anything off the [versioned] mailbox.
> 
> According to the 5-stable roadmap at 
> 	  http://www.freebsd.org/doc/en/articles/5-roadmap/major-issues.html
> 
>    KSE kernel and userland components must be functionality complete
>    by June 2003 in order to be included in the RELENG_5 branch. For
>    security and stability reasons, if KSE cannot be finished in time
>    then, by default, all KSE-specific syscalls should be modified to
>    return ENOSYS and all other KSE-specific interfaces disabled.

This sounds like an argument to use the KSE syscalls :-)
If libthr is based on KSE and it works, then you've accomplished
the above.

-- 
Dan Eischen

From owner-freebsd-arch@FreeBSD.ORG  Thu Mar 27 15:49:05 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 3E8A237B401
	for <arch@freebsd.org>; Thu, 27 Mar 2003 15:49:05 -0800 (PST)
Received: from magic.adaptec.com (magic-mail.adaptec.com [208.236.45.100])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 9ACCD43FBD
	for <arch@freebsd.org>; Thu, 27 Mar 2003 15:49:04 -0800 (PST)
	(envelope-from scott_long@btc.adaptec.com)
Received: from redfish.adaptec.com (redfish.adaptec.com [162.62.50.11])
	by magic.adaptec.com (8.11.6/8.11.6) with ESMTP id h2RNm6l00449;
	Thu, 27 Mar 2003 15:48:06 -0800
Received: from btc.btc.adaptec.com (btc.btc.adaptec.com [10.100.0.52])
	by redfish.adaptec.com (8.8.8+Sun/8.8.8) with ESMTP id PAA25978;
	Thu, 27 Mar 2003 15:48:57 -0800 (PST)
Received: from btc.adaptec.com (hollin [10.100.253.56])
	by btc.btc.adaptec.com (8.8.8+Sun/8.8.8) with ESMTP id QAA09390;
	Thu, 27 Mar 2003 16:48:54 -0700 (MST)
Message-ID: <3E838D57.4050305@btc.adaptec.com>
Date: Thu, 27 Mar 2003 16:46:31 -0700
From: Scott Long <scott_long@btc.adaptec.com>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.2b) Gecko/20021216
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Daniel Eischen <eischen@pcnet1.pcnet.com>
References: <Pine.GSO.4.10.10303271818110.24399-100000@pcnet1.pcnet.com>
In-Reply-To: <Pine.GSO.4.10.10303271818110.24399-100000@pcnet1.pcnet.com>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, hits=-31.9 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REFERENCES,REPLY_WITH_QUOTES,USER_AGENT_MOZILLA_UA
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
Subject: Re: 1:1 threading.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Mar 2003 23:49:06 -0000

Daniel Eischen wrote:

> On Thu, 27 Mar 2003, Andrew Gallatin wrote:
>
>
> >Daniel Eischen writes:
> > > On Thu, 27 Mar 2003, Jeff Roberson wrote:
> > >
> > > > On Thu, 27 Mar 2003, Daniel Eischen wrote:
> > > >
> > > > Which means they are likely to change.  I do not want to develop on
> > > > unstable APIs and unstable kernel code.  kern_thr.c is 254 
> lines.  I think
> > > > we can handle a little duplication.  I'm not sure why the 
> objection is so
> > > > strong.
> > >
> > > I don't see kse_create() changing since it takes a
> > > mailbox pointer as an argument and you can theoretically
> > > hang anything off the [versioned] mailbox.
> >
> >According to the 5-stable roadmap at
> >	  http://www.freebsd.org/doc/en/articles/5-roadmap/major-issues.html
> >
> >   KSE kernel and userland components must be functionality complete
> >   by June 2003 in order to be included in the RELENG_5 branch. For
> >   security and stability reasons, if KSE cannot be finished in time
> >   then, by default, all KSE-specific syscalls should be modified to
> >   return ENOSYS and all other KSE-specific interfaces disabled.
>
>
> This sounds like an argument to use the KSE syscalls :-)
> If libthr is based on KSE and it works, then you've accomplished
> the above.
>
The 5-stable roadmap document was written before, and without any
knowledge of, Jeff's work.  The purpose of the above paragraph was
to define a deadline for the threading work to be done.  With the
advent of libthr, there is no longer pressure for the KSE kernel and
userland components to be complete for RELENG_5, so the quoted
paragraph can be relaxed a little bit.  I'm not sure if I would feel
comfortable shipping a release with the KSE syscalls turned on but
no libkse to interact with them, but that can be discussed further.

The bigger picture is that libthr is at the point now that I wanted
libkse to be at in 3 months.  Some may be grumpy and feel that libthr
has subverted libkse, however I'd like to remind everyone that Jon and
Jeff were under no contractual obligation to work on libkse.  Their
work does, however, solve the pressing need for a working threading
library for 5-STABLE.  If someone wants to pick up the torch for
libkse and finish it by the June deadline, I would be thrilled.
Otherwise, I'm happy with libthr for 5-STABLE, and I encourage people
to finish libkse and M:N for 6.0.

As for the arguments of libthr creating new syscalls, I'll point out
that KSE and libkse have not yet run any real-world applications, and
therefore are hard to even consider as 'alpha' quality.  It's foolish
to assume that the KSE interfaces will not change as KSE matures.
If libthr is tied to the current interface, it creates a maintenance
nightmare as the interface changes, especially for the RELENG_5
branch.  I realize that it also creates baggage once libkse is the
default, but that can be solved by deprecating the libthr interfaces
for 6.x and removing them for 7.x.  It's a small price to pay for such
a huge benefit.

Scott

From owner-freebsd-arch@FreeBSD.ORG  Thu Mar 27 17:19:57 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id A2F8737B407
	for <arch@freebsd.org>; Thu, 27 Mar 2003 17:19:57 -0800 (PST)
Received: from exchhz01.viatech.com.cn (ip-167-164-97-218.anlai.com
	[218.97.164.167])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 9CC3343FA3
	for <arch@freebsd.org>; Thu, 27 Mar 2003 17:19:55 -0800 (PST)
	(envelope-from davidxu@freebsd.org)
Received: from davidw2k (ip-240-1-168-192.rev.dyxnet.com [192.168.1.240]) by
	exchhz01.viatech.com.cn with SMTP (Microsoft Exchange Internet Mail Service
	Version 5.5.2650.21)	id HLDQ3B6M; Fri, 28 Mar 2003 09:06:22 +0800
Message-ID: <005201c2f4c8$517da320$f001a8c0@davidw2k>
From: "David Xu" <davidxu@freebsd.org>
To: "Jeff Roberson" <jroberson@chesapeake.net>,
	"Daniel Eischen" <eischen@pcnet1.pcnet.com>
References: <20030327143259.I64602-100000@mail.chesapeake.net>
Date: Fri, 28 Mar 2003 09:21:11 +0800
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4807.1700
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4910.0300
X-Spam-Status: No, hits=-9.0 required=5.0
	tests=AWL,QUOTED_EMAIL_TEXT,REFERENCES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: Scott Long <scott_long@btc.adaptec.com>
Subject: Re: 1:1 threading.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 28 Mar 2003 01:19:59 -0000


----- Original Message -----=20
From: "Jeff Roberson" <jroberson@chesapeake.net>
To: "Daniel Eischen" <eischen@pcnet1.pcnet.com>
Cc: <arch@freebsd.org>; "Scott Long" <scott_long@btc.adaptec.com>
Sent: Friday, March 28, 2003 3:50 AM
Subject: Re: 1:1 threading.


> On Thu, 27 Mar 2003, Daniel Eischen wrote:
>=20
> > On Thu, 27 Mar 2003, Scott Long wrote:
> > > Once 5-STABLE happens, users of 5.x can no longer be guinea pigs =
for KSE
> > > development.  By keeping the 1:1 and M:N API's separate, KSE can
> > > progress in 6-CURRENT until it is proven while still allowing =
MFC's to
> > > 5-STABLE to happen without too much pain.
> >
> > That's kind of silly; we have other ways to keep API/ABI
> > compatability and have used this for all other syscalls.
> > The KSE and thread mailboxes even have version numbers
> > in them.
>=20
> Which means they are likely to change.  I do not want to develop on
> unstable APIs and unstable kernel code.  kern_thr.c is 254 lines.  I =
think
> we can handle a little duplication.  I'm not sure why the objection is =
so
> strong.
>=20
> >
> > > Later on down the road when
> > > KSE matures, or when we decide that 1:1 should really just be a =
special
> > > case of M:N, we can look at addressing the above concerns and =
possibly
> > > MFC'ing the results back to 5-STABLE.  But for now we need to =
allow for
> > > 5-STABLE to actually be usable and maintainable.
> >
> > The libthr implementation of 1:1 is not what most consider
> > 1:1 -- you don't get a separate quantum and priority for
> > each thread.  As such, this library is really no different
> > than libkse.  The only real difference is that the UTS
> > chooses the next thread to run instead of the kernel.
> > If you're going to add a bunch of code to both userland
> > (in libthr) and the kernel just to get a working threading
> > library, it seems much easier to just fix libkse so that
> > it works for the single KSE/KSEG case.
>=20
> It didn't seem much easier to me.
>=20
> This whole argument about kseg/kse/thread vs kse/thread can be solved =
very
> easily by allocating a ksegrp in kern_thr.c  I estimate that would add
> another 10 lines of code.
>=20
> The ksegrp argument is questionable anyway.  In both ULE and 4bds each =
KSE
> gets its own quantum.  The KSEGRP holds the static priority and the
> dynamic user priority which is calculated based on the behavior of the
> whole process.  This causes all threads in the process to be penalized =
for
> using cpu at the same rate as a single threaded process using an
> equivalent amount of cpu would be.
>=20
> The effects are less because each thread/kse is given as big of a =
quantum
> as each full process would.  I'm not sure if this is a bug or a =
feature.
>=20
> In my opnion the ksegrp is not totally hashed out. I think you may =
forget
> that I have done a fair amount of work on schedulers in freebsd and I =
do
> understand the ramification of the decision that I made.  I do not =
think
> this at all important to have correct prior to having real users using
> real threads.
>=20

do you think that a multithreaded process should use more CPU time then
a single thread process, so threaded process should have higher priority
and block other single thread processes out? AFAIK, threading is not=20
designed for this, you may misunderstand what threading is designed for.

> Cheers,
> Jeff
>=20
> _______________________________________________
> freebsd-arch@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-arch
> To unsubscribe, send any mail to =
"freebsd-arch-unsubscribe@freebsd.org"

From owner-freebsd-arch@FreeBSD.ORG  Thu Mar 27 18:12:17 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 393B137B404; Thu, 27 Mar 2003 18:12:17 -0800 (PST)
Received: from mail01.stbernard.com (mail01.stbernard.com [64.154.93.162])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 20C5643FBF; Thu, 27 Mar 2003 18:12:15 -0800 (PST)
	(envelope-from wes@softweyr.com)
Received: from salty.rapid.stbernard.com ([192.168.4.61]) by
	mail01.stbernard.com with Microsoft SMTPSVC(5.0.2195.5329);
	Thu, 27 Mar 2003 18:12:14 -0800
From: Wes Peters <wes@softweyr.com>
Organization: Softweyr.com
To: "Poul-Henning Kamp" <phk@phk.freebsd.dk>,
	Marcel Moolenaar <marcel@xcllnt.net>
Date: Thu, 27 Mar 2003 18:12:13 -0800
User-Agent: KMail/1.5
References: <14594.1048582113@critter.freebsd.dk>
In-Reply-To: <14594.1048582113@critter.freebsd.dk>
X-Habeas-SWE-1: winter into spring
X-Habeas-SWE-2: brightly anticipated
X-Habeas-SWE-3: like Habeas SWE (tm)
X-Habeas-SWE-4: Copyright 2002 Habeas (tm)
X-Habeas-SWE-5: Sender Warranted Email (SWE) (tm). The sender of this
X-Habeas-SWE-6: email in exchange for a license for this Habeas
X-Habeas-SWE-7: warrant mark warrants that this is a Habeas Compliant
X-Habeas-SWE-8: Message (HCM) and not spam. Please report use of this
X-Habeas-SWE-9: mark in spam to <http://www.habeas.com/report/>.   
MIME-Version: 1.0
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline
Message-Id: <200303271812.13745.wes@softweyr.com>
X-OriginalArrivalTime: 28 Mar 2003 02:12:14.0027 (UTC)
	FILETIME=[72BE31B0:01C2F4CF]
X-Spam-Status: No, hits=-25.4 required=5.0
	tests=EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      RCVD_IN_UNCONFIRMED_DSBL,REFERENCES,REPLY_WITH_QUOTES,
	      USER_AGENT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: David Schultz <das@FreeBSD.ORG>
cc: freebsd-arch@FreeBSD.ORG
Subject: Re: Patch to protect process from pageout killing
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 28 Mar 2003 02:12:20 -0000

On Tuesday 25 March 2003 00:48, Poul-Henning Kamp wrote:
> In message <20030325084247.GA17195@dhcp01.pn.xcllnt.net>, Marcel
> Moolenaar writes:
> >> To tackle them from behind:
> >>
> >> Wes has a proposal for #3 which is a per-process flag which says
> >> "I'm sacred".  I think that is a sound principle since that is
> >> usually exactly what people want:  Do Not Kill This Process.
> >>
> >> Certain processes already enjoy special protection, pid==1 most
> >> notably, this would just be a way to make the same protection
> >> available to other processes.  I'm not happy about using the
> >> resourcelimit code for booleans, and I don't think the flag
> >> should be inherited, but otherwise I'm for the idea.
> >
> >JFYI: On ia64 there are 12 bits in the ELF header reserved for OS
> >specific flags. A very natural way to flag a process as being sacred
> >is by flagging the ELF executable. You could use brandelf for that.
>
> Many years ago, we had a local hack so you could specify the nice(2)
> that a given program would be executed at (relative to the parent
> process) in the a.out file.  This allowed us to keep games open
> during the day because we could argue that running at -20 they used
> only resources not otherwise claimed.
>
> Other operating systems have much more expressive facilities for
> putting attributes on a program.  In some cases this is being held
> stronly against them.

You could easily implement this with an ELF executable by adding "note" 
section(s) containing the attributes in a format understood by your 
loader or linker.  A hackup of brandelf could modify the binaries in 
well-specified ways.

You could also do this with extended attributes on the executable/ 
library files.  

> I think, but am not sure, that we can now introduce practically any
> policy we might like with MAC. (NB: deliberate rwatson-trigger)
>
> How the flags/attributes gets to be set on the wanted subset of
> processes is by no means uninteresting, but until something pays
> attention to the flag...

Working on it.

-- 
         "Where am I, and what am I doing in this handbasket?"

Wes Peters                                              wes@softweyr.com


From owner-freebsd-arch@FreeBSD.ORG  Thu Mar 27 22:24:57 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id B1FC237B408
	for <arch@freebsd.org>; Thu, 27 Mar 2003 22:24:57 -0800 (PST)
Received: from sccrmhc02.attbi.com (sccrmhc02.attbi.com [204.127.202.62])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 0181543F75
	for <arch@freebsd.org>; Thu, 27 Mar 2003 22:24:57 -0800 (PST)
	(envelope-from julian@elischer.org)
Received: from interjet.elischer.org
	(12-232-168-4.client.attbi.com[12.232.168.4])
	by sccrmhc02.attbi.com (sccrmhc02) with ESMTP
	id <2003032806245500200jc503e>; Fri, 28 Mar 2003 06:24:56 +0000
Received: from localhost (localhost.elischer.org [127.0.0.1])
	by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id WAA65920;
	Thu, 27 Mar 2003 22:24:54 -0800 (PST)
Date: Thu, 27 Mar 2003 22:24:51 -0800 (PST)
From: Julian Elischer <julian@elischer.org>
To: Jeff Roberson <jroberson@chesapeake.net>
In-Reply-To: <20030327143259.I64602-100000@mail.chesapeake.net>
Message-ID: <Pine.BSF.4.21.0303272210340.65796-100000@InterJet.elischer.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-25.0 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      QUOTE_TWICE_1,RCVD_IN_UNCONFIRMED_DSBL,REPLY_WITH_QUOTES,
	      USER_AGENT_PINE
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: Scott Long <scott_long@btc.adaptec.com>
cc: Daniel Eischen <eischen@pcnet1.pcnet.com>
Subject: Re: 1:1 threading.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 28 Mar 2003 06:25:00 -0000


On Thu, 27 Mar 2003, Jeff Roberson wrote:

> 
> The effects are less because each thread/kse is given as big of a quantum
> as each full process would.  I'm not sure if this is a bug or a feature.
> 

It's neither.. it's not what happens.
 More accuratly, it's only part of the story.

Firstly it's 'standin' code.. but it exhibis some of the desired
bahaviour. Yes, each KSE gets a quantum, but the next thread to run in
that KSE is forced to go to the end of the queue.  Effecively, this
forces the process to allow other processes with enough priority to get
CPU. This is a 'quick' solution to stopping a process with a lot of
threads from swamping the system. The plan is to put in place a more
comprehensive solution when time allows. Only one thread for each KSE
can actually be on the run queue at a time.. yes it could run for the
entire quantum, but the system will not let it put 10000 of these back
to back unless there are no other competing threads. There is room there
for a graduate student to do a project adding code to allow the KSEGRP
to allow the rest of it's quantum to be passed on to other equivalent
priority threads in the same group. This would be, as dicussed several
times, in the form of code in choosethread() that would first check for
internal threads if there was quantum left, before resorting to external
threads to run. (The trick is to get the right balance)

From owner-freebsd-arch@FreeBSD.ORG  Thu Mar 27 22:29:02 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id B72B737B401
	for <freebsd-arch@freebsd.org>; Thu, 27 Mar 2003 22:29:02 -0800 (PST)
Received: from park.rambler.ru (park.rambler.ru [81.19.64.101])
	by mx1.FreeBSD.org (Postfix) with ESMTP id E16B543F85
	for <freebsd-arch@freebsd.org>; Thu, 27 Mar 2003 22:29:00 -0800 (PST)
	(envelope-from is@rambler-co.ru)
Received: from is.park.rambler.ru (is.park.rambler.ru [81.19.64.102])
	by park.rambler.ru (8.12.6/8.12.6) with ESMTP id h2S6SsmF022454;
	Fri, 28 Mar 2003 09:28:54 +0300 (MSK)
Date: Fri, 28 Mar 2003 09:28:54 +0300 (MSK)
From: Igor Sysoev <is@rambler-co.ru>
X-Sender: is@is
To: freebsd-arch@freebsd.org
In-Reply-To: <20030327143259.I64602-100000@mail.chesapeake.net>
Message-ID: <Pine.BSF.4.21.0303280927250.19745-100000@is>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-20.6 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,USER_AGENT_PINE
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
Subject: Re: 1:1 threading.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 28 Mar 2003 06:29:03 -0000

On Thu, 27 Mar 2003, Jeff Roberson wrote:

>The ksegrp argument is questionable anyway.  In both ULE and 4bds each KSE
>gets its own quantum.  The KSEGRP holds the static priority and the
>dynamic user priority which is calculated based on the behavior of the
>whole process.  This causes all threads in the process to be penalized for
>using cpu at the same rate as a single threaded process using an
>equivalent amount of cpu would be.

Why should multi-threaded process get more CPU time then single threaded
if they both have the same base priority ?  CPU time should be given
based on a process priority not a number of its threads.

>The effects are less because each thread/kse is given as big of a quantum
>as each full process would.  I'm not sure if this is a bug or a feature.

It's not a bug or a feature.  It's the right thing.

>In my opnion the ksegrp is not totally hashed out. I think you may forget
>that I have done a fair amount of work on schedulers in freebsd and I do
>understand the ramification of the decision that I made.  I do not think
>this at all important to have correct prior to having real users using
>real threads.

As I understand KSEGRP was designed with M:N model in mind.  If you
have M threads mapped to N KSEs then all these KSEs should have the same
priority.  The second KSEGRP capability is to limit a number of KSEs to
a number of CPUs.  It's usefull for M:N model because KSE is almost
never (I believe) blocked and always ready to run (if not parked).

For 1:1 model KSEGRP is not theoreticaly needed because you can set
priority (theoreticaly) directly in KSE and you do not need to limit
a number of KSEs to a number of CPUs.  If the thread blocks then its KSE
blocks too.

But I think for design completeness you should use KSEGRP to store KSE's
priority in 1:1 model.


Igor Syseov
http://sysoev.ru/en/

From owner-freebsd-arch@FreeBSD.ORG  Fri Mar 28 03:20:48 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 24F1537B401; Fri, 28 Mar 2003 03:20:47 -0800 (PST)
Received: from mail.tcoip.com.br (erato.tco.net.br [200.220.254.10])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 5E96743F3F; Fri, 28 Mar 2003 03:20:44 -0800 (PST)
	(envelope-from dcs@tcoip.com.br)
Received: from tcoip.com.br ([10.0.2.6])
	by mail.tcoip.com.br (8.11.6/8.11.6) with ESMTP id h2SBKf906269;
	Fri, 28 Mar 2003 08:20:41 -0300
Message-ID: <3E843009.2060104@tcoip.com.br>
Date: Fri, 28 Mar 2003 08:20:41 -0300
From: "Daniel C. Sobral" <dcs@tcoip.com.br>
User-Agent: Mozilla/5.0 (X11; U; FreeBSD i386; en-US; rv:1.3b) Gecko/20030326
X-Accept-Language: en-us, en, pt-br, ja
MIME-Version: 1.0
To: David Xu <davidxu@freebsd.org>
References: <20030327143259.I64602-100000@mail.chesapeake.net>
	<005201c2f4c8$517da320$f001a8c0@davidw2k>
In-Reply-To: <005201c2f4c8$517da320$f001a8c0@davidw2k>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-Spam-Status: No, hits=-28.6 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REFERENCES,REPLY_WITH_QUOTES,USER_AGENT_MOZILLA_UA
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: Scott Long <scott_long@btc.adaptec.com>
Subject: Re: 1:1 threading.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 28 Mar 2003 11:20:49 -0000

David Xu wrote:
> 
> do you think that a multithreaded process should use more CPU time then
> a single thread process, so threaded process should have higher priority
> and block other single thread processes out? AFAIK, threading is not 
> designed for this, you may misunderstand what threading is designed for.

Threading might not have been originally designed for this, but a lot of 
people use it this way, a lot of people *want* it this way, and POSIX 
specifically mandates that this way be available.

So let's drop that issue, please.

-- 
Daniel C. Sobral                   (8-DCS)
Gerencia de Operacoes
Divisao de Comunicacao de Dados
Coordenacao de Seguranca
TCO
Fones: 55-61-313-7654/Cel: 55-61-9618-0904
E-mail: Daniel.Capo@tco.net.br
         Daniel.Sobral@tcoip.com.br
         dcs@tcoip.com.br

Outros:
	dcs@newsguy.com
	dcs@freebsd.org
	capo@notorious.bsdconspiracy.net

After an instrument has been assembled, extra components will be found
on the bench.

From owner-freebsd-arch@FreeBSD.ORG  Fri Mar 28 09:10:38 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 9C99937B401; Fri, 28 Mar 2003 09:10:38 -0800 (PST)
Received: from smtp-relay.omnis.com (smtp-relay.omnis.com [216.239.128.27])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id D15DE43F93; Fri, 28 Mar 2003 09:10:37 -0800 (PST)
	(envelope-from wes@softweyr.com)
Received: from softweyr.homeunix.net (66-91-236-204.san.rr.com
	[66.91.236.204])	by smtp-relay.omnis.com (Postfix) with ESMTP
	id 0C6A843B68; Fri, 28 Mar 2003 09:10:36 -0800 (PST)
From: Wes Peters <wes@softweyr.com>
Organization: Softweyr
To: "Poul-Henning Kamp" <phk@phk.freebsd.dk>,
	David Schultz <das@FreeBSD.ORG>
Date: Fri, 28 Mar 2003 09:10:32 -0800
User-Agent: KMail/1.5
References: <14382.1048580753@critter.freebsd.dk>
In-Reply-To: <14382.1048580753@critter.freebsd.dk>
MIME-Version: 1.0
Content-Type: Multipart/Mixed;
  boundary="Boundary-00=_IIIh+WfRJfXpokV"
Message-Id: <200303280910.32307.wes@softweyr.com>
X-Spam-Status: No, hits=-29.1 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,PATCH_UNIFIED_DIFF,
	      QUOTED_EMAIL_TEXT,REFERENCES,REPLY_WITH_QUOTES,USER_AGENT
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
X-Content-Filtered-By: Mailman/MimeDel 2.1.1
cc: freebsd-arch@FreeBSD.ORG
Subject: Re: Patch to protect process from pageout killing
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 28 Mar 2003 17:11:09 -0000


--Boundary-00=_IIIh+WfRJfXpokV
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

On Tuesday 25 March 2003 00:25, Poul-Henning Kamp wrote:
>
> As I see it, there is a need for several mechanisms:
>
> 1. A mechanism to export to userland enough information about the
>    current RAM availability, so that phkmalloc and application
>    specific code can make intelligent choices before things go bad.
>
> 2. A mechanism to alert userland to the fact that things _have_ gone
>    bad.
>
> 3. A mechanism to influence the "Who do we kill ?" decision once
>    things have gone from bad to worse.
>
> To tackle them from behind:
>
> Wes has a proposal for #3 which is a per-process flag which says
> "I'm sacred".  I think that is a sound principle since that is
> usually exactly what people want:  Do Not Kill This Process.
>
> Certain processes already enjoy special protection, pid==1 most
> notably, this would just be a way to make the same protection
> available to other processes.  I'm not happy about using the
> resourcelimit code for booleans, and I don't think the flag
> should be inherited, but otherwise I'm for the idea.

I've reworked my patch to use the madvise(2) syscall, like the original 
4.x patch did.  I've even documented it, in a man page of all places.  
Please see attached patch.  If nobody objects, I'll commit sometime this 
weekend.

> We have the SIGDANGER proposal for #2, but I think we need to have
> two severities:  "Out of RAM" and "Out of VM".  A program like
> fsck would start to recycle cached sectors once we're out of RAM.

I'll work with Garance to create a proposal, some pseudocode, something 
like a design.  Then we can bikeshed that.  Mike Murphy is helping 
silently at work, letting me bounce ideas off him and look at the man 
pages on his AIX machine.

> But I have not seen anybody come up with a good proposal for
> #1, and that is where the main benefit would be derived:  It would
> allow processes to be good citizens and adjust to the present
> situation.

Added to think-about queue...

-- 

        Where am I, and what am I doing in this handbasket?

Wes Peters                                               wes@softweyr.com

--Boundary-00=_IIIh+WfRJfXpokV--

From owner-freebsd-arch@FreeBSD.ORG  Fri Mar 28 11:28:23 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id 400C437B408; Fri, 28 Mar 2003 11:28:23 -0800 (PST)
Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id 731C243F85; Fri, 28 Mar 2003 11:28:22 -0800 (PST)
	(envelope-from eischen@pcnet1.pcnet.com)
Received: from pcnet1.pcnet.com (localhost [127.0.0.1])
	by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h2SJSIBg006924;
	Fri, 28 Mar 2003 14:28:18 -0500 (EST)
Received: from localhost (eischen@localhost)h2SJSHk2006921;
	Fri, 28 Mar 2003 14:28:17 -0500 (EST)
Date: Fri, 28 Mar 2003 14:28:17 -0500 (EST)
From: Daniel Eischen <eischen@pcnet1.pcnet.com>
To: "Daniel C. Sobral" <dcs@tcoip.com.br>
In-Reply-To: <3E843009.2060104@tcoip.com.br>
Message-ID: <Pine.GSO.4.10.10303281352530.242-100000@pcnet1.pcnet.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-25.4 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REPLY_WITH_QUOTES,USER_AGENT_PINE
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: Scott Long <scott_long@btc.adaptec.com>
Subject: Re: 1:1 threading.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 28 Mar 2003 19:28:26 -0000

On Fri, 28 Mar 2003, Daniel C. Sobral wrote:

> David Xu wrote:
> > 
> > do you think that a multithreaded process should use more CPU time then
> > a single thread process, so threaded process should have higher priority
> > and block other single thread processes out? AFAIK, threading is not 
> > designed for this, you may misunderstand what threading is designed for.
> 
> Threading might not have been originally designed for this, but a lot of 
> people use it this way, a lot of people *want* it this way, and POSIX 
> specifically mandates that this way be available.

It is available through pthread_attr_setscope().

There's some confusion over this and the way libthr is implemented.
KSE's within the same KSE Group were not designed to give more CPU
time than a normal unthreaded/single KSE'd process.  Unless this
has been changed in the kernel somehow, the use of multiple KSEs
by libthr or libkse (in a single KSEG) will not get any more CPU
time than a non-threaded program.  There was some debate over
this, but multiple KSEs within a KSEG were _not_ suppose to allow
this.  You are suppose to create a new KSEG in order to get
this behavior.

-- 
Dan Eischen

From owner-freebsd-arch@FreeBSD.ORG  Fri Mar 28 12:22:51 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 5525D37B401
	for <arch@freebsd.org>; Fri, 28 Mar 2003 12:22:51 -0800 (PST)
Received: from mail.chesapeake.net (chesapeake.net [205.130.220.14])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 6170B43F93
	for <arch@freebsd.org>; Fri, 28 Mar 2003 12:22:50 -0800 (PST)
	(envelope-from jroberson@chesapeake.net)
Received: from localhost (jroberson@localhost)
	by mail.chesapeake.net (8.11.6/8.11.6) with ESMTP id h2SKMh235468;
	Fri, 28 Mar 2003 15:22:43 -0500 (EST)
	(envelope-from jroberson@chesapeake.net)
Date: Fri, 28 Mar 2003 15:22:43 -0500 (EST)
From: Jeff Roberson <jroberson@chesapeake.net>
To: Daniel Eischen <eischen@pcnet1.pcnet.com>
In-Reply-To: <Pine.GSO.4.10.10303281352530.242-100000@pcnet1.pcnet.com>
Message-ID: <20030328151526.S64602-100000@mail.chesapeake.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-17.1 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REPLY_WITH_QUOTES
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: Scott Long <scott_long@btc.adaptec.com>
Subject: Re: 1:1 threading.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 28 Mar 2003 20:22:52 -0000

On Fri, 28 Mar 2003, Daniel Eischen wrote:

> On Fri, 28 Mar 2003, Daniel C. Sobral wrote:
>
> > David Xu wrote:
> > >
> > > do you think that a multithreaded process should use more CPU time then
> > > a single thread process, so threaded process should have higher priority
> > > and block other single thread processes out? AFAIK, threading is not
> > > designed for this, you may misunderstand what threading is designed for.
> >
> > Threading might not have been originally designed for this, but a lot of
> > people use it this way, a lot of people *want* it this way, and POSIX
> > specifically mandates that this way be available.
>
> It is available through pthread_attr_setscope().
>
> There's some confusion over this and the way libthr is implemented.
> KSE's within the same KSE Group were not designed to give more CPU
> time than a normal unthreaded/single KSE'd process.  Unless this
> has been changed in the kernel somehow, the use of multiple KSEs
> by libthr or libkse (in a single KSEG) will not get any more CPU
> time than a non-threaded program.  There was some debate over
> this, but multiple KSEs within a KSEG were _not_ suppose to allow
> this.  You are suppose to create a new KSEG in order to get
> this behavior.
>

This is not how it is implemented in either scheduler that we currently
have.  I'm not saying which way is more or less correct because I think
you could argue either way.  We can not entirely correctly implement
SCOPE_PROCESSES threads right now anyway.

This being said..  It is a property of the thr system calls and not
libthr.  I have a flags field in thr_create() that could be used to
indicate which scope the thread should contend in.

Cheers,
Jeff

From owner-freebsd-arch@FreeBSD.ORG  Fri Mar 28 12:34:49 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id CED4837B404
	for <arch@freebsd.org>; Fri, 28 Mar 2003 12:34:49 -0800 (PST)
Received: from mail.pcnet.com (mail.pcnet.com [204.213.232.4])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 1A81B43FCB
	for <arch@freebsd.org>; Fri, 28 Mar 2003 12:34:49 -0800 (PST)
	(envelope-from eischen@pcnet1.pcnet.com)
Received: from pcnet1.pcnet.com (localhost [127.0.0.1])
	by mail.pcnet.com (8.12.8/8.12.1) with ESMTP id h2SKYiBg018345;
	Fri, 28 Mar 2003 15:34:44 -0500 (EST)
Received: from localhost (eischen@localhost)h2SKYiKC018341;
	Fri, 28 Mar 2003 15:34:44 -0500 (EST)
Date: Fri, 28 Mar 2003 15:34:44 -0500 (EST)
From: Daniel Eischen <eischen@pcnet1.pcnet.com>
To: Jeff Roberson <jroberson@chesapeake.net>
In-Reply-To: <20030328151526.S64602-100000@mail.chesapeake.net>
Message-ID: <Pine.GSO.4.10.10303281526180.16659-100000@pcnet1.pcnet.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-25.4 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      REPLY_WITH_QUOTES,USER_AGENT_PINE
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: arch@freebsd.org
cc: Scott Long <scott_long@btc.adaptec.com>
Subject: Re: 1:1 threading.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 28 Mar 2003 20:34:51 -0000

On Fri, 28 Mar 2003, Jeff Roberson wrote:

> On Fri, 28 Mar 2003, Daniel Eischen wrote:
> 
> > On Fri, 28 Mar 2003, Daniel C. Sobral wrote:
> >
> > > David Xu wrote:
> > > >
> > > > do you think that a multithreaded process should use more CPU time then
> > > > a single thread process, so threaded process should have higher priority
> > > > and block other single thread processes out? AFAIK, threading is not
> > > > designed for this, you may misunderstand what threading is designed for.
> > >
> > > Threading might not have been originally designed for this, but a lot of
> > > people use it this way, a lot of people *want* it this way, and POSIX
> > > specifically mandates that this way be available.
> >
> > It is available through pthread_attr_setscope().
> >
> > There's some confusion over this and the way libthr is implemented.
> > KSE's within the same KSE Group were not designed to give more CPU
> > time than a normal unthreaded/single KSE'd process.  Unless this
> > has been changed in the kernel somehow, the use of multiple KSEs
> > by libthr or libkse (in a single KSEG) will not get any more CPU
> > time than a non-threaded program.  There was some debate over
> > this, but multiple KSEs within a KSEG were _not_ suppose to allow
> > this.  You are suppose to create a new KSEG in order to get
> > this behavior.
> >
> 
> This is not how it is implemented in either scheduler that we currently
> have.  I'm not saying which way is more or less correct because I think
> you could argue either way.  We can not entirely correctly implement
> SCOPE_PROCESSES threads right now anyway.

Well, since we have KSEGs, I'd argue that this is a bug.
Perhaps it was too difficult to do this and no-one thought
you'd ever allow more KSEs in a KSEG than you have CPUs,
so that became the limiting factor.

> This being said..  It is a property of the thr system calls and not
> libthr.  I have a flags field in thr_create() that could be used to
> indicate which scope the thread should contend in.

BTW, I'm not arguing about libthr implementation here.  I'm
just stating what a KSE is (was) suppose to be (which implicitly
describes libthr and libkse behavior).

-- 
Dan Eischen

From owner-freebsd-arch@FreeBSD.ORG  Fri Mar 28 12:57:15 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id A78FD37B401
	for <arch@freebsd.org>; Fri, 28 Mar 2003 12:57:15 -0800 (PST)
Received: from sccrmhc01.attbi.com (sccrmhc01.attbi.com [204.127.202.61])
	by mx1.FreeBSD.org (Postfix) with ESMTP id E1C3B43FCB
	for <arch@freebsd.org>; Fri, 28 Mar 2003 12:57:14 -0800 (PST)
	(envelope-from julian@elischer.org)
Received: from interjet.elischer.org
	(12-232-168-4.client.attbi.com[12.232.168.4])
	by sccrmhc01.attbi.com (sccrmhc01) with ESMTP
	id <2003032820571300100cu49te>; Fri, 28 Mar 2003 20:57:14 +0000
Received: from localhost (localhost.elischer.org [127.0.0.1])
	by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id MAA71665;
	Fri, 28 Mar 2003 12:57:12 -0800 (PST)
Date: Fri, 28 Mar 2003 12:57:10 -0800 (PST)
From: Julian Elischer <julian@elischer.org>
To: Daniel Eischen <eischen@pcnet1.pcnet.com>
In-Reply-To: <Pine.GSO.4.10.10303281526180.16659-100000@pcnet1.pcnet.com>
Message-ID: <Pine.BSF.4.21.0303281253080.52134-100000@InterJet.elischer.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-Spam-Status: No, hits=-25.0 required=5.0
	tests=AWL,EMAIL_ATTRIBUTION,IN_REP_TO,QUOTED_EMAIL_TEXT,
	      QUOTE_TWICE_1,RCVD_IN_UNCONFIRMED_DSBL,REPLY_WITH_QUOTES,
	      USER_AGENT_PINE
	autolearn=ham	version=2.50
X-Spam-Level: 
X-Spam-Checker-Version: SpamAssassin 2.50 (1.173-2003-02-20-exp)
cc: Scott Long <scott_long@btc.adaptec.com>
cc: arch@freebsd.org
Subject: Re: 1:1 threading.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Fri, 28 Mar 2003 20:57:17 -0000


On Fri, 28 Mar 2003, Daniel Eischen wrote:

> On Fri, 28 Mar 2003, Jeff Roberson wrote:
> 
> > On Fri, 28 Mar 2003, Daniel Eischen wrote:
> > 
> > > On Fri, 28 Mar 2003, Daniel C. Sobral wrote:
> > >
> > > > David Xu wrote:
> > > > >
> > > > > do you think that a multithreaded process should use more CPU time then
> > > > > a single thread process, so threaded process should have higher priority
> > > > > and block other single thread processes out? AFAIK, threading is not
> > > > > designed for this, you may misunderstand what threading is designed for.
> > > >
> > > > Threading might not have been originally designed for this, but a lot of
> > > > people use it this way, a lot of people *want* it this way, and POSIX
> > > > specifically mandates that this way be available.
> > >
> > > It is available through pthread_attr_setscope().
> > >
> > > There's some confusion over this and the way libthr is implemented.
> > > KSE's within the same KSE Group were not designed to give more CPU
> > > time than a normal unthreaded/single KSE'd process.  Unless this
> > > has been changed in the kernel somehow, the use of multiple KSEs
> > > by libthr or libkse (in a single KSEG) will not get any more CPU
> > > time than a non-threaded program.  There was some debate over
> > > this, but multiple KSEs within a KSEG were _not_ suppose to allow
> > > this.  You are suppose to create a new KSEG in order to get
> > > this behavior.
> > >
> > 
> > This is not how it is implemented in either scheduler that we currently
> > have.  I'm not saying which way is more or less correct because I think
> > you could argue either way.  We can not entirely correctly implement
> > SCOPE_PROCESSES threads right now anyway.
> 
> Well, since we have KSEGs, I'd argue that this is a bug.
> Perhaps it was too difficult to do this and no-one thought
> you'd ever allow more KSEs in a KSEG than you have CPUs,
> so that became the limiting factor.
> 
> > This being said..  It is a property of the thr system calls and not
> > libthr.  I have a flags field in thr_create() that could be used to
> > indicate which scope the thread should contend in.
> 
> BTW, I'm not arguing about libthr implementation here.  I'm
> just stating what a KSE is (was) suppose to be (which implicitly
> describes libthr and libkse behavior).

I'm happy to see the limit of (NKSEs !> NCPU) lifted for processes that
are in some way identified as 1:1 mode processes..
I don't want to lift it for KSE mode processes however.

For system scope threads, I guess you just allocate a separate KSEGRP
so it has somewhere to store pertinent info.

that makes it rather simple
system scope threads have a thread, a KSE and a KSEGRP
process scope threads just use the existing KSEGRP.

Everythiong should just "fall out correctly" by doing this..


> 
> -- 
> Dan Eischen
> 
> _______________________________________________
> freebsd-arch@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-arch
> To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org"
> 

From owner-freebsd-arch@FreeBSD.ORG  Fri Mar 28 20:26:49 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 9847E37B401
	for <arch@freebsd.org>; Fri, 28 Mar 2003 20:26:49 -0800 (PST)
Received: from canning.wemm.org (canning.wemm.org [192.203.228.65])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 4166643FA3
	for <arch@freebsd.org>; Fri, 28 Mar 2003 20:26:49 -0800 (PST)
	(envelope-from peter@wemm.org)
Received: from wemm.org (localhost [127.0.0.1])
	by canning.wemm.org (Postfix) with ESMTP
	id 18B682A8BB; Fri, 28 Mar 2003 20:26:49 -0800 (PST)
	(envelope-from peter@wemm.org)
X-Mailer: exmh version 2.5 07/13/2001 with nmh-1.0.4
To: Jeff Roberson <jroberson@chesapeake.net>
In-Reply-To: <20030327143259.I64602-100000@mail.chesapeake.net> 
Date: Fri, 28 Mar 2003 20:26:49 -0800
From: Peter Wemm <peter@wemm.org>
Message-Id: <20030329042649.18B682A8BB@canning.wemm.org>
cc: arch@freebsd.org
cc: Scott Long <scott_long@btc.adaptec.com>
cc: Daniel Eischen <eischen@pcnet1.pcnet.com>
Subject: Re: 1:1 threading. 
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 29 Mar 2003 04:26:51 -0000

Jeff Roberson wrote:
> On Thu, 27 Mar 2003, Daniel Eischen wrote:
> 
> > On Thu, 27 Mar 2003, Scott Long wrote:
> > > Once 5-STABLE happens, users of 5.x can no longer be guinea pigs for KSE
> > > development.  By keeping the 1:1 and M:N API's separate, KSE can
> > > progress in 6-CURRENT until it is proven while still allowing MFC's to
> > > 5-STABLE to happen without too much pain.
> >
> > That's kind of silly; we have other ways to keep API/ABI
> > compatability and have used this for all other syscalls.
> > The KSE and thread mailboxes even have version numbers
> > in them.
> 
> Which means they are likely to change.  I do not want to develop on
> unstable APIs and unstable kernel code.  kern_thr.c is 254 lines.  I think
> we can handle a little duplication.  I'm not sure why the objection is so
> strong.

I for one think they should use seperate syscalls.  We shouldn't have
designed-for-KSE mailboxes going anywhere near this stuff and it gives the
KSE folks plenty of room to keep tweaking their data structures. 

Anyway, I can't wait to see how this works out.  It is becoming a Big Deal
at work, we're using the linuxthreads port + rfork() out of desperation.
libthr can't possibly be any nastier than that.

Cheers,
-Peter
--
Peter Wemm - peter@wemm.org; peter@FreeBSD.org; peter@yahoo-inc.com
"All of this is for nothing if we don't go to the stars" - JMS/B5

From owner-freebsd-arch@FreeBSD.ORG  Fri Mar 28 20:46:31 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id B07D937B401; Fri, 28 Mar 2003 20:46:31 -0800 (PST)
Received: from puffin.mail.pas.earthlink.net (puffin.mail.pas.earthlink.net
	[207.217.120.139])	by mx1.FreeBSD.org (Postfix) with ESMTP
	id EC94443F3F; Fri, 28 Mar 2003 20:46:30 -0800 (PST)
	(envelope-from tlambert2@mindspring.com)
Received: from pool0191.cvx21-bradley.dialup.earthlink.net ([209.179.192.191]
	helo=mindspring.com)
	by puffin.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128)
	(Exim 3.33 #1)	id 18z8ET-0003HI-00; Fri, 28 Mar 2003 20:46:18 -0800
Message-ID: <3E8524C0.5F80D3D@mindspring.com>
Date: Fri, 28 Mar 2003 20:44:48 -0800
From: Terry Lambert <tlambert2@mindspring.com>
X-Mailer: Mozilla 4.79 [en] (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: "Daniel C. Sobral" <dcs@tcoip.com.br>
References: <20030327143259.I64602-100000@mail.chesapeake.net>
	<3E843009.2060104@tcoip.com.br>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a4d6b804bc6589c5b91231a495bc3d307ba2d4e88014a4647c350badd9bab72f9c350badd9bab72f9c
cc: arch@freebsd.org
cc: Scott Long <scott_long@btc.adaptec.com>
Subject: Re: 1:1 threading.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 29 Mar 2003 04:46:33 -0000

"Daniel C. Sobral" wrote:
> David Xu wrote:
> > do you think that a multithreaded process should use more CPU time then
> > a single thread process, so threaded process should have higher priority
> > and block other single thread processes out? AFAIK, threading is not
> > designed for this, you may misunderstand what threading is designed for.
> 
> Threading might not have been originally designed for this, but a lot of
> people use it this way, a lot of people *want* it this way, and POSIX
> specifically mandates that this way be available.
> 
> So let's drop that issue, please.

A side question...

Is there an administrative limit on the number of threads that
you can create in a process, such that the total number is
limited to the number of processes you are administratively
limited to creating?

I.e., the administrative limit on number of child processes is
implicitly an administrative limit on how much quantum you can
use; is the limit still enforced on threads, as well?

-- Terry

From owner-freebsd-arch@FreeBSD.ORG  Fri Mar 28 21:11:46 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 0FFE837B401
	for <arch@freebsd.org>; Fri, 28 Mar 2003 21:11:46 -0800 (PST)
Received: from puffin.mail.pas.earthlink.net (puffin.mail.pas.earthlink.net
	[207.217.120.139])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 869A643F93
	for <arch@freebsd.org>; Fri, 28 Mar 2003 21:11:45 -0800 (PST)
	(envelope-from tlambert2@mindspring.com)
Received: from pool0191.cvx21-bradley.dialup.earthlink.net ([209.179.192.191]
	helo=mindspring.com)
	by puffin.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128)
	(Exim 3.33 #1)	id 18z8d0-0005nc-00; Fri, 28 Mar 2003 21:11:39 -0800
Message-ID: <3E852ABD.E77EA566@mindspring.com>
Date: Fri, 28 Mar 2003 21:10:21 -0800
From: Terry Lambert <tlambert2@mindspring.com>
X-Mailer: Mozilla 4.79 [en] (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Julian Elischer <julian@elischer.org>
References: <Pine.BSF.4.21.0303281253080.52134-100000@InterJet.elischer.org>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a46d4fb99f91a96aa48d9b9983a5122a09a7ce0e8f8d31aa3f350badd9bab72f9c350badd9bab72f9c
cc: arch@freebsd.org
cc: Scott Long <scott_long@btc.adaptec.com>
cc: Daniel Eischen <eischen@pcnet1.pcnet.com>
Subject: Re: 1:1 threading.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 29 Mar 2003 05:11:47 -0000

Julian Elischer wrote:
> I'm happy to see the limit of (NKSEs !> NCPU) lifted for processes that
> are in some way identified as 1:1 mode processes..
> I don't want to lift it for KSE mode processes however.
> 
> For system scope threads, I guess you just allocate a separate KSEGRP
> so it has somewhere to store pertinent info.
> 
> that makes it rather simple
> system scope threads have a thread, a KSE and a KSEGRP
> process scope threads just use the existing KSEGRP.
> 
> Everythiong should just "fall out correctly" by doing this..

Except that means for process scope threads, you don't get SMP
scalability, since the single KSEGRP binds them all to a single
CPU... right?

-- Terry

From owner-freebsd-arch@FreeBSD.ORG  Sat Mar 29 08:50:51 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 173D137B401
	for <arch@freebsd.org>; Sat, 29 Mar 2003 08:50:51 -0800 (PST)
Received: from flood.ping.uio.no (flood.ping.uio.no [129.240.78.31])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 8426143F85
	for <arch@freebsd.org>; Sat, 29 Mar 2003 08:50:50 -0800 (PST)
	(envelope-from des@ofug.org)
Received: by flood.ping.uio.no (Postfix, from userid 2602)
	id 0E8EC5308; Sat, 29 Mar 2003 17:50:48 +0100 (CET)
X-URL: http://www.ofug.org/~des/
X-Disclaimer: The views expressed in this message do not necessarily
  coincide with those of any organisation or company with
  which I am or have been affiliated.
To: arch@freebsd.org
From: des@ofug.org (Dag-Erling =?iso-8859-1?q?Sm=F8rgrav?=)
Date: Sat, 29 Mar 2003 17:50:46 +0100
Message-ID: <xzpu1dm2k2h.fsf@flood.ping.uio.no>
User-Agent: Gnus/5.090015 (Oort Gnus v0.15) Emacs/21.2
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
Subject: Allow underscores in DNS names
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 29 Mar 2003 16:50:54 -0000

--=-=-=
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable

The attached patch, inspired by a discussion on -STABLE, modifies our
resolver library to allow underscores in host names, by classifying
the underscore as a hyphen character.  Even though RFC952 forbids
them, underscores are becoming increasingly common in DNS, and they
are sometimes used for mechanisms (such as Microsoft's automatic proxy
configuration scheme) which we might want to support in FreeBSD.

DES
--=20
Dag-Erling Sm=F8rgrav - des@ofug.org


--=-=-=
Content-Type: text/x-patch
Content-Disposition: attachment; filename=hnok.diff

Index: lib/libc/net/res_comp.c
===================================================================
RCS file: /home/ncvs/src/lib/libc/net/res_comp.c,v
retrieving revision 1.17
diff -u -r1.17 res_comp.c
--- lib/libc/net/res_comp.c	22 Mar 2002 21:52:29 -0000	1.17
+++ lib/libc/net/res_comp.c	29 Mar 2003 16:42:57 -0000
@@ -142,7 +142,7 @@
  * is not careful about this, but for some reason, we're doing it right here.
  */
 #define PERIOD 0x2e
-#define	hyphenchar(c) ((c) == 0x2d)
+#define	hyphenchar(c) ((c) == 0x2d || (c) == 0x5f)
 #define bslashchar(c) ((c) == 0x5c)
 #define periodchar(c) ((c) == PERIOD)
 #define asterchar(c) ((c) == 0x2a)

--=-=-=--

From owner-freebsd-arch@FreeBSD.ORG  Sat Mar 29 12:41:06 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 7448B37B401
	for <arch@freebsd.org>; Sat, 29 Mar 2003 12:41:06 -0800 (PST)
Received: from dan.emsphone.com (dan.emsphone.com [199.67.51.101])
	by mx1.FreeBSD.org (Postfix) with ESMTP id BF45843F3F
	for <arch@freebsd.org>; Sat, 29 Mar 2003 12:41:05 -0800 (PST)
	(envelope-from dan@dan.emsphone.com)
Received: (from dan@localhost)
	by dan.emsphone.com (8.12.7/8.12.7) id h2TKf4Qw027665;
	Sat, 29 Mar 2003 14:41:04 -0600 (CST)
	(envelope-from dan)
Date: Sat, 29 Mar 2003 14:41:04 -0600
From: Dan Nelson <dnelson@allantgroup.com>
To: Dag-Erling Smorgrav <des@ofug.org>
Message-ID: <20030329204104.GF74971@dan.emsphone.com>
References: <xzpu1dm2k2h.fsf@flood.ping.uio.no>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <xzpu1dm2k2h.fsf@flood.ping.uio.no>
X-OS: FreeBSD 5.0-CURRENT
X-message-flag: Outlook Error
User-Agent: Mutt/1.5.4i
cc: arch@freebsd.org
Subject: Re: Allow underscores in DNS names
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 29 Mar 2003 20:41:31 -0000

In the last episode (Mar 29), Dag-Erling Smorgrav said:
> The attached patch, inspired by a discussion on -STABLE, modifies our
> resolver library to allow underscores in host names, by classifying
> the underscore as a hyphen character.  Even though RFC952 forbids
> them, underscores are becoming increasingly common in DNS, and they
> are sometimes used for mechanisms (such as Microsoft's automatic proxy
> configuration scheme) which we might want to support in FreeBSD.

I thought proxy autodetect used wpad.domainname.com or looked up
http://domainname.com/wpad.dat ?  All the XP machines here do that.

-- 
	Dan Nelson
	dnelson@allantgroup.com

From owner-freebsd-arch@FreeBSD.ORG  Sat Mar 29 12:50:52 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 9484F37B401
	for <arch@freebsd.org>; Sat, 29 Mar 2003 12:50:52 -0800 (PST)
Received: from rwcrmhc51.attbi.com (rwcrmhc51.attbi.com [204.127.198.38])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 1494A43F93
	for <arch@freebsd.org>; Sat, 29 Mar 2003 12:50:52 -0800 (PST)
	(envelope-from julian@elischer.org)
Received: from interjet.elischer.org
	(12-232-168-4.client.attbi.com[12.232.168.4])
	by rwcrmhc51.attbi.com (rwcrmhc51) with ESMTP
	id <20030329205051051000o6dne>; Sat, 29 Mar 2003 20:50:51 +0000
Received: from localhost (localhost.elischer.org [127.0.0.1])
	by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id MAA80861;
	Sat, 29 Mar 2003 12:50:48 -0800 (PST)
Date: Sat, 29 Mar 2003 12:50:46 -0800 (PST)
From: Julian Elischer <julian@elischer.org>
To: Terry Lambert <tlambert2@mindspring.com>
In-Reply-To: <3E852ABD.E77EA566@mindspring.com>
Message-ID: <Pine.BSF.4.21.0303291250360.80824-100000@InterJet.elischer.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: arch@freebsd.org
cc: Scott Long <scott_long@btc.adaptec.com>
cc: Daniel Eischen <eischen@pcnet1.pcnet.com>
Subject: Re: 1:1 threading.
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 29 Mar 2003 20:50:55 -0000


On Fri, 28 Mar 2003, Terry Lambert wrote:

> Julian Elischer wrote:
> > I'm happy to see the limit of (NKSEs !> NCPU) lifted for processes that
> > are in some way identified as 1:1 mode processes..
> > I don't want to lift it for KSE mode processes however.
> > 
> > For system scope threads, I guess you just allocate a separate KSEGRP
> > so it has somewhere to store pertinent info.
> > 
> > that makes it rather simple
> > system scope threads have a thread, a KSE and a KSEGRP
> > process scope threads just use the existing KSEGRP.
> > 
> > Everythiong should just "fall out correctly" by doing this..
> 
> Except that means for process scope threads, you don't get SMP
> scalability, since the single KSEGRP binds them all to a single
> CPU... right?


no

> 
> -- Terry
> 

From owner-freebsd-arch@FreeBSD.ORG  Sat Mar 29 14:02:44 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP
	id B119237B401; Sat, 29 Mar 2003 14:02:44 -0800 (PST)
Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163])
	by mx1.FreeBSD.org (Postfix) with ESMTP
	id C431A43FAF; Sat, 29 Mar 2003 14:02:43 -0800 (PST)
	(envelope-from phk@phk.freebsd.dk)
Received: from critter.freebsd.dk (localhost [127.0.0.1])
	by critter.freebsd.dk (8.12.8/8.12.8) with ESMTP id h2TM2YSM010262;
	Sat, 29 Mar 2003 23:02:38 +0100 (CET)
	(envelope-from phk@phk.freebsd.dk)
To: Wes Peters <wes@softweyr.com>
From: "Poul-Henning Kamp" <phk@phk.freebsd.dk>
In-Reply-To: Your message of "Fri, 28 Mar 2003 09:10:32 PST."
             <200303280910.32307.wes@softweyr.com> 
Date: Sat, 29 Mar 2003 23:02:34 +0100
Message-ID: <10261.1048975354@critter.freebsd.dk>
cc: David Schultz <das@FreeBSD.ORG>
cc: freebsd-arch@FreeBSD.ORG
Subject: Re: Patch to protect process from pageout killing 
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 29 Mar 2003 22:02:45 -0000

In message <200303280910.32307.wes@softweyr.com>, Wes Peters writes:

>I've reworked my patch to use the madvise(2) syscall, like the original 
>4.x patch did.  I've even documented it, in a man page of all places.  
>Please see attached patch.  If nobody objects, I'll commit sometime this 
>weekend.

I'm still not certain about the inheritance of this, do we want/is it
inherited ?

Also, thinking about it, on at least a handful of machines I would
have more use for MADV_KILLMEFIRST having the exact opposite
behaviour.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.

From owner-freebsd-arch@FreeBSD.ORG  Sat Mar 29 15:34:40 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id BA21837B401
	for <arch@freebsd.org>; Sat, 29 Mar 2003 15:34:40 -0800 (PST)
Received: from harmony.village.org (rover.bsdimp.com [204.144.255.66])
	by mx1.FreeBSD.org (Postfix) with ESMTP id CCEB443FE5
	for <arch@freebsd.org>; Sat, 29 Mar 2003 15:34:39 -0800 (PST)
	(envelope-from imp@bsdimp.com)
Received: from localhost (warner@rover2.village.org [10.0.0.1])
	by harmony.village.org (8.12.8/8.12.3) with ESMTP id h2TNYbA7016298
	for <arch@freebsd.org>; Sat, 29 Mar 2003 16:34:37 -0700 (MST)
	(envelope-from imp@bsdimp.com)
Date: Sat, 29 Mar 2003 16:33:43 -0700 (MST)
Message-Id: <20030329.163343.53040416.imp@bsdimp.com>
To: arch@freebsd.org
From: "M. Warner Losh" <imp@bsdimp.com>
X-Mailer: Mew version 2.1 on Emacs 21.2 / Mule 5.0 (SAKAKI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Subject: depend + all vs dependall
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 29 Mar 2003 23:34:42 -0000

NetBSD created a dependall target some time ago.  This target does a
make depend and then a make all so they only have to traverse the tree
once for these two stages rather than twice.  The time of a buildworld
came up in a discussion recently and I thought I'd see how hard it
would be to do something similar in FreeBSD.  Here are my preliminary
results.

Machine: Dell Inspiron 8000, 256M RAM, P3-700
time make buildworld
(2:04:34 wall time, didn't save the actual output :-(.

Machine: Dual Athlon XP2000+ 1.5G RAM aac controller.

time make buildworld -j 8 -s

run 0: did the above to 'flush the caches/load the sources in ram'

Pre-change:

Run 1:
1941.458u 723.640s 32:23.67 137.1%      2747+2215k 1447+145802io 465pf+0w
Run 2:
1942.160u 729.972s 31:45.84 140.2%      2748+2212k 1423+145755io 465pf+0w

After Changes:

Run 1:
1922.767u 723.847s 30:48.64 143.1%      2785+2201k 1312+148256io 465pf+0w
Run 2:
1922.661u 725.477s 30:49.99 143.1%      2788+2201k 1378+148489io 465pf+0w

So it looks like it saves a little over a minute out of 32 (1925s
average vs 1849s average, or almost a 4% reduction) on my big build
box.

My only concern with the patches is that they might interact badly
with a bug I remember from the FreeBSD 1.1R days, but can't reproduce,
in make.  Once upon a time, 'make depend all' was different than 'make
depend && make all' because the .depend files weren't re-read after
the depend phase, but before the all phase, whereas two makes this
would be the case.  Since this change combines the two, I'm a little
worried about that.  Is that still a bug in FreeBSD's make?  It won't
matter for a pure, virgin tree, but might for incremental builds...

Comments?

Warner

http://perforce.freebsd.org/chv.cgi?CH=27577

Change 27577 by imp@imp_hammer on 2003/03/29 11:24:15

	create a new dependall target.
	# I don't know if the ancient bug about depend is fixed or not.

Affected files ...

.. //depot/user/imp/freebsd-imp/Makefile#14 edit
.. //depot/user/imp/freebsd-imp/Makefile.inc1#18 edit
.. //depot/user/imp/freebsd-imp/share/mk/bsd.README#3 edit
.. //depot/user/imp/freebsd-imp/share/mk/bsd.dep.mk#3 edit
.. //depot/user/imp/freebsd-imp/share/mk/bsd.subdir.mk#2 edit

Differences ...

==== //depot/user/imp/freebsd-imp/Makefile#14 (text+ko) ====

@@ -89,8 +89,8 @@
 # order, but that's not important.
 #
 TGTS=	all all-man buildkernel buildtools buildworld checkdpadd clean \
-	cleandepend cleandir depend distribute distributeworld everything \
-	hierarchy install installcheck installkernel \
+	cleandepend cleandir depend dependall distribute distributeworld \
+	everything hierarchy install installcheck installkernel \
 	reinstallkernel installmost installworld libraries lint maninstall \
 	mk most obj objlink regress rerelease tags update
 
@@ -189,8 +189,7 @@
 	@echo "--------------------------------------------------------------"
 	@cd ${.CURDIR}/usr.bin/make; \
 		${MMAKE} obj && \
-		${MMAKE} depend && \
-		${MMAKE} all && \
+		${MMAKE} dependall && \
 		${MMAKE} install DESTDIR=${MAKEPATH} BINDIR=
 
 #

==== //depot/user/imp/freebsd-imp/Makefile.inc1#18 (text+ko) ====

@@ -32,7 +32,7 @@
 #
 # Standard targets (not defined here) are documented in the makefiles in
 # /usr/share/mk.  These include:
-#		obj depend all install clean cleandepend cleanobj
+#		obj depend dependall all install clean cleandepend cleanobj
 
 # Put initial settings here.
 SUBDIR=
@@ -319,18 +319,12 @@
 	@echo ">>> stage 4: building libraries"
 	@echo "--------------------------------------------------------------"
 	cd ${.CURDIR}; ${WMAKE} -DNOHTML -DNOINFO -DNOMAN -DNOFSCHG libraries
-_depend:
-	@echo
-	@echo "--------------------------------------------------------------"
-	@echo ">>> stage 4: make dependencies"
-	@echo "--------------------------------------------------------------"
-	cd ${.CURDIR}; ${WMAKE} par-depend
 everything:
 	@echo
 	@echo "--------------------------------------------------------------"
 	@echo ">>> stage 4: building everything.."
 	@echo "--------------------------------------------------------------"
-	cd ${.CURDIR}; ${WMAKE} all
+	cd ${.CURDIR}; ${WMAKE} dependall
 
 
 WMAKE_TOOL_TGTS=
@@ -341,7 +335,7 @@
 .if !defined(SUBDIR_OVERRIDE)
 WMAKE_TOOL_TGTS+=	_cross-tools
 .endif
-WMAKE_TGTS=	${WMAKE_TOOL_TGTS} _includes _libraries _depend everything
+WMAKE_TGTS=	${WMAKE_TOOL_TGTS} _includes _libraries everything
 
 buildworld: ${WMAKE_TGTS}
 .ORDER: ${WMAKE_TGTS}
@@ -501,7 +495,7 @@
 	    ${KMAKEENV} ${MAKE} KERNEL=${INSTKERNNAME} obj
 # XXX - Gratuitously builds aicasm in the ``makeoptions NO_MODULES'' case.
 .if !defined(MODULES_WITH_WORLD) && !defined(NO_MODULES) && exists(${KRNLSRCDIR}/modules)
-.for target in obj depend all
+.for target in obj dependall
 	cd ${.CURDIR}/sys/modules/aic7xxx/aicasm; \
 	    MAKEOBJDIRPREFIX=${KRNLOBJDIR}/${_kernel}/modules \
 	    ${MAKE} -DNO_CPU_CFLAGS ${target}
@@ -509,10 +503,11 @@
 .endif
 .if !defined(NO_KERNELDEPEND)
 	cd ${KRNLOBJDIR}/${_kernel}; \
-	    ${KMAKEENV} ${MAKE} KERNEL=${INSTKERNNAME} depend -DNO_MODULES_OBJ
-.endif
+	    ${KMAKEENV} ${MAKE} KERNEL=${INSTKERNNAME} dependall -DNO_MODULES_OBJ
+.else
 	cd ${KRNLOBJDIR}/${_kernel}; \
 	    ${KMAKEENV} ${MAKE} KERNEL=${INSTKERNNAME} all -DNO_MODULES_OBJ
+.endif
 	@echo "--------------------------------------------------------------"
 	@echo ">>> Kernel build for ${_kernel} completed on `LC_ALL=C date`"
 	@echo "--------------------------------------------------------------"
@@ -620,8 +615,7 @@
 	${ECHODIR} "===> ${_tool}"; \
 		cd ${.CURDIR}/${_tool}; \
 		${MAKE} DIRPRFX=${_tool}/ obj; \
-		${MAKE} DIRPRFX=${_tool}/ depend; \
-		${MAKE} DIRPRFX=${_tool}/ all; \
+		${MAKE} DIRPRFX=${_tool}/ dependall; \
 		${MAKE} DIRPRFX=${_tool}/ DESTDIR=${MAKEOBJDIRPREFIX} install
 .endfor
 
@@ -681,8 +675,7 @@
 	${ECHODIR} "===> ${_tool}"; \
 		cd ${.CURDIR}/${_tool}; \
 		${MAKE} DIRPRFX=${_tool}/ obj; \
-		${MAKE} DIRPRFX=${_tool}/ depend; \
-		${MAKE} DIRPRFX=${_tool}/ all; \
+		${MAKE} DIRPRFX=${_tool}/ dependall; \
 		${MAKE} DIRPRFX=${_tool}/ DESTDIR=${MAKEOBJDIRPREFIX} install
 .endfor
 
@@ -762,8 +755,7 @@
 .if exists(${.CURDIR}/${_lib})
 	${ECHODIR} "===> ${_lib}"; \
 		cd ${.CURDIR}/${_lib}; \
-		${MAKE} DIRPRFX=${_lib}/ depend; \
-		${MAKE} DIRPRFX=${_lib}/ all; \
+		${MAKE} DIRPRFX=${_lib}/ dependall; \
 		${MAKE} DIRPRFX=${_lib}/ install
 .endif
 .endfor
@@ -782,7 +774,7 @@
 _prebuild_libs: ${_prebuild_libs:S/$/__L/}
 _generic_libs: ${_generic_libs:S/$/__L/}
 
-.for __target in clean cleandepend cleandir depend includes obj
+.for __target in clean cleandepend cleandir depend dependall includes obj
 .for entry in ${SUBDIR}
 ${entry}.${__target}__D: .PHONY
 	@if test -d ${.CURDIR}/${entry}.${MACHINE_ARCH}; then \

==== //depot/user/imp/freebsd-imp/share/mk/bsd.README#3 (text+ko) ====

@@ -169,6 +169,8 @@
 	depend:
 		make the dependencies for the source files, and store
 		them in the file .depend.
+	dependall:
+		make depend then make all
 	install:
 		install the program and its manual pages; if the Makefile
 		does not itself define the target install, the targets

==== //depot/user/imp/freebsd-imp/share/mk/bsd.dep.mk#3 (text+ko) ====

@@ -31,6 +31,9 @@
 #		Make the dependencies for the source files, and store
 #		them in the file ${DEPENDFILE}.
 #
+#	dependall:
+#		make depend and then all
+#
 #	tags:
 #		In "ctags" mode, create a tags file for the source files.
 #		In "gtags" mode, create a (GLOBAL) gtags file for the
@@ -183,3 +186,7 @@
 		echo "LDADD -> $$ldadd1" ; \
 	fi
 .endif
+
+.PHONY: dependall
+.ORDER: afterdepend all
+dependall: depend all

==== //depot/user/imp/freebsd-imp/share/mk/bsd.subdir.mk#2 (text+ko) ====

@@ -25,8 +25,8 @@
 # 		put the stuff into the right "distribution".
 #
 #	afterinstall, all, all-man, beforeinstall, checkdpadd,
-#	clean, cleandepend, cleandir, depend, install, lint, maninstall,
-#	obj, objlink, realinstall, regress, tags
+#	clean, cleandepend, cleandir, depend, dependall, install, lint,
+#	maninstall, obj, objlink, realinstall, regress, tags
 #
 
 .include <bsd.init.mk>
@@ -67,7 +67,7 @@
 
 
 .for __target in all all-man checkdpadd clean cleandepend cleandir \
-    depend distribute lint maninstall \
+    depend dependall distribute lint maninstall \
     obj objlink realinstall regress tags
 ${__target}: _SUBDIR
 .endfor


From owner-freebsd-arch@FreeBSD.ORG  Sat Mar 29 15:45:00 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 78F7437B401
	for <arch@freebsd.org>; Sat, 29 Mar 2003 15:45:00 -0800 (PST)
Received: from harmony.village.org (rover.bsdimp.com [204.144.255.66])
	by mx1.FreeBSD.org (Postfix) with ESMTP id A32D043F75
	for <arch@freebsd.org>; Sat, 29 Mar 2003 15:44:59 -0800 (PST)
	(envelope-from imp@bsdimp.com)
Received: from localhost (warner@rover2.village.org [10.0.0.1])
	by harmony.village.org (8.12.8/8.12.3) with ESMTP id h2TNivA7016348;
	Sat, 29 Mar 2003 16:44:58 -0700 (MST)
	(envelope-from imp@bsdimp.com)
Date: Sat, 29 Mar 2003 16:44:03 -0700 (MST)
Message-Id: <20030329.164403.54601077.imp@bsdimp.com>
To: des@ofug.org
From: "M. Warner Losh" <imp@bsdimp.com>
In-Reply-To: <xzpu1dm2k2h.fsf@flood.ping.uio.no>
References: <xzpu1dm2k2h.fsf@flood.ping.uio.no>
X-Mailer: Mew version 2.1 on Emacs 21.2 / Mule 5.0 (SAKAKI)
Mime-Version: 1.0
Content-Type: Text/Plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
cc: arch@freebsd.org
Subject: Re: Allow underscores in DNS names
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sat, 29 Mar 2003 23:45:03 -0000

When this has come up in the past, it was decreed that _ is a bad bad
bad bad idea, even though people want it.  You might want to check the
ancient archives (1998?) for all the reasons why.

Warner

From owner-freebsd-arch@FreeBSD.ORG  Sat Mar 29 16:18:41 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id C2D1C37B401
	for <arch@freebsd.org>; Sat, 29 Mar 2003 16:18:41 -0800 (PST)
Received: from eden.barryp.org (host-150-32-220-24.midco.net [24.220.32.150])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 0C7B843FA3
	for <arch@freebsd.org>; Sat, 29 Mar 2003 16:18:41 -0800 (PST)
	(envelope-from bp@barryp.org)
Received: from [10.66.0.248] (helo=barryp.org)
	by eden.barryp.org with esmtp (Exim 4.10)
	id 18zQX0-0004qn-00
	for arch@freebsd.org; Sat, 29 Mar 2003 18:18:38 -0600
Message-ID: <3E8637DE.3080003@barryp.org>
Date: Sat, 29 Mar 2003 18:18:38 -0600
From: Barry Pederson <bp@barryp.org>
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US;
	rv:1.3) Gecko/20030312
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: arch@freebsd.org
References: <xzpu1dm2k2h.fsf@flood.ping.uio.no>
	<20030329204104.GF74971@dan.emsphone.com>
In-Reply-To: <20030329204104.GF74971@dan.emsphone.com>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
X-SpamTrack: NO 62
Subject: Re: Allow underscores in DNS names
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 30 Mar 2003 00:18:43 -0000

Dan Nelson wrote:
> In the last episode (Mar 29), Dag-Erling Smorgrav said:
> 
>>The attached patch, inspired by a discussion on -STABLE, modifies our
>>resolver library to allow underscores in host names, by classifying
>>the underscore as a hyphen character.  Even though RFC952 forbids
>>them, underscores are becoming increasingly common in DNS, and they
>>are sometimes used for mechanisms (such as Microsoft's automatic proxy
>>configuration scheme) which we might want to support in FreeBSD.
> 
> 
> I thought proxy autodetect used wpad.domainname.com or looked up
> http://domainname.com/wpad.dat ?  All the XP machines here do that.

The underscore in DNS names is showing up in things like RFC2872 (A DNS RR 
for specifying the location of services), and "DNS-based Service discovery" 
as found in Zeroconf/Rendezvous.

     Barry

From owner-freebsd-arch@FreeBSD.ORG  Sat Mar 29 16:24:49 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 99C2137B401
	for <arch@freebsd.org>; Sat, 29 Mar 2003 16:24:49 -0800 (PST)
Received: from obsecurity.dyndns.org
	(adsl-63-207-60-150.dsl.lsan03.pacbell.net [63.207.60.150])
	by mx1.FreeBSD.org (Postfix) with ESMTP id F1C2343F75
	for <arch@freebsd.org>; Sat, 29 Mar 2003 16:24:48 -0800 (PST)
	(envelope-from kris@obsecurity.org)
Received: from rot13.obsecurity.org (rot13.obsecurity.org [10.0.0.5])
	by obsecurity.dyndns.org (Postfix) with ESMTP
	id BACFA66E05; Sat, 29 Mar 2003 16:24:48 -0800 (PST)
Received: by rot13.obsecurity.org (Postfix, from userid 1000)
	id 9F9771298; Sat, 29 Mar 2003 16:24:48 -0800 (PST)
Date: Sat, 29 Mar 2003 16:24:48 -0800
From: Kris Kennaway <kris@obsecurity.org>
To: "M. Warner Losh" <imp@bsdimp.com>
Message-ID: <20030330002448.GA32150@rot13.obsecurity.org>
References: <20030329.163343.53040416.imp@bsdimp.com>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="4Ckj6UjgE2iN1+kY"
Content-Disposition: inline
In-Reply-To: <20030329.163343.53040416.imp@bsdimp.com>
User-Agent: Mutt/1.4i
cc: arch@freebsd.org
Subject: Re: depend + all vs dependall
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 30 Mar 2003 00:24:50 -0000


--4Ckj6UjgE2iN1+kY
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Sat, Mar 29, 2003 at 04:33:43PM -0700, M. Warner Losh wrote:

> My only concern with the patches is that they might interact badly
> with a bug I remember from the FreeBSD 1.1R days, but can't reproduce,
> in make.  Once upon a time, 'make depend all' was different than 'make
> depend && make all' because the .depend files weren't re-read after
> the depend phase, but before the all phase, whereas two makes this
> would be the case.  Since this change combines the two, I'm a little
> worried about that.  Is that still a bug in FreeBSD's make?  It won't
> matter for a pure, virgin tree, but might for incremental builds...

I'm pretty sure that's still true.

Kris


--4Ckj6UjgE2iN1+kY
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1 (FreeBSD)

iD8DBQE+hjlQWry0BWjoQKURAvSyAJ4/h2XKIIvBigu3+3IKhIC/vCm1AACgvdRH
2fHDR+FDgOiO8yJT6UkEAks=
=As70
-----END PGP SIGNATURE-----

--4Ckj6UjgE2iN1+kY--

From owner-freebsd-arch@FreeBSD.ORG  Sat Mar 29 17:40:48 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 6C81437B401
	for <arch@freebsd.org>; Sat, 29 Mar 2003 17:40:48 -0800 (PST)
Received: from bluejay.mail.pas.earthlink.net (bluejay.mail.pas.earthlink.net
	[207.217.120.218])
	by mx1.FreeBSD.org (Postfix) with ESMTP id ECA4143F75
	for <arch@freebsd.org>; Sat, 29 Mar 2003 17:40:47 -0800 (PST)
	(envelope-from tlambert2@mindspring.com)
Received: from pool0277.cvx40-bradley.dialup.earthlink.net ([216.244.43.22]
	helo=mindspring.com)
	by bluejay.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128)
	(Exim 3.33 #1)	id 18zRoT-0006uf-00; Sat, 29 Mar 2003 17:40:45 -0800
Message-ID: <3E864AD1.6C1C3656@mindspring.com>
Date: Sat, 29 Mar 2003 17:39:29 -0800
From: Terry Lambert <tlambert2@mindspring.com>
X-Mailer: Mozilla 4.79 [en] (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Dag-Erling =?iso-8859-1?Q?Sm=F8rgrav?= <des@ofug.org>
References: <xzpu1dm2k2h.fsf@flood.ping.uio.no>
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a4f14f7a297e07b0d41e6a83a345408d90a2d4e88014a4647c350badd9bab72f9c350badd9bab72f9c
cc: arch@freebsd.org
Subject: Re: Allow underscores in DNS names
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 30 Mar 2003 01:40:50 -0000

Dag-Erling Sm=F8rgrav wrote:
> The attached patch, inspired by a discussion on -STABLE, modifies our
> resolver library to allow underscores in host names, by classifying
> the underscore as a hyphen character.  Even though RFC952 forbids
> them, underscores are becoming increasingly common in DNS, and they
> are sometimes used for mechanisms (such as Microsoft's automatic proxy
> configuration scheme) which we might want to support in FreeBSD.


There was a better patch that made it an option in resolv.conf,
rather than turning it on all the time.

FreeBSD should be standards compliant, by default, and take work
to make it possible to give bogus data to other hosts on the
Internet who can not handle "_" or other characters because they
*are* standars compliant.

"Be conservative in what you send."

-- Terry

From owner-freebsd-arch@FreeBSD.ORG  Sat Mar 29 17:55:16 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id C951137B405
	for <arch@freebsd.org>; Sat, 29 Mar 2003 17:55:16 -0800 (PST)
Received: from puffin.mail.pas.earthlink.net (puffin.mail.pas.earthlink.net
	[207.217.120.139])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 3CA8D43FBF
	for <arch@freebsd.org>; Sat, 29 Mar 2003 17:55:16 -0800 (PST)
	(envelope-from tlambert2@mindspring.com)
Received: from pool0277.cvx40-bradley.dialup.earthlink.net ([216.244.43.22]
	helo=mindspring.com)
	by puffin.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128)
	(Exim 3.33 #1)	id 18zS2N-0007KY-00; Sat, 29 Mar 2003 17:55:08 -0800
Message-ID: <3E864E2F.BA16F6B5@mindspring.com>
Date: Sat, 29 Mar 2003 17:53:51 -0800
From: Terry Lambert <tlambert2@mindspring.com>
X-Mailer: Mozilla 4.79 [en] (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Barry Pederson <bp@barryp.org>
References: <xzpu1dm2k2h.fsf@flood.ping.uio.no>
	<3E8637DE.3080003@barryp.org>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a4f0e6d9741b447bd7c23b04c6dac52717a7ce0e8f8d31aa3f350badd9bab72f9c350badd9bab72f9c
cc: arch@freebsd.org
Subject: Re: Allow underscores in DNS names
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 30 Mar 2003 01:55:20 -0000

Barry Pederson wrote:
> > I thought proxy autodetect used wpad.domainname.com or looked up
> > http://domainname.com/wpad.dat ?  All the XP machines here do that.
> 
> The underscore in DNS names is showing up in things like RFC2872 (A DNS RR
> for specifying the location of services), and "DNS-based Service discovery"
> as found in Zeroconf/Rendezvous.

Excuse me, but that *particular* underscore is a namespace
escape, and is used *precisely* so that it does *NOT* ever
match a valid host name.

People who want the resource records for specific services
are supposed to use a service lookup API, rahter than a host
name lookup API.

Please read the working group documentation for Zeroconf.

Thanks,
-- Terry "A big fan of zeroconf and the death of DHCP" Lambert

From owner-freebsd-arch@FreeBSD.ORG  Sat Mar 29 18:06:03 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 11BCC37B401
	for <arch@freebsd.org>; Sat, 29 Mar 2003 18:06:03 -0800 (PST)
Received: from whizzo.transsys.com (whizzo.TransSys.COM [144.202.42.10])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 4D98A43FBF
	for <arch@freebsd.org>; Sat, 29 Mar 2003 18:06:02 -0800 (PST)
	(envelope-from louie@whizzo.transsys.com)
Received: from whizzo.transsys.com (#6@localhost [127.0.0.1])
	by whizzo.transsys.com (8.12.8/8.12.7) with ESMTP id h2U25vDN037209;
	Sat, 29 Mar 2003 21:05:57 -0500 (EST)
	(envelope-from louie@whizzo.transsys.com)
Message-Id: <200303300205.h2U25vDN037209@whizzo.transsys.com>
X-Mailer: exmh version 2.5 07/13/2001 with nmh-1.0.4
To: Terry Lambert <tlambert2@mindspring.com>
X-Image-URL: http://www.transsys.com/louie/images/louie-mail.jpg
From: "Louis A. Mamakos" <louie@TransSys.COM>
References: <xzpu1dm2k2h.fsf@flood.ping.uio.no>
	<3E864AD1.6C1C3656@mindspring.com> 
In-reply-to: Your message of "Sat, 29 Mar 2003 17:39:29 PST."
             <3E864AD1.6C1C3656@mindspring.com> 
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable
Date: Sat, 29 Mar 2003 21:05:57 -0500
Sender: louie@TransSys.COM
cc: arch@freebsd.org
cc: Dag-Erling =?iso-8859-1?Q?Sm=F8rgrav?= <des@ofug.org>
Subject: Re: Allow underscores in DNS names 
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 30 Mar 2003 02:06:04 -0000

> Dag-Erling Sm=F8rgrav wrote:
> > The attached patch, inspired by a discussion on -STABLE, modifies our=

> > resolver library to allow underscores in host names, by classifying
> > the underscore as a hyphen character.  Even though RFC952 forbids
> > them, underscores are becoming increasingly common in DNS, and they
> > are sometimes used for mechanisms (such as Microsoft's automatic prox=
y
> > configuration scheme) which we might want to support in FreeBSD.
> =

> =

> There was a better patch that made it an option in resolv.conf,
> rather than turning it on all the time.

This is great, except that you'd don't need to have a resolv.conf
on your system at all; the resolver will default to using a local
caching nameserver.

> FreeBSD should be standards compliant, by default, and take work
> to make it possible to give bogus data to other hosts on the
> Internet who can not handle "_" or other characters because they
> *are* standars compliant.

Since this is a resolver option, you're not handing out names to
other hosts using the DNS infrastructure.

> "Be conservative in what you send."

And liberal in what you receive, which is exactly what modifing
the resolver to not cause gethostbyname() and it's ilk to barf
on these types of names.

There are lots of things in ancient RFCs which probably do not
make as much sense these days as they once did.  If there is a
security issue in applications, they should get fixed regardless.
All this heartburn over what the gethostbyname() library function
chooses to believe from the DNS still doesn't address getting
hostnames out of NIS or /etc/hosts.

louie

From owner-freebsd-arch@FreeBSD.ORG  Sat Mar 29 18:19:47 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id 6E7D437B404
	for <arch@freebsd.org>; Sat, 29 Mar 2003 18:19:47 -0800 (PST)
Received: from stork.mail.pas.earthlink.net (stork.mail.pas.earthlink.net
	[207.217.120.188])
	by mx1.FreeBSD.org (Postfix) with ESMTP id D2CC643F75
	for <arch@freebsd.org>; Sat, 29 Mar 2003 18:19:46 -0800 (PST)
	(envelope-from tlambert2@mindspring.com)
Received: from pool0277.cvx40-bradley.dialup.earthlink.net ([216.244.43.22]
	helo=mindspring.com)
	by stork.mail.pas.earthlink.net with asmtp (SSLv3:RC4-MD5:128)
	(Exim 3.33 #1)	id 18zSQ4-0004KP-00; Sat, 29 Mar 2003 18:19:37 -0800
Message-ID: <3E8653EA.BAF9D765@mindspring.com>
Date: Sat, 29 Mar 2003 18:18:18 -0800
From: Terry Lambert <tlambert2@mindspring.com>
X-Mailer: Mozilla 4.79 [en] (Win98; U)
X-Accept-Language: en
MIME-Version: 1.0
To: "Louis A. Mamakos" <louie@TransSys.COM>
References: <xzpu1dm2k2h.fsf@flood.ping.uio.no>
	<3E864AD1.6C1C3656@mindspring.com>
	<200303300205.h2U25vDN037209@whizzo.transsys.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-ELNK-Trace: b1a02af9316fbb217a47c185c03b154d40683398e744b8a4f408a3acb4ae55e09c4b5c7b5d71a30e3ca473d225a0f487350badd9bab72f9c350badd9bab72f9c
cc: arch@freebsd.org
cc: Dag-Erling =?iso-8859-1?Q?Sm=F8rgrav?= <des@ofug.org>
Subject: Re: Allow underscores in DNS names
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 30 Mar 2003 02:19:49 -0000

"Louis A. Mamakos" wrote:
> > There was a better patch that made it an option in resolv.conf,
> > rather than turning it on all the time.
> 
> This is great, except that you'd don't need to have a resolv.conf
> on your system at all; the resolver will default to using a local
> caching nameserver.

By this argument, it should do that anyway, if the only option
is this one.

My own argument is that there should be an "allow_chars" option
in the resolv.conf, so that the Tuesday after this is committed,
and someone now wants "#" in domain names to support their idea
of mapping phone numbers to domain names, we don't have to go
through this whole dumb "let's violate RFC-952, just this once!"
argument yet againt.


> > FreeBSD should be standards compliant, by default, and take work
> > to make it possible to give bogus data to other hosts on the
> > Internet who can not handle "_" or other characters because they
> > *are* standars compliant.
> 
> Since this is a resolver option, you're not handing out names to
> other hosts using the DNS infrastructure.

You are if you are a caching DNS server, which uses the resolver
code to look up data on the global DNS, caches it, and returns
it to local DNS querants.

It also permits you to do things like put "_" in names in host
files.


If you *must* have a single patch, at *least* the original original
patch (which *also* failed to provide an option for unbreaking
RFC-952 compliance on the systems of people who prefer to comply
with international standards) only allowed the character *interior*
to the domain names (i.e. after the first character).

That, *at least* hept it from interfering accidently with the
service location resource records for zeroconf.


> > "Be conservative in what you send."
> 
> And liberal in what you receive, which is exactly what modifing
> the resolver to not cause gethostbyname() and it's ilk to barf
> on these types of names.

And liberal in what you resend?

You can't have it both ways.

Reading the 1998 discussion, as was previously suggested, is a
good idea.


> There are lots of things in ancient RFCs which probably do not
> make as much sense these days as they once did.

There is a fix for that: join an IETF group, and create a
"supercedes" RFC.

The standards are the standards, as they are.


> If there is a security issue in applications, they should get
> fixed regardless.

OK.

So you are advocating getting rid of the stupid "This program
uses gets(), which is unsafe" messages, right?

Because the programs where the API that is being used lead to a
security isseu in applications, when people do not know how to
use the API properly.


> All this heartburn over what the gethostbyname() library function
> chooses to believe from the DNS still doesn't address getting
> hostnames out of NIS or /etc/hosts.

NIS and /etc/hosts should *NEVER* contain a host name with an
"_".  *NEVER*.

-- Terry

From owner-freebsd-arch@FreeBSD.ORG  Sat Mar 29 22:12:25 2003
Return-Path: <owner-freebsd-arch@FreeBSD.ORG>
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
	by hub.freebsd.org (Postfix) with ESMTP id A098737B401
	for <arch@freebsd.org>; Sat, 29 Mar 2003 22:12:25 -0800 (PST)
Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16])
	by mx1.FreeBSD.org (Postfix) with ESMTP id 65AE843F3F
	for <arch@freebsd.org>; Sat, 29 Mar 2003 22:12:24 -0800 (PST)
	(envelope-from bde@zeta.org.au)
Received: from katana.zip.com.au (katana.zip.com.au [61.8.7.246])
	by mailman.zeta.org.au (8.9.3/8.8.7) with ESMTP id QAA00052;
	Sun, 30 Mar 2003 16:12:10 +1000
Date: Sun, 30 Mar 2003 16:12:09 +1000 (EST)
From: Bruce Evans <bde@zeta.org.au>
X-X-Sender: bde@gamplex.bde.org
To: "M. Warner Losh" <imp@bsdimp.com>
In-Reply-To: <20030329.163343.53040416.imp@bsdimp.com>
Message-ID: <20030330150957.M13638@gamplex.bde.org>
References: <20030329.163343.53040416.imp@bsdimp.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
cc: arch@freebsd.org
Subject: Re: depend + all vs dependall
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.1
Precedence: list
List-Id: Discussion related to FreeBSD architecture
	<freebsd-arch.freebsd.org>
List-Unsubscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=unsubscribe>
List-Archive: <http://lists.freebsd.org/pipermail/freebsd-arch>
List-Post: <mailto:freebsd-arch@freebsd.org>
List-Help: <mailto:freebsd-arch-request@freebsd.org?subject=help>
List-Subscribe: <http://lists.freebsd.org/mailman/listinfo/freebsd-arch>,
	<mailto:freebsd-arch-request@freebsd.org?subject=subscribe>
X-List-Received-Date: Sun, 30 Mar 2003 06:12:28 -0000

On Sat, 29 Mar 2003, M. Warner Losh wrote:

> NetBSD created a dependall target some time ago.  This target does a
> make depend and then a make all so they only have to traverse the tree
> once for these two stages rather than twice.  The time of a buildworld
> came up in a discussion recently and I thought I'd see how hard it
> would be to do something similar in FreeBSD.  Here are my preliminary
> results.
>
> Machine: Dell Inspiron 8000, 256M RAM, P3-700
> time make buildworld
> (2:04:34 wall time, didn't save the actual output :-(.
>
> Machine: Dual Athlon XP2000+ 1.5G RAM aac controller.
>
> time make buildworld -j 8 -s

Note that all benchmarks using -j are invalid because of nondeterministic
wait times of up to 100 msec for each job.  This pessimizes makeworld -j 4
times by about 20% on a non-dual Athlon XP1600, and can't do good things
for the variance.  The pessimization is larger on faster machines of
course.  This is fixed in NetBSD.  FreeBSD only has the hackaround of
reducing the timeout from 500 msec to 100 msec.

> run 0: did the above to 'flush the caches/load the sources in ram'
>
> Pre-change:
>
> Run 1:
> 1941.458u 723.640s 32:23.67 137.1%      2747+2215k 1447+145802io 465pf+0w
> Run 2:
> 1942.160u 729.972s 31:45.84 140.2%      2748+2212k 1423+145755io 465pf+0w

The SMP overheads seem to be very large.  I get the following times on a
non-Dual Athlon XP1600 overclocked 256MB RAM ide controller 2 drives:

%%%
--------------------------------------------------------------
>>> elf make world completed on Sun Mar  2 16:30:55 EST 2003
                       (started Sun Mar  2 15:53:15 EST 2003)
--------------------------------------------------------------
     2260.31 real      1729.55 user       326.24 sys
%%%

This machine had lost 256MB of its RAM at the time of the above benchmark
(the latest one that I have with no local changes to the src tree).
Losing 256MB cost it a 100-200 seconds.  Upgrading to 1024 MB RAM
improved on its old speed of 1967 seconds to 1943 seconds (both of
these times with local changes).  The disk cache is cold in all of my
makeworld benchmarks.  A warm cache wouldn't have much helped with
512MB RAM since that is not quite enough to cache the src tree, but
it reduces the makeworld times a little more with 1024 MB RAM.

> After Changes:
>
> Run 1:
> 1922.767u 723.847s 30:48.64 143.1%      2785+2201k 1312+148256io 465pf+0w
> Run 2:
> 1922.661u 725.477s 30:49.99 143.1%      2788+2201k 1378+148489io 465pf+0w
>
> So it looks like it saves a little over a minute out of 32 (1925s
> average vs 1849s average, or almost a 4% reduction) on my big build
> box.

It is a bug for make depend to be run at all in the default (non-NOCLEAN)
case.  My commits for this got clobbered, but I still use them here.
This seems to save only about 5% currently (down from 10% when the
changes were committed in 1998).

> My only concern with the patches is that they might interact badly
> with a bug I remember from the FreeBSD 1.1R days, but can't reproduce,
> in make.  Once upon a time, 'make depend all' was different than 'make
> depend && make all' because the .depend files weren't re-read after
> the depend phase, but before the all phase, whereas two makes this
> would be the case.  Since this change combines the two, I'm a little
> worried about that.  Is that still a bug in FreeBSD's make?  It won't
> matter for a pure, virgin tree, but might for incremental builds...

This is not a bug, but is how make works.  It shouldn't be a problem if
dependall is implemented correctly.  dependall should avoid the double
tree traversal but somehow build "depend" and "all" sequentially in
leaf directories.

> ==== //depot/user/imp/freebsd-imp/share/mk/bsd.README#3 (text+ko) ====
>
> @@ -169,6 +169,8 @@
>  	depend:
>  		make the dependencies for the source files, and store
>  		them in the file .depend.
> +	dependall:
> +		make depend then make all

This at least describes a correct implementation :-).

>  	install:
>  		install the program and its manual pages; if the Makefile
>  		does not itself define the target install, the targets
>
> ==== //depot/user/imp/freebsd-imp/share/mk/bsd.dep.mk#3 (text+ko) ====
>
> @@ -31,6 +31,9 @@
>  #		Make the dependencies for the source files, and store
>  #		them in the file ${DEPENDFILE}.
>  #
> +#	dependall:
> +#		make depend and then all
> +#

The wording is different from that in the README, and is poor in both
places.

>  #	tags:
>  #		In "ctags" mode, create a tags file for the source files.
>  #		In "gtags" mode, create a (GLOBAL) gtags file for the
> @@ -183,3 +186,7 @@
>  		echo "LDADD -> $$ldadd1" ; \
>  	fi
>  .endif
> +
> +.PHONY: dependall
> +.ORDER: afterdepend all
> +dependall: depend all

.PHONY doesn't work right with BSD make, and is not use for any of the
other phony depend targets in FreeBSD.

The dependencies seem to be correct, but I think it's a style bug to
have afterdepend in the .ORDER statement instead of "depend", at least
in FreeBSD.  afterdepend isn't actually done after "depend"; "depend"
depends on afterdepend so the latter is part of the former (this is
another style bug).

Bruce