From owner-freebsd-arch  Sun Apr 29  0:58: 2 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163])
	by hub.freebsd.org (Postfix) with ESMTP
	id 8438E37B422; Sun, 29 Apr 2001 00:58:00 -0700 (PDT)
	(envelope-from phk@critter.freebsd.dk)
Received: from critter (localhost [127.0.0.1])
	by critter.freebsd.dk (8.11.3/8.11.3) with ESMTP id f3T7vjU35285;
	Sun, 29 Apr 2001 09:57:46 +0200 (CEST)
	(envelope-from phk@critter.freebsd.dk)
To: scanner@jurai.net
Cc: Robert Watson <rwatson@FreeBSD.ORG>, freebsd-arch@FreeBSD.ORG
Subject: Re: jailNG 
In-Reply-To: Your message of "Sat, 28 Apr 2001 19:49:59 EDT."
             <Pine.BSF.4.21.0104281944550.84976-100000@sasami.jurai.net> 
Date: Sun, 29 Apr 2001 09:57:45 +0200
Message-ID: <35283.988531065@critter>
From: Poul-Henning Kamp <phk@critter.freebsd.dk>
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

In message <Pine.BSF.4.21.0104281944550.84976-100000@sasami.jurai.net>, scanner@jurai.net writes:
>
>It is my understanding from the OpenRoot project that jail currently does
>not allow ICMP to work inside a jail? If this is so, this seriously
>damages services that need Path MTU-D such as SMTP and HTTP. Surely this
>is not the case? Can someone enlighten me on this.

ICMP works just as usual, but programs which use raw IP sockets,
such as ping(8), do not.
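
To make the distinction concrete, here is a minimal sketch in C (not taken
from the jail code). It asks the kernel for an ordinary TCP socket and for a
raw ICMP socket of the kind ping(8) opens. The helper names try_socket() and
report_socket_access() are made up for illustration, and the exact errno a
jailed process gets back for the raw socket is an assumption that may differ
between releases.

```c
#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Try to create an AF_INET socket of the given type and protocol.
 * Returns 0 on success, or the errno value set by socket(2) on failure.
 */
int
try_socket(int type, int proto)
{
	int fd = socket(AF_INET, type, proto);

	if (fd < 0)
		return (errno);
	close(fd);
	return (0);
}

/* Print whether each kind of socket can be created in this environment. */
void
report_socket_access(void)
{
	int tcp_err = try_socket(SOCK_STREAM, 0);
	int raw_err = try_socket(SOCK_RAW, IPPROTO_ICMP);

	printf("TCP socket:      %s\n", tcp_err ? strerror(tcp_err) : "ok");
	printf("raw ICMP socket: %s\n", raw_err ? strerror(raw_err) : "ok");
}
```

Called from a process inside a jail, report_socket_access() would be expected
to print "ok" for the TCP line and an error for the raw line.  ICMP handled by
the kernel itself, including the messages Path MTU discovery relies on, never
goes through a raw socket, which is why SMTP and HTTP are unaffected.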

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Sun Apr 29 18:38:50 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from mail5.speakeasy.net (mail5.speakeasy.net [216.254.0.205])
	by hub.freebsd.org (Postfix) with SMTP id B5F3937B42C
	for <arch@FreeBSD.org>; Sun, 29 Apr 2001 18:38:46 -0700 (PDT)
	(envelope-from jon@licq.org)
Received: (qmail 47032 invoked from network); 30 Apr 2001 01:38:46 -0000
Received: from unknown (HELO dsl254-020-111.sea1.dsl.speakeasy.net) ([216.254.20.111]) (envelope-sender <jon@licq.org>)
          by mail5.speakeasy.net (qmail-ldap-1.03) with SMTP
          for <rwatson@FreeBSD.org>; 30 Apr 2001 01:38:46 -0000
Content-Type: text/plain;
  charset="iso-8859-1"
From: Jon Keating <jon@licq.org>
To: Robert Watson <rwatson@FreeBSD.org>
Subject: Re: suser security
Date: Sun, 29 Apr 2001 20:38:44 -0500
X-Mailer: KMail [version 1.2]
Cc: tmm@FreeBSD.org, Eivind Eklund <eivind@FreeBSD.org>,
	arch@FreeBSD.org, trustedbsd-discuss@TrustedBSD.org
References: <Pine.NEB.3.96L.1010411222312.97516F-100000@fledge.watson.org>
In-Reply-To: <Pine.NEB.3.96L.1010411222312.97516F-100000@fledge.watson.org>
MIME-Version: 1.0
Message-Id: <01042920384400.01077@dsl254-020-111.sea1.dsl.speakeasy.net>
Content-Transfer-Encoding: 8bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Robert-

Now it's my turn to be sorry for the delay.  Being sick while a lot of school
projects and stuff are going on at work really makes me just wanna go straight to
bed as soon as I get home.  After 8 weeks of being sick, I think I'm better;
no sore throat today.

I have a simple question about POSIX.1e.  I saw the link to download the
withdrawn draft.  Is there actually a standard from IEEE that was accepted
regarding this?

[snip]

>    I suspect that Thomas has caught most of the remaining uid calls,
>    but once he publishes the next version of the capability patch, we'd
>    welcome your work on it to catch remaining instances, and to comment
>    on correctness.

I'm assuming this will be posted on the TrustedBSD mailing list, correct?

>    Andrew Reiter has been preparing an auditing subsystem requirements
>    document, as well as descriptions of implementations on other platforms
>    so that we can go through a more informed design process.  Having
>    partially implemented auditing on FreeBSD twice in the past, I can
>    comment both on the complexity of correct implementation, and the need
>    to be sensitive to issues of simplicity, maintainability, and
>    performance.  Taking a fairly deep look into other implementations will
>    be important to developing an audit implementation that can be accepted
>    by the FreeBSD community.  This is certainly an area where both your
>    contributions in the form of ideas and implementation would be most
>    welcome.

Yes, I would be very interested in assisting with this as soon as the
framework is set up to build upon.  Do you know of any time-frame on this 
though?  Summer is coming up in a few weeks for me and I wanted to get going 
on a personal project and working on FreeBSD, but TrustedBSD seems to be the 
area that I should focus on by reviewing the documentation I've seen about 
it.  I look forward to contributing to this, and would enjoy starting on 
anything that you guys need help with.  Well, starting in two weeks from now. 
 Just let me know how I can contribute most effectively at this moment.

Jon



From owner-freebsd-arch  Sun Apr 29 21:44:33 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from fledge.watson.org (fledge.watson.org [204.156.12.50])
	by hub.freebsd.org (Postfix) with ESMTP
	id 8747837B422; Sun, 29 Apr 2001 21:44:29 -0700 (PDT)
	(envelope-from robert@fledge.watson.org)
Received: from fledge.watson.org (robert@fledge.pr.watson.org [192.0.2.3])
	by fledge.watson.org (8.11.3/8.11.3) with SMTP id f3U4Zaf23859;
	Mon, 30 Apr 2001 00:35:36 -0400 (EDT)
	(envelope-from robert@fledge.watson.org)
Date: Mon, 30 Apr 2001 00:35:36 -0400 (EDT)
From: Robert Watson <rwatson@FreeBSD.org>
X-Sender: robert@fledge.watson.org
To: Jon Keating <jon@licq.org>
Cc: tmm@FreeBSD.org, Eivind Eklund <eivind@FreeBSD.org>,
	arch@FreeBSD.org, trustedbsd-discuss@TrustedBSD.org
Subject: Re: suser security
In-Reply-To: <01042920384400.01077@dsl254-020-111.sea1.dsl.speakeasy.net>
Message-ID: <Pine.NEB.3.96L.1010430002439.23768B-100000@fledge.watson.org>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


On Sun, 29 Apr 2001, Jon Keating wrote:

> I have a simple question about POSIX.1e.  I saw the link to download the
> withdrawn draft.  Is there actually a standard from IEEE that was
> accepted regarding this?

Nope -- draft 17 was the final draft, and after that apparently the
committee decided not to go ahead with it.  Some sections of the draft are
quite useful, and other parts less so.

> >    I suspect that Thomas has caught most of the remaining uid calls,
> >    but once he publishes the next version of the capability patch, we'd
> >    welcome your work on it to catch remaining instances, and to comment
> >    on correctness.
> 
> I'm assuming this will be posted on the TrustedBSD mailing list, correct?

Generally speaking, major patchsets and commits are announced on the
discussion list, with occasional CC's to appropriate FreeBSD lists.  So
far, we've not had the opportunity to use the -announce list much, but
probably should do so more (such as the ACL stuff recently).

> >    Andrew Reiter has been preparing an auditing subsystem requirements
> >    document, as well as descriptions of implementations on other platforms
> >    so that we can go through a more informed design process.  Having
> >    partially implemented auditing on FreeBSD twice in the past, I can
> >    comment both on the complexity of correct implementation, and the need
> >    to be sensitive to issues of simplicity, maintainability, and
> >    performance.  Taking a fairly deep look into other implementations will
> >    be important to developing an audit implementation that can be accepted
> >    by the FreeBSD community.  This is certainly an area where both your
> >    contributions in the form of ideas and implementation would be most
> >    welcome.
> 
> Yes, I would be very interested in assisting with this as soon as the
> framework is set up to build upon.  Do you know of any time-frame on
> this though?  Summer is coming up in a few weeks for me and I wanted to
> get going on a personal project and working on FreeBSD, but TrustedBSD
> seems to be the area that I should focus on by reviewing the
> documentation I've seen about it.  I look forward to contributing to
> this, and would enjoy starting on anything that you guys need help with. 
> Well, starting in two weeks from now.
>  Just let me know how I can contribute most effectively at this moment.

Your help would be much appreciated -- there's always more work to do :-). 
Propelling the auditing work forward some more to get an initial design
done would be useful, as would work to help Chris get userland
applications adapted to use ACLs (he recently committed changes to cp and
mv, but there are plenty left).  As I mentioned before, once the
capabilities kernel code gets a little bit more mature, there will be lots
of work necessary to understand its impact on the userland codebase;
Thomas should be able to provide us with some direction on what work to
look at there.  And now that I've finished up my USENIX paper, I hope to
forge ahead more on the MAC code--one thing I haven't had time to look at,
but would be nice to do, is to implement Type Enforcement (TE) using the
MAC framework.  I don't know how up you are on various access control
technologies, but TE is the policy mechanism used in SELinux, and
developed in part at Secure Computing. 

Robert N M Watson             FreeBSD Core Team, TrustedBSD Project
robert@fledge.watson.org      NAI Labs, Safeport Network Services




From owner-freebsd-arch  Mon Apr 30 10:24:21 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from earth.backplane.com (earth-nat-cw.backplane.com [208.161.114.67])
	by hub.freebsd.org (Postfix) with ESMTP id 6194837B422
	for <Arch@FreeBSD.ORG>; Mon, 30 Apr 2001 10:24:19 -0700 (PDT)
	(envelope-from dillon@earth.backplane.com)
Received: (from dillon@localhost)
	by earth.backplane.com (8.11.2/8.11.2) id f3UHNIc36491;
	Mon, 30 Apr 2001 10:23:18 -0700 (PDT)
	(envelope-from dillon)
Date: Mon, 30 Apr 2001 10:23:18 -0700 (PDT)
From: Matt Dillon <dillon@earth.backplane.com>
Message-Id: <200104301723.f3UHNIc36491@earth.backplane.com>
To: Nate Williams <nate@yogotech.com>
Cc: Alfred Perlstein <bright@wintelcom.net>,
	Daniel Eischen <eischen@vigrid.com>,
	Nate Williams <nate@yogotech.com>,
	Julian Elischer <julian@elischer.org>, Arch@FreeBSD.ORG
Subject: Re: KSE threading support (first parts)
References: <15081.50170.297579.938254@nomad.yogotech.com>
	<Pine.SUN.3.91.1010427154434.12501B-100000@pcnet1.pcnet.com>
	<20010427130826.G18676@fw.wintelcom.net> <15081.53821.755743.746621@nomad.yogotech.com>
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


:>    Being able to have threads used in a "this application wants to
:>    utilize _all_ available system resources" meaning if you have
:>    more than one processor, I want to see mysql, apache, whatever
:>    using it (by default!).  If your model doesn't include this then
:>    please don't bother continuing, the stability issues versus the
:>    gain don't work for me at all.
:
:Having 'serialized' KSE's (which Matt wants) means that an application
:will be *UNABLE* to use all of the system resources, because only one
:thread in a threaded application (apache, mysql, etc..) is allowed to run
:at one time, no matter how many CPU's are there.
:
:
:Nate

    Nonsense.  It means no such thing.  You missed the whole point about
    using rfork().  One application, multiple threads, MULTIPLE PROCESSES
    (using rfork()) supporting those threads, and the kernel contexts
    for ANY GIVEN PROCESS are serialized, *NOT* the kernel contexts across
    all the N processes.

    That is the model we discussed at Yahoo and at at least three BAFUG
    meetings.  That is the model that best fits the current source base.

    Somebody somewhere started mangling the model into something that
    sounds great on paper and in theory, but is going to be god-awful hell to
    implement -- by trying to run multiple KSE's belonging to ONE 
    process CONCURRENTLY, you now have to lock so much shit in the kernel
    that was previously contextual that you will get an across-the-board
    performance drop no matter what.

						-Matt




From owner-freebsd-arch  Mon Apr 30 10:52:14 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from crewsoft.com (ns.aenet.net [157.22.214.1])
	by hub.freebsd.org (Postfix) with ESMTP id 7367D37B422
	for <Arch@freebsd.org>; Mon, 30 Apr 2001 10:52:11 -0700 (PDT)
	(envelope-from cedric@wireless-networks.com)
Received: from [63.197.8.222] (account cberger@wireless-networks.com HELO wireless-networks.com)
  by crewsoft.com (CommuniGate Pro SMTP 3.4.2)
  with ESMTP id 618673; Mon, 30 Apr 2001 10:55:21 -0700
Message-ID: <3AEDA66D.B51B4267@wireless-networks.com>
Date: Mon, 30 Apr 2001 10:52:45 -0700
From: Cedric Berger <cedric@wireless-networks.com>
X-Mailer: Mozilla 4.76 [en] (Windows NT 5.0; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Matt Dillon <dillon@earth.backplane.com>
Cc: Arch@FreeBSD.ORG
Subject: Re: KSE threading support (first parts)
References: <15081.50170.297579.938254@nomad.yogotech.com>
		<Pine.SUN.3.91.1010427154434.12501B-100000@pcnet1.pcnet.com>
		<20010427130826.G18676@fw.wintelcom.net> <15081.53821.755743.746621@nomad.yogotech.com> <200104301723.f3UHNIc36491@earth.backplane.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Sorry if I ask a stupid question, but I'm trying to make sense of this thread.
Here is my question:
 - I've a server that runs a single big java application server (one process,
   tons of threads like every Java app)
 - With the new KSE and friends architecture, will I be able to scale
   my app by adding CPUs?

Thanks,
Cedric


Matt Dillon wrote:

> :>    Being able to have threads used in a "this application wants to
> :>    utilize _all_ available system resources" meaning if you have
> :>    more than one processor, I want to see mysql, apache, whatever
> :>    using it (by default!).  If your model doesn't include this then
> :>    please don't bother continuing, the stability issues versus the
> :>    gain don't work for me at all.
> :
> :Having 'serialized' KSE's (which Matt wants) means that an application
> :will be *UNABLE* to use all of the system resources, because only one
> :thread in a threaded application (apache, mysql, etc..) is allowed to run
> :at one time, no matter how many CPU's are there.
> :
> :
> :Nate
>
>     Nonsense.  It means no such thing.  You missed the whole point about
>     using rfork().  One application, multiple threads, MULTIPLE PROCESSES
>     (using rfork()) supporting those threads, and the kernel contexts
>     for ANY GIVEN PROCESS are serialized, *NOT* the kernel contexts across
>     all the N processes.
>
>     That is the model we discussed at Yahoo and at at least three BAFUG
>     meetings.  That is the model that best fits the current source base.
>
>     Somebody somewhere started mangling the model into something that
>     sounds great on paper and in theory, but is going to be god-awful hell to
>     implement -- by trying to run multiple KSE's belonging to ONE
>     process CONCURRENTLY, you now have to lock so much shit in the kernel
>     that was previously contextual that you will get an across-the-board
>     performance drop no matter what.
>
>                                                 -Matt




From owner-freebsd-arch  Mon Apr 30 18: 6:50 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from gatekeeper.tsc.tdk.com (gatekeeper.tsc.tdk.com [207.113.159.21])
	by hub.freebsd.org (Postfix) with ESMTP
	id E1C1637B422; Mon, 30 Apr 2001 18:06:45 -0700 (PDT)
	(envelope-from gdonl@tsc.tdk.com)
Received: from imap.gv.tsc.tdk.com (imap.gv.tsc.tdk.com [192.168.241.198])
	by gatekeeper.tsc.tdk.com (8.8.8/8.8.8) with ESMTP id SAA20725;
	Mon, 30 Apr 2001 18:06:45 -0700 (PDT)
	(envelope-from gdonl@tsc.tdk.com)
Received: from salsa.gv.tsc.tdk.com (salsa.gv.tsc.tdk.com [192.168.241.194])
	by imap.gv.tsc.tdk.com (8.9.3/8.9.3) with ESMTP id SAA92657;
	Mon, 30 Apr 2001 18:06:45 -0700 (PDT)
	(envelope-from Don.Lewis@tsc.tdk.com)
Received: (from gdonl@localhost)
	by salsa.gv.tsc.tdk.com (8.8.5/8.8.5) id SAA08105;
	Mon, 30 Apr 2001 18:06:44 -0700 (PDT)
From: Don Lewis <Don.Lewis@tsc.tdk.com>
Message-Id: <200105010106.SAA08105@salsa.gv.tsc.tdk.com>
Date: Mon, 30 Apr 2001 18:06:44 -0700
In-Reply-To: <Pine.NEB.3.96L.1010411235010.97720B-100000@fledge.watson.org>
References:  <Pine.NEB.3.96L.1010411235010.97720B-100000@fledge.watson.org>
X-Mailer: Mail User's Shell (7.2.6 beta(5) 10/07/98)
To: Robert Watson <rwatson@FreeBSD.ORG>, freebsd-arch@FreeBSD.ORG
Subject: Re: Combining pcred and ucred
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Apr 12, 12:03am, Robert Watson wrote:
} Subject: Combining pcred and ucred

} The observation may be made that although one possible rationale for the
} division of ucred and pcred is that pcred might get modified while ucred
} remains constant, reducing the number of copy-on-writes, in fact ucred is
} almost always (if not always) modified when the pcred needs to be
} modified.  There is a small space savings associated with not caching the
} real and saved uid/gid's all over the place, but this is negligible due to
} the power-of-two memory allocation model, overhead of maintaining
} additional refcounts, pointers, and uidinfo references, and the
} copy-on-write nature of ucred.  In fact, the pcred pointer substantially
} complicates proc locking, and increases the cost of credential evaluation
} and manipulation by introducing additional structures and dereferences.  I
} think there is a reasonable argument to be made that pcred should be
} merged back into ucred, making ucred the sole source of credential
} information for the process.  In fact, I'd like to propose doing this, as
} it allows the more extensive use of direct ucred to ucred comparison in
} access control, rather than process to process (hence the proc locking
} issue).

I've felt for a long time that pcred and ucred should be merged.  I
can't think of a good reason for them to have been separated in the
first place.  I suspect that they were each factored out of the old
proc and user structures, though I would have expected that only
the effective uid would have migrated from the proc structure to the
pcred structure so that it would be available if the process and
its user structure were swapped.  I guess I'll have to dig through
my library to satisfy my curiosity now.

} There is one additional piece of information, that I know of, that should
} probably move into ucred if pcred moves into ucred: this is the P_SUGID
} flag, which is currently in the process flag field, but might be more
} formally considered a credential downgrade or credential modification
} flag.  Moving this flag into ucred would allow the flag to be available
} for access control operations based on the ucred, which would also have
} substantial structural benefits, albeit at the slight increase in cost
} associated with ucred caching.

I've suggested the same thing in the past.
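
As a rough illustration of the proposal, the sketch below folds the real and
saved ids and a P_SUGID-style flag into one reference-counted, copy-on-write
credential, so that an access check needs only two credential pointers and no
proc structure.  Everything here is hypothetical: struct merged_cred, the
mcred_*() helpers and the field layout are invented for the example and are
not the kernel's ucred or pcred definitions, and the check in
mcred_can_debug() is deliberately simplified rather than FreeBSD's actual
policy.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical merged credential; field names are illustrative only. */
struct merged_cred {
	unsigned int	cr_ref;		/* reference count */
	uid_t		cr_uid;		/* effective uid */
	uid_t		cr_ruid;	/* real uid (held in pcred today) */
	uid_t		cr_svuid;	/* saved uid (held in pcred today) */
	gid_t		cr_gid;		/* effective gid */
	gid_t		cr_rgid;	/* real gid */
	gid_t		cr_svgid;	/* saved gid */
	int		cr_sugid;	/* stand-in for the P_SUGID flag */
};

/* Allocate a credential with all uids and gids set alike; NULL on failure. */
struct merged_cred *
mcred_alloc(uid_t uid, gid_t gid)
{
	struct merged_cred *cr = calloc(1, sizeof(*cr));

	if (cr == NULL)
		return (NULL);
	cr->cr_ref = 1;
	cr->cr_uid = cr->cr_ruid = cr->cr_svuid = uid;
	cr->cr_gid = cr->cr_rgid = cr->cr_svgid = gid;
	return (cr);
}

/* Take another reference to a shared credential. */
struct merged_cred *
mcred_hold(struct merged_cred *cr)
{
	cr->cr_ref++;
	return (cr);
}

/* Drop a reference, freeing the credential when the last one goes. */
void
mcred_free(struct merged_cred *cr)
{
	assert(cr->cr_ref > 0);
	if (--cr->cr_ref == 0)
		free(cr);
}

/*
 * Change the effective uid, copying first if the credential is shared.
 * Returns the credential the caller should use from now on, or NULL if
 * the copy could not be allocated (the original is left untouched).
 */
struct merged_cred *
mcred_seteuid(struct merged_cred *cr, uid_t euid)
{
	struct merged_cred *newcr;

	if (cr->cr_ref > 1) {
		newcr = malloc(sizeof(*newcr));
		if (newcr == NULL)
			return (NULL);
		*newcr = *cr;
		newcr->cr_ref = 1;
		mcred_free(cr);	/* drop our reference to the shared copy */
		cr = newcr;
	}
	cr->cr_uid = euid;
	cr->cr_sugid = 1;	/* record that the credentials were modified */
	return (cr);
}

/* Simplified credential-to-credential check; returns 1 if allowed. */
int
mcred_can_debug(const struct merged_cred *subject,
    const struct merged_cred *target)
{
	if (subject->cr_uid == 0)
		return (1);
	if (target->cr_sugid)
		return (0);
	return (subject->cr_uid == target->cr_ruid);
}
```

Because the flag travels with the credential, a check such as
mcred_can_debug() can be made against a cached credential without locking or
dereferencing either process, which is the structural benefit described above.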



From owner-freebsd-arch  Tue May  1  6:23:32 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from filk.iinet.net.au (syncopation-dns.iinet.net.au [203.59.24.29])
	by hub.freebsd.org (Postfix) with SMTP id C6F0937B422
	for <Arch@FreeBSD.ORG>; Tue,  1 May 2001 06:23:25 -0700 (PDT)
	(envelope-from julian@elischer.org)
Received: (qmail 24799 invoked by uid 666); 1 May 2001 13:26:47 -0000
Received: from i185-211.nv.iinet.net.au (HELO elischer.org) (203.59.185.211)
  by mail.m.iinet.net.au with SMTP; 1 May 2001 13:26:47 -0000
Message-ID: <3AEEB887.353B3CD@elischer.org>
Date: Tue, 01 May 2001 06:22:15 -0700
From: Julian Elischer <julian@elischer.org>
X-Mailer: Mozilla 4.7 [en] (X11; U; FreeBSD 5.0-CURRENT i386)
X-Accept-Language: en, hu
MIME-Version: 1.0
To: Cedric Berger <cedric@wireless-networks.com>
Cc: Matt Dillon <dillon@earth.backplane.com>, Arch@FreeBSD.ORG
Subject: Re: KSE threading support (first parts)
References: <15081.50170.297579.938254@nomad.yogotech.com>
			<Pine.SUN.3.91.1010427154434.12501B-100000@pcnet1.pcnet.com>
			<20010427130826.G18676@fw.wintelcom.net> <15081.53821.755743.746621@nomad.yogotech.com> <200104301723.f3UHNIc36491@earth.backplane.com> <3AEDA66D.B51B4267@wireless-networks.com>
Content-Type: text/plain; charset=iso-8859-2
Content-Transfer-Encoding: 7bit
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Cedric Berger wrote:
> 
> Sorry if I ask a stupid question, but I'm trying to make sense of this thread.
> Here is my question:
>  - I've a server that runs a single big java application server (one process,
>    tons of threads like every Java app)
>  - With the new KSE and friends architecture, will I be able to scale
>    my app by adding CPUs?

Using the linuxthreads port, you can do that today.
Using Pthreads today, you cannot.
Using the KSE scheme, you can.


-- 
      __--_|\  Julian Elischer
     /       \ julian@elischer.org
    (   OZ    ) World tour 2000-2001
---> X_.---._/  
            v



From owner-freebsd-arch  Tue May  1  9:14:34 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from ns.yogotech.com (ns.yogotech.com [206.127.123.66])
	by hub.freebsd.org (Postfix) with ESMTP id 3520637B424
	for <Arch@FreeBSD.ORG>; Tue,  1 May 2001 09:14:29 -0700 (PDT)
	(envelope-from nate@yogotech.com)
Received: from nomad.yogotech.com (yogotech.nokia.com [4.22.66.156])
	by ns.yogotech.com (8.9.3/8.9.3) with ESMTP id KAA21956;
	Tue, 1 May 2001 10:14:16 -0600 (MDT)
	(envelope-from nate@nomad.yogotech.com)
Received: (from nate@localhost)
	by nomad.yogotech.com (8.8.8/8.8.8) id KAA04466;
	Tue, 1 May 2001 10:14:11 -0600 (MDT)
	(envelope-from nate)
From: Nate Williams <nate@yogotech.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <15086.57554.673831.601763@nomad.yogotech.com>
Date: Tue, 1 May 2001 10:14:10 -0600 (MDT)
To: Julian Elischer <julian@elischer.org>
Cc: Cedric Berger <cedric@wireless-networks.com>,
	Matt Dillon <dillon@earth.backplane.com>, Arch@FreeBSD.ORG
Subject: Re: KSE threading support (first parts)
In-Reply-To: <3AEEB887.353B3CD@elischer.org>
References: <15081.50170.297579.938254@nomad.yogotech.com>
	<Pine.SUN.3.91.1010427154434.12501B-100000@pcnet1.pcnet.com>
	<20010427130826.G18676@fw.wintelcom.net>
	<15081.53821.755743.746621@nomad.yogotech.com>
	<200104301723.f3UHNIc36491@earth.backplane.com>
	<3AEDA66D.B51B4267@wireless-networks.com>
	<3AEEB887.353B3CD@elischer.org>
X-Mailer: VM 6.75 under 21.1 (patch 12) "Channel Islands" XEmacs Lucid
Reply-To: nate@yogotech.com (Nate Williams)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> > Sorry if I ask a stupid question, but I'm trying to make sense of this thread.
> > Here is my question:
> >  - I've a server that runs a single big java application server (one process,
> >    tons of threads like every Java app)
> >  - With the new KSE and friends architecture, will I be able to scale
> >    my app by adding CPUs?
> 
> Using the linuxthreads port you can do that today
> using Pthreads today you can not.
> using the KSE scheme you can

Only if the JVM is compiled using the above technologies.  Linuxthreads
won't be used because the license is incompatible with the JDK license.


Nate



From owner-freebsd-arch  Tue May  1  9:37:50 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from pcnet1.pcnet.com (pcnet1.pcnet.com [204.213.232.3])
	by hub.freebsd.org (Postfix) with ESMTP id DA21837B43C
	for <Arch@FreeBSD.ORG>; Tue,  1 May 2001 09:37:46 -0700 (PDT)
	(envelope-from eischen@vigrid.com)
Received: (from eischen@localhost)
	by pcnet1.pcnet.com (8.8.7/PCNet) id MAA07027;
	Tue, 1 May 2001 12:37:02 -0400 (EDT)
Date: Tue, 1 May 2001 12:37:01 -0400 (EDT)
From: Daniel Eischen <eischen@vigrid.com>
To: Nate Williams <nate@yogotech.com>
Cc: Julian Elischer <julian@elischer.org>,
	Cedric Berger <cedric@wireless-networks.com>, Arch@FreeBSD.ORG
Subject: Re: KSE threading support (first parts)
In-Reply-To: <15086.57554.673831.601763@nomad.yogotech.com>
Message-ID: <Pine.SUN.3.91.1010501122914.5556A-100000@pcnet1.pcnet.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

On Tue, 1 May 2001, Nate Williams wrote:
> > > Sorry if I ask a stupid question, but I'm trying to make sense of this thread.
> > > Here is my question:
> > >  - I've a server that runs a single big java application server (one process,
> > >    tons of threads like every Java app)
> > >  - With the new KSE and friends architecture, will I be able to scale
> > >    my app by adding CPUs?
> > 
> > Using the linuxthreads port you can do that today
> > using Pthreads today you can not.
> > using the KSE scheme you can
> 
> Only if the JVM is compiled using the above technologies.  Linuxthreads
> won't be used because the license is incompatible with the JDK license.

And when we do get KSEs, what's the optimal way to map Java threads
to KSEGs/KSEs?  Should each Java thread be a PTHREAD_SCOPE_SYSTEM
thread where each thread gets its own KSEG/KSE pair, or would it
be better to run all threads as PTHREAD_SCOPE_PROCESS in one KSE/KSEG
pair?
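
For reference, the two choices in the question correspond to the standard
POSIX contention-scope thread attribute.  The sketch below only shows how a
caller such as a JVM would request one scope or the other and find out which
one it was granted; how a KSE-based library would map each scope onto
KSEGs/KSEs is exactly what is still open here, so nothing in the code assumes
an answer.  The function names attr_init_with_scope() and
attr_init_preferring() are hypothetical.

```c
#include <pthread.h>

/*
 * Initialise attr and ask for the given contention scope.
 * Returns 0 on success, or the error number from the failing call;
 * an implementation is allowed to reject a scope it does not support.
 */
int
attr_init_with_scope(pthread_attr_t *attr, int scope)
{
	int error = pthread_attr_init(attr);

	if (error != 0)
		return (error);
	error = pthread_attr_setscope(attr, scope);
	if (error != 0)
		(void)pthread_attr_destroy(attr);
	return (error);
}

/*
 * Ask for the preferred scope and fall back to the other one if the
 * implementation refuses.  On success, *granted holds the scope that
 * the attribute object actually carries.
 */
int
attr_init_preferring(pthread_attr_t *attr, int preferred, int *granted)
{
	int fallback = (preferred == PTHREAD_SCOPE_PROCESS) ?
	    PTHREAD_SCOPE_SYSTEM : PTHREAD_SCOPE_PROCESS;
	int error = attr_init_with_scope(attr, preferred);

	if (error != 0)
		error = attr_init_with_scope(attr, fallback);
	if (error != 0)
		return (error);
	return (pthread_attr_getscope(attr, granted));
}
```

The resulting attribute object is what would be handed to pthread_create()
for each Java thread, so switching a JVM between the two mappings for a
benchmark comes down to changing the preferred argument.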

-- 
Dan Eischen



From owner-freebsd-arch  Tue May  1  9:39:44 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from ns.yogotech.com (ns.yogotech.com [206.127.123.66])
	by hub.freebsd.org (Postfix) with ESMTP id 7485E37B43C
	for <Arch@FreeBSD.ORG>; Tue,  1 May 2001 09:39:41 -0700 (PDT)
	(envelope-from nate@yogotech.com)
Received: from nomad.yogotech.com (yogotech.nokia.com [4.22.66.156])
	by ns.yogotech.com (8.9.3/8.9.3) with ESMTP id KAA22360;
	Tue, 1 May 2001 10:39:16 -0600 (MDT)
	(envelope-from nate@nomad.yogotech.com)
Received: (from nate@localhost)
	by nomad.yogotech.com (8.8.8/8.8.8) id KAA04588;
	Tue, 1 May 2001 10:39:11 -0600 (MDT)
	(envelope-from nate)
From: Nate Williams <nate@yogotech.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <15086.59054.668961.132528@nomad.yogotech.com>
Date: Tue, 1 May 2001 10:39:10 -0600 (MDT)
To: Daniel Eischen <eischen@vigrid.com>
Cc: Nate Williams <nate@yogotech.com>,
	Julian Elischer <julian@elischer.org>,
	Cedric Berger <cedric@wireless-networks.com>, Arch@FreeBSD.ORG
Subject: Re: KSE threading support (first parts)
In-Reply-To: <Pine.SUN.3.91.1010501122914.5556A-100000@pcnet1.pcnet.com>
References: <15086.57554.673831.601763@nomad.yogotech.com>
	<Pine.SUN.3.91.1010501122914.5556A-100000@pcnet1.pcnet.com>
X-Mailer: VM 6.75 under 21.1 (patch 12) "Channel Islands" XEmacs Lucid
Reply-To: nate@yogotech.com (Nate Williams)
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

> > > > Sorry if I ask a stupid question, but I'm trying to make sense of this thread.
> > > > Here is my question:
> > > >  - I've a server that runs a single big java application server (one process,
> > > >    tons of threads like every Java app)
> > > >  - With the new KSE and friends architecture, will I be able to scale
> > > >    my app by adding CPUs?
> > > 
> > > Using the linuxthreads port you can do that today
> > > using Pthreads today you can not.
> > > using the KSE scheme you can
> > 
> > Only if the JVM is compiled using the above technologies.  Linuxthreads
> > won't be used because the license is incompatible with the JDK license.
> 
> And when we do get KSEs, what's the optimal way to map Java threads
> to KSEGs/KSEs?  Should each Java thread be a PTHREAD_SCOPE_SYSTEM
> thread where each thread gets its own KSEG/KSE pair, or would it
> be better to run all threads as PTHREAD_SCOPE_PROCESS in one KSE/KSEG
> pair?

I believe the latter, but tests would show us which works better for a
mix.


Nate



From owner-freebsd-arch  Wed May  2 19:57: 4 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from bsdhome.dyndns.org (unknown [24.25.2.193])
	by hub.freebsd.org (Postfix) with ESMTP id 4A90D37B422
	for <freebsd-arch@freebsd.org>; Wed,  2 May 2001 19:56:59 -0700 (PDT)
	(envelope-from bsd@bsdhome.com)
Received: from vger.bsdhome.com (vger [192.168.220.2])
	by bsdhome.dyndns.org (8.11.3/8.11.3) with ESMTP id f432uvJ84822
	for <freebsd-arch@freebsd.org>; Wed, 2 May 2001 22:56:58 -0400 (EDT)
	(envelope-from bsd@bsdhome.com)
Received: (from bsd@localhost)
	by vger.bsdhome.com (8.11.3/8.11.1) id f432uu012436;
	Wed, 2 May 2001 22:56:56 -0400 (EDT)
	(envelope-from bsd)
Date: Wed, 2 May 2001 22:56:56 -0400
From: Brian Dean <bsd@bsdhome.com>
To: freebsd-arch@freebsd.org
Subject: rc.diskless* patches
Message-ID: <20010502225656.A1173@vger.bsdhome.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

Hi,

I've put together some patches to the diskless startup code that I'd
like to commit.  I've made both -stable and -current versions of the
patches.  I've tested the -stable patches, but I have not tested the
-current patches; hopefully someone can do that and get back to me.
My -current environment is not working at the moment.

The patches do three things:

	1) Reduce diffs between the -current and -stable versions of
	   these files to a bare minimum.  Only the definition of the
	   shell function 'mount_md' is different.

	2) Provide the ability to make /tmp a memory filesystem
	   independent of /var.  This removes the requirement that
	   /tmp be a symlink to /var/tmp and this makes the diskless
	   code work with the default filesystem layout.

	3) Simplify the population of the /etc memory filesystem.  To
	   avoid the null mount, we currently create a temporary mfs
	   on /tmp, copy /etc to /tmp, then mount /etc as mfs and copy
	   everything back from /tmp, then delete the /tmp mfs.

	   The patch eliminates the /tmp mfs and the subsequent
	   copying and simply populates the /etc mfs by copying from
	   /conf/default/etc.  This requires that /conf/default/etc
	   contain a complete copy of all the /etc stuff instead of
	   just overrides.  I don't think that is too much of an extra
	   step in setting up a diskless environment.

My patches are at:

	http://people.freebsd.org/~bsd/diskless

Any comments are appreciated.

I wasn't sure of the best place to post for comments.  Some of this
was discussed on -stable.  I've seen the diskless code discussed on
-small also.  Instead of posting to -current, -stable, and -small, I'm
just posting to -arch, which seems appropriate.  If anyone feels that
it should get greater coverage, please feel free to forward
appropriately.

Thanks,
-Brian
-- 
Brian Dean
bsd@FreeBSD.org
bsd@bsdhome.com

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Thu May  3 16:56: 5 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from technokratis.com (modemcable049.41-203-24.mtl.mc.videotron.ca [24.203.41.49])
	by hub.freebsd.org (Postfix) with ESMTP id AE50337B424
	for <freebsd-arch@freebsd.org>; Thu,  3 May 2001 16:55:31 -0700 (PDT)
	(envelope-from bmilekic@technokratis.com)
Received: (from bmilekic@localhost)
	by technokratis.com (8.11.3/8.11.3) id f43Nx4Z53585
	for freebsd-arch@freebsd.org; Thu, 3 May 2001 19:59:04 -0400 (EDT)
	(envelope-from bmilekic)
Date: Thu, 3 May 2001 19:59:04 -0400
From: Bosko Milekic <bmilekic@technokratis.com>
To: freebsd-arch@freebsd.org
Subject: Mbuf slab [new allocator]
Message-ID: <20010503195904.A53281@technokratis.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


Hello,

  Anyone interested in the mbuf subsystem code should probably read this.
Others may still read it, but it is somewhat longer than your average Email,
so consider this a warning. :-)
  Also, although I tried my best to cover most issues here, feel free to let
me know if I should clarify some points.

  Not so long ago, as I'm sure some of you remember, Alfred committed a patch
to the mbuf code that collapsed what then were three separate mutex locks -
one for the mbuf free list, one for the cluster free list, and one for the
m_ext counter free list - into one single mbuf lock, locking all three
free lists. After an argument with Alfred, I caved (as I sometimes do) and
agreed with him that overall, this would be a good move,
seeing as how we'd probably move to a slab allocator for the mbuf subsystem and
that when we do, it will be more advantageous to have one lock for all three
object types in order to avoid excessive cache invalidation ping-ponging from
one CPU to the other.
  Earlier than that, even, Alfred had written what he calls the "memcache
allocator." I think it presently needs work to run (I'm not sure if it
compiles in its present state). He put the code here:
	http://people.freebsd.org/~alfred/memcache/

  After reading the "memcache" code, I was generally happy with it. Alfred
has provided documentation backing up the overall performance of slab allocators
[that is, not necessarily the performance of `memcache' itself, but rather
the concepts that memcache was built on], and after thinking about the whole
thing for a while I agreed with him that a slab allocator would be a worthy
thing to have for mbuf and mcluster allocations. However, there were some things
that clearly needed to be done differently from memcache, and I wanted to
keep the flexibility of being able to micro-optimize for the mbuf-related
allocations in the future. Also, mbuf-related allocations work differently
than `regular' allocations in that the status quo is such that we do not
free VM pages back to the map that we allocate them from. So once we allocate
a page, we take the address space from the map, wire it down, and never
return the address space. This saves us from having to free back to VM when
freeing mbufs - a relatively expensive task - and allows us to keep mbufs
and mclusters `cached' on free lists, from which we can quickly allocate.
  Sure, memcache provides the option of ensuring that we'll also keep a certain
(tunable) number of objects cached on free lists, but when memcache cannot
find a page and needs to allocate one, it relies entirely on the VM
code to do the blocking for it, until a page is freed to the map, or
until it can find a page (depending on the nature of the exhaustion).
  For the mbuf allocator, we needed to deal with a third type of exhaustion.
With the mbuf allocations, you have a finite amount of address space which,
as the number of allocations grows, shrinks. So, when you see that you have
no more mbufs or mclusters to allocate, you have to deal with the problem
in one of two ways:

	1 - either you have address space left in your map, in which case you're
	    entitled to a page and in which case you may very well block in the
	    VM code until you get it.

	or

	2 - you have already allocated all the pages that you are entitled to
	    (i.e. no more address space) and so your only option is to
	    drain the protocols and block waiting for a consumed object (mbuf
	    or cluster) to become available. This blocking is done in the mbuf
	    code and is implemented through a condition variable.
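
A minimal sketch of the two cases above, in plain C. The names
(mb_map_state, mb_exhausted_action, and the enum values) are hypothetical
and are not taken from kern_mbuf.c. The sketch only models the decision,
assuming the map never gets pages back; the actual blocking in the VM code
or on the condition variable is not shown.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not the code in kern_mbuf.c. */
enum mb_action {
	MB_ALLOC_PAGE,		/* case 1: address space left, ask VM for a page */
	MB_DRAIN_AND_WAIT	/* case 2: map exhausted, drain protocols and wait */
};

struct mb_map_state {
	size_t pages_total;	/* pages this map is entitled to */
	size_t pages_used;	/* pages already taken; never given back */
};

/*
 * Decide what to do when the free lists are empty.  While address space
 * remains, the caller may block in the VM code waiting for a page.  Once
 * the map is used up, blocking in VM would never end (nothing is freed
 * back to the map), so the caller must wait for an object to be freed.
 */
static enum mb_action
mb_exhausted_action(const struct mb_map_state *ms)
{
	assert(ms->pages_used <= ms->pages_total);
	if (ms->pages_used < ms->pages_total)
		return (MB_ALLOC_PAGE);
	return (MB_DRAIN_AND_WAIT);
}
```

A caller would check the free lists first and only consult
mb_exhausted_action() when they are empty.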

  So clearly, these two conditions needed to be maintained. Furthermore, I
wanted to leave open the possibility of implementing
some sort of `lazy freeing' of pages that would be done periodically in the
spirit of your classic garbage collector. In short, I wanted memcache with
several twists and turns, as is usually required for the mbuf code.

  So, thanks to Alfred's patience in answering some questions I had, I decided
to write an mbuf slab-like allocator. I put a relatively early version of the
code here:

	http://people.freebsd.org/~bmilekic/code/mb_slab/

  If you're still with me, a few mentions regarding the code:

	- I split out the allocator code into kern_mbuf.c, and left the mbuf
	  utility routines in uipc_mbuf.c

	- mbuf.h macros are a lot cleaner/smaller, I plan to make them even
	  smaller eventually.

	- Matt Dillon had some really neat suggestions on implementing real
	  fast allocations from the PCPU lists that the allocator maintains.
	  Some complications arose due to the fact that we have pre-emption
	  enabled. Work-arounds were also proposed, but I left the optimization
	  out of this code for now. We can deal with it later and I am
	  interested in perhaps inlining, if size permits, a portion of the
	  `quick allocation' in the macros, once more, but for now I'd rather
	  it remains a function call in all cases.

	- The general allocation and free cases are very fast. Very comparable
	  to what we presently have. The downside is slightly more code due
	  to tougher list manipulation during allocation but then the upside
	  (along with other things that I don't feel like mentioning right now)
	  is that contention is reduced in MP environments. So it comes out
	  about even [*], if not even better now due to other improvements.
	  Unfortunately, I can't really benchmark any of this so please don't
	  ask (although feel free to try yourself and/or to also refer to the
	  documentation available on the Internet discussing advantages of
	  slab allocators in overall performance).

[*] Yeah, this is a very rough inductive prediction, although it is supported
    by many factors. 

	- There is more documentation in kern_mbuf.c itself.

	- mbstats is broken for now. Re-implementing statistics is relatively
	  easy and I left it out of the first version.

	- I have been running this code for over a week now on a dual-proc.
	  CURRENT machine. It has been running fine although I have not tried
	  to forcibly exhaust the maps to see how it behaves as I have not
	  adjusted kmem_malloc() to work exactly as it needs to in order
	  for proper blocking to occur. In case you're wondering exactly
	  what it is that kmem_malloc() needs to be able to do: it just needs
	  to be able to mark a map_starved flag once the map it's allocating
	  from has been exhausted, if that map is mbuf-related (just like it
	  presently does for mb_map). The point is: kmem_malloc() shouldn't
	  block itself if the map is exhausted as it will never wake up because
	  we never free anything back to the map.

	- Free lists in `buckets' (you'll get it if you glance at the code)
	  are implemented not by linking mbufs or clusters in a singly-linked
	  fashion as is done in the current mbuf code. Rather, a pointer-array
	  is maintained (like memcache does). This avoids having the thread
	  freeing the object write to the object (and consequently likely
	  invalidate the producer's cache). Unfortunately, this turns out to
	  suck very much for m_ext counters, as they are 4 bytes each. Having
	  a pointer array to a page of 4-byte objects consumes another page
	  on the x86, for instance. I am thinking of alternatives for
	  m_ext counters because of this problem. In fact, I am toying with
	  the idea of replacing allocatable m_ext counters with something
	  else - perhaps storing the counter in some area of the cluster
	  or whatever other provided m_ext buffer (sf_buf, etc.)

	- You'll note that mb_map is subdivided into three other submaps:
	  one for mbufs, one for clusters, one for counters. This is done
	  for several reasons. One is to avoid having cluster allocations
	  consume the portion of the map explicitly reserved for mbufs,
	  thus reducing the total number of allocatable mbufs (because
	  we will never free that space back to the map, once we have consumed
	  it for clusters, it will stay for clusters). Peter Wemm had
	  suggested eliminating mb_map and submaps and having mbuf allocations
	  made directly from mb_map's parent map, kmem_map. Allocations
	  would be limited by a sysctl-tunable variable. Alfred
	  suggested having lazy freeing done back to kmem_map on behalf
	  of the mbuf subsystem, when needed, in order to avoid having mbufs
	  and clusters completely consume kmem_map. I was toying with
	  implementing this as well but ran into some problems. Notably,
	  in order to be able to take the address of an object being freed
	  and retrieve its corresponding `bucket structure' (again, if you
	  glance at the code, this will make sense), I need to be able
	  to produce a unique index from the address of the page in which
	  the object is found. This is easy to do in the present implementation
	  of mb_slab as each type of object (mbuf or cluster) has its own
	  map, so I can safely do: (page_address) - (base_of_map) and divide
	  by PAGE_SIZE, then use that to index into a pointer-array holding
	  the address of my bucket (memcache does a similar thing, whereas
	  our implementation of kernel malloc() uses a slightly different
	  hashing method). The obvious solution would be to keep a sparse
	  pointer table which would account for every page in kmem_map. But
	  the problem with this is that the resulting table would be a little
	  too large for what it's worth.
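
A rough sketch of the two bucket-related points in the list above: free
lists kept as pointer arrays, and finding a bucket from an object's address
via (page_address) - (base_of_map) divided by PAGE_SIZE. All sk_* names
are hypothetical and not from the mb_slab code; the 4096-byte page, the
8-page map, and the 16 objects per page are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SK_PAGE_SIZE	4096u	/* assumed page size */
#define SK_NPAGES	8u	/* pages in the hypothetical submap */
#define SK_MAXOBJS	16u	/* assumed objects per page */

struct sk_bucket {
	void	*free[SK_MAXOBJS];	/* pointer array of free objects */
	unsigned nfree;			/* valid entries in free[] */
};

struct sk_map {
	uintptr_t	  base;			/* base address of the submap */
	struct sk_bucket *bucket[SK_NPAGES];	/* one bucket pointer per page */
};

/*
 * Integer division drops the offset within the page, so this equals
 * (page_address - base_of_map) / PAGE_SIZE for a page-aligned base.
 */
static size_t
sk_page_index(const struct sk_map *m, const void *obj)
{
	return (((uintptr_t)obj - m->base) / SK_PAGE_SIZE);
}

/* Free by recording the pointer; the object itself is never written. */
static void
sk_free(struct sk_map *m, void *obj)
{
	size_t idx = sk_page_index(m, obj);
	struct sk_bucket *b;

	assert(idx < SK_NPAGES);
	b = m->bucket[idx];
	assert(b->nfree < SK_MAXOBJS);
	b->free[b->nfree++] = obj;
}

/* Allocate the most recently freed object, or NULL if the bucket is empty. */
static void *
sk_alloc(struct sk_bucket *b)
{
	if (b->nfree == 0)
		return (NULL);
	return (b->free[--b->nfree]);
}

/* Bytes of pointer array needed to track one page of obj_size objects. */
static size_t
sk_ptr_array_bytes(size_t page_size, size_t obj_size, size_t ptr_size)
{
	return ((page_size / obj_size) * ptr_size);
}
```

With 4-byte counters and 4-byte pointers, sk_ptr_array_bytes(4096, 4, 4)
comes to 4096, which is the extra page per page of m_ext counters
mentioned above.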

  Well, that's about all that I can think of for the moment. I would appreciate
feedback from you folks both on the strictly technical aspects of this Email
as well as on other suggestions you may have regarding the issues presented
above.
  Unfortunately, my final exam period begins next week and I likely will be
unable to do any real work on this until about 2 weeks from now. After exams,
I would like to finish cleaning up this version of the allocator (without
Peter's suggestion and `lazy freeing' implemented yet) and hopefully commit
that first (unless serious objections arise, as usual). Once that's done,
I would then continue to work on adding the latter two, if we can work out
the problems, and whatever else comes up/requires improvement.

Regards,
-- 
 Bosko Milekic
 bmilekic@technokratis.com


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Sat May  5  0:49:52 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from mirage.secna.ru (mirage.secna.ru [212.192.0.20])
	by hub.freebsd.org (Postfix) with ESMTP id D075637B424
	for <freebsd-arch@FreeBSD.org>; Sat,  5 May 2001 00:49:40 -0700 (PDT)
	(envelope-from swp@bspu.secna.ru)
Received: from bspu.secna.ru (root@bspu-agtu.bspu.secna.ru [212.192.2.1])
	by mirage.secna.ru (8.9.1/8.9.1-secna) with ESMTP id OAA87550
	for <freebsd-arch@FreeBSD.org>; Sat, 5 May 2001 14:50:21 +0700 (NOVST)
Received: (from root@localhost) by bspu.secna.ru (8.11.3/8.11.3)
    id f457nLR95199 for freebsd-arch@FreeBSD.org; Sat, 5 May 2001 14:49:21 +0700 (NSS)
Date: Sat, 5 May 2001 14:49:21 +0700
From: "mitrohin a.s." <swp@bspu.secna.ru>
To: freebsd-arch@FreeBSD.org
Subject: hmm...
Message-ID: <20010505144921.B93403@bspu.secna.ru>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.4i
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG

list

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message


From owner-freebsd-arch  Sat May  5 11:34:19 2001
Delivered-To: freebsd-arch@freebsd.org
Received: from earth.backplane.com (earth-nat-cw.backplane.com [208.161.114.67])
	by hub.freebsd.org (Postfix) with ESMTP id D501B37B424
	for <freebsd-arch@FreeBSD.ORG>; Sat,  5 May 2001 11:34:16 -0700 (PDT)
	(envelope-from dillon@earth.backplane.com)
Received: (from dillon@localhost)
	by earth.backplane.com (8.11.2/8.11.2) id f45IXiW49096;
	Sat, 5 May 2001 11:33:44 -0700 (PDT)
	(envelope-from dillon)
Date: Sat, 5 May 2001 11:33:44 -0700 (PDT)
From: Matt Dillon <dillon@earth.backplane.com>
Message-Id: <200105051833.f45IXiW49096@earth.backplane.com>
To: Bosko Milekic <bmilekic@technokratis.com>
Cc: freebsd-arch@FreeBSD.ORG
Subject: Re: Mbuf slab [new allocator]
References:  <20010503195904.A53281@technokratis.com>
Sender: owner-freebsd-arch@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.ORG


:Hello,
:
:  Anyone interested in the mbuf subsystem code should probably read this.
:Others may still read it, but it is somewhat longer than your average Email,
:so consider this a warning. :-)
:  Also, although I tried my best to cover most issues here, feel free to let
:me know if I should clarify some points.
:
:  Not so long ago, as I'm sure some of you remember, Alfred committed a patch
:...

    Sounds good.  You know the motto - first make it work, then make it fast.

						-Matt


To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message