From owner-freebsd-smp Mon Jun 17 14:14:55 1996 Return-Path: owner-smp Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id OAA29755 for smp-outgoing; Mon, 17 Jun 1996 14:14:55 -0700 (PDT) Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211]) by freefall.freebsd.org (8.7.5/8.7.3) with SMTP id OAA29750 for ; Mon, 17 Jun 1996 14:14:52 -0700 (PDT) Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9) id OAA08588; Mon, 17 Jun 1996 14:12:28 -0700 From: Terry Lambert Message-Id: <199606172112.OAA08588@phaeton.artisoft.com> Subject: Re: linux 2.0 vs FreeBSD -current's SMP support To: hasty@rah.star-gate.com (Amancio Hasty) Date: Mon, 17 Jun 1996 14:12:28 -0700 (MST) Cc: terry@lambert.org, freebsd-smp@freebsd.org In-Reply-To: <199606171955.MAA12356@rah.star-gate.com> from "Amancio Hasty" at Jun 17, 96 12:55:12 pm X-Mailer: ELM [version 2.4 PL24] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk > > Say how stable is the current SMP stuff I noticed your posting about > a week ago or so (make -j)... I have noticed some stability problems with multiple processes going into the kernel. I haven't bothered tracking them down, but it seems related to page cachability -- it may be that the non-cacheable bit is not set on the mutex page, so each processor is getting a stale mutex and thinks it's OK to go in, or more likely, it's a result of the change to the scheduling algorithm from Jack Vogel's original code. You can pretty easily force a panic by going through an interruptible device (ie: an ethernet card) by: 1) open an xterm on another system acting as an X server 2) open a second xterm 3) start a build in one of the windows 4) do file I/O in the other (running elm, whatever). You will get a page-not-present panic, eventually, assuming you have little enough RAM to cause swapping (I have 16M in my system). 
I'm not horribly impressed with the scheduling algorithm, compared to what it used to be. Because of the "jittery nature" of the windows, you only have the opportunity to start both processors in user space once per quantum, and if you have a real active process, one processor can starve for the mutex. This is kind of an inevitable result of using "idle processes" instead of a tight scheduler loop to do the work.

I was able to get in a state where my console login was running on one processor and the xterm running the "build world" on the other, and the new scheduler prevented me from getting keyboard interrupts because the active out-of-kernel processor was the AP, so the BP starved for opportunity to service interrupts. It looks like this is actually the source of the de0 errors I reported on the -current list a bit ago.

In any case, Peter has described a bunch of patches that he has yet to commit to the SMP tree, some of which seem like they will improve things (though not nearly as well as removing the idle process accounting would; that would also save us from dealing with a constant 2.00 load when nothing is running on the system save the idle procs). I offered to hack on some of this, but Peter's response implied that we should wait for his patches.

I'm now looking at the panic Jeffrey Hsu was seeing after his Lite2 integration patches (which have not been merged because of the panic). Once that is done, I can have some real hope of getting my FS patches in, and from there we can start looking at fixing up kernel reentrancy and testing runnable for processes on kernel exit (initially) and go on to multithreading (eventually), at system call trap time.

From what I can tell, the code is at least as stable as the Linux code.

Terry Lambert
terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

From owner-freebsd-smp Wed Jun 19 08:22:53 1996
Return-Path: owner-smp
Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id IAA19199 for smp-outgoing; Wed, 19 Jun 1996 08:22:53 -0700 (PDT)
Received: from maccs.dcss.mcmaster.ca (maccs.dcss.McMaster.CA [130.113.68.1]) by freefall.freebsd.org (8.7.5/8.7.3) with SMTP id IAA19194 for ; Wed, 19 Jun 1996 08:22:49 -0700 (PDT)
Received: from church.dcss.mcmaster.ca by maccs.dcss.mcmaster.ca with smtp (Smail3.1.28.1 #5) id m0uWP5E-0005tVC; Wed, 19 Jun 96 11:22 EDT
Received: by church.dcss.mcmaster.ca (SMI-8.6/SMI-SVR4) id LAA17989; Wed, 19 Jun 1996 11:22:13 -0400
Date: Wed, 19 Jun 1996 11:22:13 -0400
From: dsantry@maccs.dcss.McMaster.CA (Douglas Santry)
Message-Id: <199606191522.LAA17989@church.dcss.mcmaster.ca>
To: freebsd-smp@freebsd.org
Subject: threads
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

I have built a threads prototype for FreeBSD. You can malloc mem after creating a thread and the address space is kept consistent etc. I also built a user level threads library which provides the interface to the kernel thread routines, and also implements user mutexes in user level code which I believe are MP safe. I thought since you guys are building an SMP version of FreeBSD, you probably want to build applications that scale to the # of CPUs in the system. Sooooo, if you are interested in trying these threads out, you can get'em at

ftp.inna.net under pub/FreeBSD/Threads

Lemme know if they are interesting to you!

DJS

From owner-freebsd-smp Wed Jun 19 10:57:26 1996
Return-Path: owner-smp
Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id KAA06055 for smp-outgoing; Wed, 19 Jun 1996 10:57:26 -0700 (PDT)
Received: from ref.tfs.com (ref.tfs.com [140.145.254.251]) by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id KAA06050 for ; Wed, 19 Jun 1996 10:57:24 -0700 (PDT)
Received: (from julian@localhost) by ref.tfs.com (8.7.5/8.7.3) id KAA12605; Wed, 19 Jun 1996 10:57:08 -0700 (PDT)
Message-Id: <199606191757.KAA12605@ref.tfs.com>
Subject: Re: threads
To: dsantry@maccs.dcss.McMaster.CA (Douglas Santry)
Date: Wed, 19 Jun 1996 10:57:07 -0700 (PDT)
From: "JULIAN Elischer"
Cc: freebsd-smp@freebsd.org
In-Reply-To: <199606191522.LAA17989@church.dcss.mcmaster.ca> from "Douglas Santry" at Jun 19, 96 11:22:13 am
X-Mailer: ELM [version 2.4 PL25 ME8b]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

>
Have you looked at the pthreads package that is built into -current as standard? I'm sure we'll be looking very closely at your prototype, as we are moving in this direction. (Also, the MIT package works for FreeBSD quite well.)

>
>
> I have built a threads prototype for FreeBSD. You can malloc mem after
> creating a thread and the address space is kept consistent etc. I also
> built a user level threads library which provides the interface to the
> kernel thread routines, and also implements user mutexes in user level
> code which I believe are MP safe. I thought since you guys are building
> a SMP version of FreeBSD, you probably want to build applications that
> scale to the # of CPUs in the system. Sooooo, if you are interested in
> trying these threads out, you can get'em at
>
> ftp.inna.net under pub/FreeBSD/Threads
>
> Lemme know if they are interesting to you!

sure!

> > DJS > > From owner-freebsd-smp Thu Jun 20 18:00:56 1996 Return-Path: owner-smp Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id SAA01097 for smp-outgoing; Thu, 20 Jun 1996 18:00:56 -0700 (PDT) Received: from bunyip.cc.uq.oz.au (pp@bunyip.cc.uq.oz.au [130.102.2.1]) by freefall.freebsd.org (8.7.5/8.7.3) with SMTP id SAA01091 for ; Thu, 20 Jun 1996 18:00:53 -0700 (PDT) Received: from bunyip.cc.uq.oz.au by bunyip.cc.uq.oz.au id <15443-0@bunyip.cc.uq.oz.au>; Fri, 21 Jun 1996 11:00:01 +1000 Received: from netfl15a.devetir.qld.gov.au by pandora.devetir.qld.gov.au (8.6.10/DEVETIR-E0.3a) with ESMTP id KAA20898 for ; Fri, 21 Jun 1996 10:23:13 +1000 Received: from localhost by netfl15a.devetir.qld.gov.au (8.6.8.1/DEVETIR-0.1) id AAA09898 for ; Fri, 21 Jun 1996 00:23:08 GMT Message-Id: <199606210023.AAA09898@netfl15a.devetir.qld.gov.au> X-Mailer: exmh version 1.6.5 12/11/95 To: freebsd-smp@freebsd.org Subject: How to get hold of SMP code? X-Face: 3}heU+2?b->-GSF-G4T4>jEB9~FR(V9lo&o>kAy=Pj&;oVOc<|pr%I/VSG"ZD32J>5gGC0N 7gj]^GI@M:LlqNd]|(2OxOxy@$6@/!,";-!OlucF^=jq8s57$%qXd/ieC8DhWmIy@J1AcnvSGV\|*! >Bvu7+0h4zCY^]{AxXKsDTlgA2m]fX$W@'8ev-Qi+-;%L'CcZ'NBL!@n?}q!M&Em3*eW7,093nOeV8 M)(u+6D;%B7j\XA/9j4!Gj~&jYzflG[#)E9sI&Xe9~y~Gn%fA7>F:YKr"Wx4cZU*6{^2ocZ!YyR Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Fri, 21 Jun 1996 10:23:03 +1000 From: Stephen Hocking Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk I have a thumping great big new machine here (Dual P5-166, 128Mb mem, 7 x 2Gb disks) and am interested in trying out FreeBSD-smp on it. How do I go about getting the SMP code & integrating it into the base? Stephen -- The views expressed above are not those of the Worker's Compensation Board of Queensland, Australia. 
From owner-freebsd-smp Fri Jun 21 19:53:53 1996 Return-Path: owner-smp Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id TAA20745 for smp-outgoing; Fri, 21 Jun 1996 19:53:53 -0700 (PDT) Received: from aimnet.com (mailhub.aimnet.com [204.247.0.104]) by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id TAA20735 for ; Fri, 21 Jun 1996 19:53:49 -0700 (PDT) Received: from iway.aimnet.com (dial-berk1-9.iway.aimnet.com [204.118.73.9]) by aimnet.com (8.7.1/8.7.1) with SMTP id TAA18003; Fri, 21 Jun 1996 19:50:57 -0700 (PDT) Date: Fri, 21 Jun 1996 19:50:57 -0700 (PDT) Message-Id: <199606220250.TAA18003@aimnet.com> X-Sender: jed@mailhub.aimnet.com X-Mailer: Windows Eudora Light Version 1.5.2 Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" To: davidg@Root.COM From: "James E. [Jed] Donnelley" Subject: Re: SMP version? Cc: smp@freebsd.org, jed@llnl.gov Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk At 05:38 PM 6/21/96 -0700, you wrote: >>I am looking for a symmetric multiprocessing version >>of Unix that I can get sources to. Do you know if there >>is such a version of FreeBSD (or Linux? ;-) available? >> >>Sorry for the "out of the blue" message. My application >>is a Scalable Coherent Interface (SCI) shared memory >>multiprocessor research project based on some unique >>optical networking technology at Lawrence Livermore >>National Laboratory. > > We (the FreeBSD project) are just starting our work on SMP support. We >have a working single-kernel-lock implementation and if the sources aren't >available for it now, they will be in a week or two. It's my understanding >that similar progress has been made in Linux, but I'm not associated with >that effort so I don't know its status. > If you're interested, I can put you in touch with the people working on >it...in fact, you can send email to smp@freebsd.org to contact the appropriate >people. > Good luck on your project. 
Do you (either of you) happen to know if there is a facility in this system (FreeBSD for an SMP) for a single shared memory "multiprocess." That is for multiprocessing on a single shared memory image (with separate register sets)? Is there a defined Posix interface to such memory sharing (beyond the mechanism that I have seen in System V)? Any documentation that you could point us at on this topic would be appreciated. I am trying to estimate the cost of using FreeBSD to support such shared memory multiprocessing on an Intel/SCI based shared memory multiprocessor. We need to have the ability to run a single "job" using shared memory on multiple processes to make the effort worthwhile. Assuming this sort of work would make sense, is there a community that we could collaborate with and potentially contribute code to? Thanks for any reply. --Jed http://www.webstart.com/cc/jed-signature.html From owner-freebsd-smp Fri Jun 21 19:58:12 1996 Return-Path: owner-smp Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id TAA20918 for smp-outgoing; Fri, 21 Jun 1996 19:58:12 -0700 (PDT) Received: from critter.tfs.com ([140.145.16.108]) by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id TAA20912; Fri, 21 Jun 1996 19:58:08 -0700 (PDT) Received: from critter.tfs.com (localhost [127.0.0.1]) by critter.tfs.com (8.7.5/8.7.3) with ESMTP id TAA04172; Fri, 21 Jun 1996 19:57:13 -0700 (PDT) To: "James E. [Jed] Donnelley" cc: davidg@Root.COM, smp@freebsd.org, jed@llnl.gov Subject: Re: SMP version? In-reply-to: Your message of "Fri, 21 Jun 1996 19:50:57 PDT." <199606220250.TAA18003@aimnet.com> Date: Fri, 21 Jun 1996 19:57:11 -0700 Message-ID: <4170.835412231@critter.tfs.com> From: Poul-Henning Kamp Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk >> We (the FreeBSD project) are just starting our work on SMP support. We >>have a working single-kernel-lock implementation and if the sources aren't >>available for it now, they will be in a week or two. 
The sources are available. Via sup or ctm even :-) check the mail-archive for smp on freebsd.org -- Poul-Henning Kamp | phk@FreeBSD.ORG FreeBSD Core-team. http://www.freebsd.org/~phk | phk@login.dknet.dk Private mailbox. whois: [PHK] | phk@ref.tfs.com TRW Financial Systems, Inc. Future will arrive by its own means, progress not so. From owner-freebsd-smp Fri Jun 21 20:35:16 1996 Return-Path: owner-smp Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id UAA23225 for smp-outgoing; Fri, 21 Jun 1996 20:35:16 -0700 (PDT) Received: from abash1.microsoft.com (abash1.microsoft.com [131.107.3.23]) by freefall.freebsd.org (8.7.5/8.7.3) with SMTP id UAA23191 for ; Fri, 21 Jun 1996 20:35:11 -0700 (PDT) Received: by abash1.microsoft.com with Microsoft Exchange (IMC 4.0.838.14) id <01BB5FB1.1CD1C420@abash1.microsoft.com>; Fri, 21 Jun 1996 20:34:58 -0700 Message-ID: From: Thomas Pfenning To: "'davidg@Root.COM'" , "'James E. [Jed] Donnelley'" Cc: "'smp@freebsd.org'" , "'jed@llnl.gov'" Subject: RE: SMP version? Date: Fri, 21 Jun 1996 20:34:40 -0700 X-Mailer: Microsoft Exchange Server Internet Mail Connector Version 4.0.838.14 Encoding: 61 TEXT Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk You might want to have a look at the work Ron Minnich is doing with FreeBSD and his distributed shared memory implementation. ftp://ftp.sarnoff.com/pub/mnfs/www/docs/cluster.html Cheers Thomas >---------- >From: James E. [Jed] Donnelley[SMTP:jed@webstart.com] >Sent: Friday, June 21, 1996 7:50 PM >To: davidg@Root.COM >Cc: smp@freebsd.org; jed@llnl.gov >Subject: Re: SMP version? > >At 05:38 PM 6/21/96 -0700, you wrote: >>>I am looking for a symmetric multiprocessing version >>>of Unix that I can get sources to. Do you know if there >>>is such a version of FreeBSD (or Linux? ;-) available? >>> >>>Sorry for the "out of the blue" message. 
My application >>>is a Scalable Coherent Interface (SCI) shared memory >>>multiprocessor research project based on some unique >>>optical networking technology at Lawrence Livermore >>>National Laboratory. >> >> We (the FreeBSD project) are just starting our work on SMP support. We >>have a working single-kernel-lock implementation and if the sources aren't >>available for it now, they will be in a week or two. It's my understanding >>that similar progress has been made in Linux, but I'm not associated with >>that effort so I don't know its status. >> If you're interested, I can put you in touch with the people working on >>it...in fact, you can send email to smp@freebsd.org to contact the >>appropriate >>people. >> Good luck on your project. > >Do you (either of you) happen to know if there is a facility in >this system (FreeBSD for an SMP) for a single shared memory >"multiprocess." That is for multiprocessing on a single shared >memory image (with separate register sets)? Is there a defined >Posix interface to such memory sharing (beyond the mechanism that >I have seen in System V)? Any documentation that you could point >us at on this topic would be appreciated. > >I am trying to estimate the cost of using FreeBSD to support such >shared >memory multiprocessing on an Intel/SCI based shared memory >multiprocessor. >We need to have the ability to run a single "job" using shared >memory on multiple processes to make the effort worthwhile. > >Assuming this sort of work would make sense, is there a community >that we could collaborate with and potentially contribute code to? > >Thanks for any reply. 
> >--Jed http://www.webstart.com/cc/jed-signature.html > > From owner-freebsd-smp Sat Jun 22 00:17:25 1996 Return-Path: owner-smp Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id AAA08089 for smp-outgoing; Sat, 22 Jun 1996 00:17:25 -0700 (PDT) Received: from aimnet.com (mailhub.aimnet.com [204.247.0.104]) by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id AAA08078 for ; Sat, 22 Jun 1996 00:17:23 -0700 (PDT) Received: from iway.aimnet.com (dial-berk1-15.iway.aimnet.com [204.118.73.15]) by aimnet.com (8.7.1/8.7.1) with SMTP id AAA27881; Sat, 22 Jun 1996 00:14:28 -0700 (PDT) Date: Sat, 22 Jun 1996 00:14:28 -0700 (PDT) Message-Id: <199606220714.AAA27881@aimnet.com> X-Sender: jed@mailhub.aimnet.com X-Mailer: Windows Eudora Light Version 1.5.2 Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" To: Ron Minnich From: "James E. [Jed] Donnelley" Subject: RE: SMP version? Cc: Thomas Pfenning , davidg@Root.COM, smp@freebsd.org, jed@llnl.gov, mail@ppgsoft.com Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk At 08:34 PM 6/21/96 -0700, Thomas Pfenning wrote: >You might want to have a look at the work Ron Minnich is doing with >FreeBSD and his distributed shared memory implementation. > >ftp://ftp.sarnoff.com/pub/mnfs/www/docs/cluster.html > >Cheers > > Thomas As you can learn in a bit greater detail below, we at LLNL are considering using an open version of Unix (e.g. FreeBSD or Linux) for an SMP running on multiple Intel processors connected via Scalable Coherent Interface: http://www.cmpcmm.com/cc/standards.html#SCI I was referred to your work as above. I did look at your Web page as noted. The focus there seems to be on message passing (e.g. MPI?). Did I read that incorrectly? We are currently focusing on shared memory. As you will read below, our project will only be worthwhile if we can run a multiprocessing application with multiple processors sharing a common memory image (different register sets - e.g. 
as the Unicos model). We are not interested in pursuing "virtual" shared memory at this time (though I would be interested to hear of any work you have done in this area - particularly performance studies). We are trying to determine how much work it will be to get "there" from "here" using SCI (over our in-house developed optical network). I have previously developed such an operating system from scratch, but would naturally hope to be able to get such a system running from a FreeBSD or Linux base with much (!) less effort. Any thoughts from your experience that you would be willing to share would be greatly appreciated. >>From: James E. [Jed] Donnelley[SMTP:jed@webstart.com] >>Sent: Friday, June 21, 1996 7:50 PM >>To: davidg@Root.COM >>Cc: smp@freebsd.org; jed@llnl.gov >>Subject: Re: SMP version? >> >>At 05:38 PM 6/21/96 -0700, you wrote: >>>>I am looking for a symmetric multiprocessing version >>>>of Unix that I can get sources to. Do you know if there >>>>is such a version of FreeBSD (or Linux? ;-) available? >>>> >>>>Sorry for the "out of the blue" message. My application >>>>is a Scalable Coherent Interface (SCI) shared memory >>>>multiprocessor research project based on some unique >>>>optical networking technology at Lawrence Livermore >>>>National Laboratory. >>> >>> We (the FreeBSD project) are just starting our work on SMP support. We >>>have a working single-kernel-lock implementation and if the sources aren't >>>available for it now, they will be in a week or two. It's my understanding >>>that similar progress has been made in Linux, but I'm not associated with >>>that effort so I don't know its status. >>> If you're interested, I can put you in touch with the people working on >>>it...in fact, you can send email to smp@freebsd.org to contact the >>>appropriate >>>people. >>> Good luck on your project. >> >>Do you (either of you) happen to know if there is a facility in >>this system (FreeBSD for an SMP) for a single shared memory >>"multiprocess." 
That is for multiprocessing on a single shared >>memory image (with separate register sets)? Is there a defined >>Posix interface to such memory sharing (beyond the mechanism that >>I have seen in System V)? Any documentation that you could point >>us at on this topic would be appreciated. >> >>I am trying to estimate the cost of using FreeBSD to support such >>shared >>memory multiprocessing on an Intel/SCI based shared memory >>multiprocessor. >>We need to have the ability to run a single "job" using shared >>memory on multiple processes to make the effort worthwhile. >> >>Assuming this sort of work would make sense, is there a community >>that we could collaborate with and potentially contribute code to? Thanks for any reply. --Jed http://www.webstart.com/cc/jed-signature.html From owner-freebsd-smp Sat Jun 22 01:43:07 1996 Return-Path: owner-smp Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id BAA14410 for smp-outgoing; Sat, 22 Jun 1996 01:43:07 -0700 (PDT) Received: from aimnet.com (mailhub.aimnet.com [204.247.0.104]) by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id BAA14403 for ; Sat, 22 Jun 1996 01:43:04 -0700 (PDT) Received: from iway.aimnet.com (dial-berk1-20.iway.aimnet.com [204.118.73.20]) by aimnet.com (8.7.1/8.7.1) with SMTP id BAA00118; Sat, 22 Jun 1996 01:40:11 -0700 (PDT) Date: Sat, 22 Jun 1996 01:40:11 -0700 (PDT) Message-Id: <199606220840.BAA00118@aimnet.com> X-Sender: jed@mailhub.aimnet.com X-Mailer: Windows Eudora Light Version 1.5.2 Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii" To: Thomas Pfenning From: "James E. [Jed] Donnelley" Subject: RE: SMP version? Cc: Ron Minnich , davidg@Root.COM, smp@freebsd.org, mail@ppgsoft.com Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk At 12:33 AM 6/22/96 -0700, you wrote: >Without getting in what Ron has to say but how could you get the >impression this work is on message passing? 
He is working on shared >memory for years now which is pretty clearly documented on his WEB site. >What do you mean you want to run the UNICOS model but not virtual shared >memory. SCI implements virtual shared memory when I understood it >correctly. > >Don't worry, I was just getting a bit confused by your mail. No problem. I will try to explain how I see it. Looking again at Ron's Web site, I can see the references to "Distributed Shared memory" and "virtual shared memory". I am not sure just what these terms mean, but I have a guess. I don't see anything on the page that refers to memory sharing hardware. My guess is that what he is working on is essentially what SGI refers to in their shared memory model - namely essentially paging through the network. You set up a shared memory map between processors (this is a part I am interested in) so that if a process on one machine references some memory, a copy of that memory is moved to that machine. If the reference is a write then all other copies are invalidated. If it is "another" read then copies can reside read-only on multiple machines. I am just hacking this wording out, but the model is pretty tried and true. It can work reasonably well if there is no significant distributed "simultaneous" reading and writing of the same memory. What we are working on is more like "real" SMP shared memory. Namely with latencies in at least the 1-10 microsecond range and hardware support for "remote" reads and writes through SCI - passing individual cache lines and blocking instruction execution while waiting for "remote" memory. Perhaps this is what Ron is doing and I just didn't catch it at first glance. If so, it was a simple (perhaps inept) misunderstanding. I don't see how he can be doing remote reads and writes of cache lines over 100BaseT. It would seem that at least some hardware to "redirect" memory references is needed. I didn't see any reference to that. 
Reducing the software overhead of remote memory access is one of the key areas that we want to explore. Of course even 1-10 microseconds is pretty nonuniform as memory access goes these days, but I believe it is worth exploring.

My comment about message passing and MPI was inappropriate. It seems clear that he is not going down that path. If he is doing remote memory "paging" with software and getting latencies in the 1-10 microsecond range then I will be interested to read further to see how he does it. In that case perhaps "real" shared memory is not needed; "virtual" shared memory may be enough for most applications. I guess some applications may push memory uniformity down to the 10-100 nanosecond range, but I would guess that such applications are somewhat unusual.

If he is doing remote memory accesses and running a "standard" (i.e. dumb - like Unicos) SMP operating system on his machines then we will be happy to run with his software (should it be an available "open" Unix version such as FreeBSD and/or Linux).

We are getting quick turnaround in this dialog, but perhaps the clarity is suffering a bit - my apologies. I'll download the PostScript of some of Ron's papers and try to clarify my understanding of what he has been doing. It will probably be next week before I can get to it. I am working only a very small portion of my time on this effort (for some other folks) and trying to bore directly to my needed answer to see if further effort is justified.

Thanks for the quick response. I hope it helps clarify what I am looking for. Sorry for not being clearer.

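The "paging through the network" model described earlier in this message (a read pulls in a read-only copy, a write invalidates every other copy) can be sketched in a few lines of Python. This is only a toy under stated assumptions, not Ron Minnich's implementation and not how SCI hardware works: the names `PageDirectory`, `read`, `write`, `holders` and `transfers` are hypothetical, and the network is simulated by in-process dictionaries.

```python
class PageDirectory:
    """Toy write-invalidate page sharing; all names here are hypothetical."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.home = {}                              # page -> latest contents
        self.copies = {n: {} for n in self.nodes}   # node -> {page: cached contents}
        self.transfers = 0                          # simulated page moves over the net

    def read(self, node, page):
        cache = self.copies[node]
        if page not in cache:
            # Miss: fetch a copy. Several nodes may hold read copies at once.
            cache[page] = self.home.get(page, 0)
            self.transfers += 1
        return cache[page]

    def write(self, node, page, value):
        # A write invalidates every other node's copy of the page.
        for other in self.nodes:
            if other != node:
                self.copies[other].pop(page, None)
        if page not in self.copies[node]:
            # The writer needs the page locally before it can modify it.
            self.transfers += 1
        self.copies[node][page] = value
        self.home[page] = value

    def holders(self, page):
        """Nodes that currently hold a valid copy of the page."""
        return sorted(n for n in self.nodes if page in self.copies[n])
```

In this toy, two nodes that take turns writing the same page pay one simulated transfer per write, which is the "simultaneous reading and writing of the same memory" case the model handles poorly.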
--Jed http://www.webstart.com/cc/jed-signature.html From owner-freebsd-smp Sat Jun 22 14:33:15 1996 Return-Path: owner-smp Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id OAA10884 for smp-outgoing; Sat, 22 Jun 1996 14:33:15 -0700 (PDT) Received: from phaeton.artisoft.com (phaeton.Artisoft.COM [198.17.250.211]) by freefall.freebsd.org (8.7.5/8.7.3) with SMTP id OAA10865 for ; Sat, 22 Jun 1996 14:33:10 -0700 (PDT) Received: (from terry@localhost) by phaeton.artisoft.com (8.6.11/8.6.9) id OAA22764; Sat, 22 Jun 1996 14:27:35 -0700 From: Terry Lambert Message-Id: <199606222127.OAA22764@phaeton.artisoft.com> Subject: Re: SMP version? To: jed@webstart.com (James E. [Jed] Donnelley) Date: Sat, 22 Jun 1996 14:27:35 -0700 (MST) Cc: rminnich@sarnoff.com, thomaspf@microsoft.com, davidg@Root.COM, smp@freebsd.org, jed@llnl.gov, mail@ppgsoft.com In-Reply-To: <199606220714.AAA27881@aimnet.com> from "James E. [Jed] Donnelley" at Jun 22, 96 00:14:28 am X-Mailer: ELM [version 2.4 PL24] MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: owner-smp@freebsd.org X-Loop: FreeBSD.org Precedence: bulk This is *VERY* exciting stuff! > As you can learn in a bit greater detail below, we at LLNL are > considering using an open version of Unix (e.g. FreeBSD or Linux) > for an SMP running on multiple Intel processors connected > via Scalable Coherent Interface: > > http://www.cmpcmm.com/cc/standards.html#SCI Ah. I didn't realize that LAMP used SCI (I've heard of LAMP). It looks like what you are talking about is cache-miss-based transport-using page fetching... otherwise known as a distributed cache coherency system. 8-). The difference is that the memory sharing is implemented by cache miss instead of by explicit reference, right, so it's transparent? > I was referred to your work as above. I did look at your > Web page as noted. The focus there seems to be on message passing > (e.g. MPI?). Did I read that incorrectly? 
We are currently
> focusing on shared memory. As you will read below, our
> project will only be worthwhile if we can run a multiprocessing
> application with multiple processors sharing a common memory
> image (different register sets - e.g. as the Unicos model).
> We are not interested in pursuing "virtual" shared memory
> at this time (though I would be interested to hear of any
> work you have done in this area - particularly performance
> studies).

The Sarnoff work is in cluster computing; this is, indeed, different, since it implies some scheduling and other asymmetry that, judging from the WWW reference I could find, SCI would not have.

There are two implementations of DSM (distributed shared memory, for the list archive readers) for FreeBSD. Probably the best known is the modified NFS with distributed cache coherency. A miss from the vnode pager on the remote NFS mounted vnode causes a page replacement via the net. This is a lot less ambitious than an SCI implementation, and probably performs at a lower level -- though not significantly lower, since you can argue transport latency.

> We are trying to determine how much work it will be to get
> "there" from "here" using SCI (over our in-house developed
> optical network). I have previously developed such an
> operating system from scratch, but would naturally hope
> to be able to get such a system running from a FreeBSD
> or Linux base with much (!) less effort. Any thoughts
> from your experience that you would be willing to share
> would be greatly appreciated.

I guess I'm still a little confused where "there" ends up being... are you interested in providing SCI interconnect between SMP boxes, or are you interested in SCI interconnect of uniprocessor systems in order to *build* SMP boxes... or are you trying to build *large* SMP boxes from multiple small SMP boxes, etc.?

Arguably, from the descriptions of SCI, it looks like you could build a large scale distributed dataflow architecture...
is this your intent, or are you working on LAMP, etc.?

I think FreeBSD would be a good choice here for a number of reasons, since all of these are possible directions from the existing code base.

Actually, someone (probably John Dyson) needs to write up a VM architecture description; here are some high points, however:

o	Unified VM/buffer cache

	Lack of cache unification on a system would be, I think, a primary obstacle to implementing SCI coherency. You would need to implement local coherency as well so that a buffer page miss did the right thing. One of the biggest benefits is the avoidance of a bmap() for each kernel reference of user pages.

o	Memory pages are referenced from files by vnode/offset

	This reference model has advantages for cache-based distributed reference; the SCI interconnect could conceivably be implemented as a file system layer using the vnode pager; this would not be the most efficient implementation, but it would be an easy-to-approach prototype interface to let you hit the ground running.

	In addition, though the vnode/offset mapping model has a number of drawbacks relative to premature page discarding (which are solvable, given some work on /sys/kern/vfs_subr.c to kill vclean), it would be relatively easy to add Sun-style VOP_GETPAGE, VOP_PUTPAGE operations to the FS for reference-based cache miss detection (based on the SCI transport indication of a stale page).

FreeBSD uses a modified zone allocation policy for kernel memory allocation. Each call to the kernel "malloc" routine takes a zone designator, similar to that used by the Mach VM system. The zone allocation takes place in what are, effectively, SLAB page-based allocations (using kmem_malloc). It isn't a real full SLAB allocation because of bitmap embedding, but it's close enough that conversion would be pretty simple.

The use of a zone-based SLAB allocator is actually a significant win over a standard SLAB allocator because of object type memory persistence being relatively equivalent anywhere in a zone. It could be improved by providing allocator persistence hints, or by segmenting the address space based on, for instance, a one byte segment identification decode, or simple short/medium/long tagging, but as it is, the zoning provides significant protection from kernel memory fragmentation on non-page boundaries (which you might see with a standard SLAB allocator, such as those used by Solaris and SVR4). FreeBSD, admittedly, could use some work on SLAB management, but that's trivial code, on the order of hash management (ie: transcription of Knuth into code, like everyone else does it).

In addition, the separation into zones allows you to flag the zone identifiers (which has not been done in the current code) to determine whether the allocated resource is local to a processor or should be allocated globally. This is a potentially significant win for scalability.

The loss in scalability of Intel processors which led to the 5 bit APIC ID limitation was the standard "diminishing returns" argument for bus contention; however Sequent was able to overcome this limitation with a clever design, which I don't think gets sufficient credit for the MP case in the Vahalia book.

What Sequent did was establish a per processor page pool with high and low water marks, from which pages are preferentially taken for a processor's page allocation requests. The page pool is refilled at the low water mark, or emptied at the high water mark. This page pool banding means that THE PROCESSOR DOES NOT NEED TO HOLD THE GLOBAL MUTEX TO GET PAGES. This allowed Sequent to hit the full 32 processor APIC limit without significantly damaging their scalability at the traditionally predicted 8 processor limit.

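The per processor page pool idea above can be illustrated with a small Python simulation. This is a sketch under stated assumptions, not Sequent's or FreeBSD's code: the class names, the water mark values (4 and 12), and the choice to refill or drain to the midpoint are all made up for illustration. It only shows that the global lock is taken when a water mark is crossed, not on every allocation.

```python
import threading


class GlobalPagePool:
    """Shared pool guarded by one global lock (hypothetical names)."""

    def __init__(self, npages):
        self.lock = threading.Lock()
        self.free_pages = list(range(npages))
        self.lock_acquisitions = 0      # how often the global mutex was taken

    def take(self, n):
        with self.lock:
            self.lock_acquisitions += 1
            grabbed = self.free_pages[:n]
            self.free_pages = self.free_pages[n:]
            return grabbed

    def give(self, pages):
        with self.lock:
            self.lock_acquisitions += 1
            self.free_pages.extend(pages)


class PerCpuPagePool:
    """Pool owned by a single processor, so it needs no lock of its own."""

    def __init__(self, global_pool, low=4, high=12):
        self.global_pool = global_pool
        self.low = low
        self.high = high
        self.target = (low + high) // 2   # assumed refill/drain level
        self.local = []

    def alloc(self):
        if len(self.local) <= self.low:
            # Low water mark reached: refill from the global pool in one batch.
            want = self.target - len(self.local)
            self.local.extend(self.global_pool.take(want))
        if not self.local:
            raise MemoryError("out of pages")
        return self.local.pop()

    def free(self, page):
        self.local.append(page)
        if len(self.local) >= self.high:
            # High water mark reached: hand the surplus back in one batch.
            surplus = len(self.local) - self.target
            self.global_pool.give(self.local[-surplus:])
            del self.local[-surplus:]
```

With these assumed settings, sixteen back-to-back allocations take the global lock four times rather than sixteen; the batch size is what trades lock traffic against pages sitting idle in a processor's local pool.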
Finally, there is work under way (by John Dyson) to support shared
process address space; this is similar to the Unicos model, which you
reference -- though, obviously, you would need to deal with the hard
page table entries on multiple processors to trigger the SCI based
page-level cache coherency.  This started with a Sequent-style "sfork"
implementation.

John is in possession of some kernel threading code (from another
engineer) which operates on a partial sharing model, that he is
converting to a full sharing model: he said that he thinks our cost
per thread will be the cost of a process in the kernel (proc, upages,
minor etc.), saving the per process page table pages using VM space
sharing.

So I think no matter what direction you are actually going in, FreeBSD
is pretty much poised to help you out.

(John, David, Poul, folks -- correct me if I've mangled something)

Regards,
Terry Lambert
terry@lambert.org
---
Any opinions in this posting are my own and not those of my present
or previous employers.

From owner-freebsd-smp Sat Jun 22 16:14:05 1996
Return-Path: owner-smp
Received: (from root@localhost) by freefall.freebsd.org (8.7.5/8.7.3) id QAA15169 for smp-outgoing; Sat, 22 Jun 1996 16:14:05 -0700 (PDT)
Received: from ref.tfs.com (ref.tfs.com [140.145.254.251]) by freefall.freebsd.org (8.7.5/8.7.3) with ESMTP id QAA15164 for ; Sat, 22 Jun 1996 16:14:03 -0700 (PDT)
Received: (from julian@localhost) by ref.tfs.com (8.7.5/8.7.3) id QAA13836; Sat, 22 Jun 1996 16:13:48 -0700 (PDT)
Message-Id: <199606222313.QAA13836@ref.tfs.com>
Subject: Re: SMP version?
To: jed@webstart.com (James E. [Jed] Donnelley)
Date: Sat, 22 Jun 1996 16:13:47 -0700 (PDT)
From: "JULIAN Elischer"
Cc: davidg@Root.COM, smp@freebsd.org, jed@llnl.gov
In-Reply-To: <199606220250.TAA18003@aimnet.com> from "James E.
[Jed] Donnelley" at Jun 21, 96 07:50:57 pm
X-Mailer: ELM [version 2.4 PL25 ME8b]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-smp@freebsd.org
X-Loop: FreeBSD.org
Precedence: bulk

>
> At 05:38 PM 6/21/96 -0700, you wrote:
> >>I am looking for a symmetric multiprocessing version
> >>of Unix that I can get sources to. Do you know if there
> >>is such a version of FreeBSD (or Linux? ;-) available?
> >>
> >>Sorry for the "out of the blue" message. My application
> >>is a Scalable Coherent Interface (SCI) shared memory
> >>multiprocessor research project based on some unique
> >>optical networking technology at Lawrence Livermore
> >>National Laboratory.
> >
> > We (the FreeBSD project) are just starting our work on SMP support. We
> >have a working single-kernel-lock implementation and if the sources aren't
> >available for it now, they will be in a week or two. It's my understanding
> >that similar progress has been made in Linux, but I'm not associated with
> >that effort so I don't know its status.
> > If you're interested, I can put you in touch with the people working on
> >it...in fact, you can send email to smp@freebsd.org to contact the appropriate
> >people.
> > Good luck on your project.
>
> Do you (either of you) happen to know if there is a facility in
> this system (FreeBSD for an SMP) for a single shared memory
> "multiprocess." That is for multiprocessing on a single shared
> memory image (with separate register sets)? Is there a defined
> Posix interface to such memory sharing (beyond the mechanism that
> I have seen in System V)? Any documentation that you could point
> us at on this topic would be appreciated.

yes and no....
there is rfork() which allows a process to share all its existing
address space with its child.
in the SMP version the two processes could be on separate processors
of course..
there are changes coming up to allow the complete sharing of address
space including regions not yet allocated, but they are still
germinating..

failing that, some changes were posted a few days ago that allowed
shared memory operations between independently schedulable processes
using "Threads" semantics.  these were also 'preliminary'

Ron Minnich and several others are doing some parallel computing work..

>
> I am trying to estimate the cost of using FreeBSD to support such shared
> memory multiprocessing on an Intel/SCI based shared memory multiprocessor.
> We need to have the ability to run a single "job" using shared
> memory on multiple processes to make the effort worthwhile.
>
> Assuming this sort of work would make sense, is there a community
> that we could collaborate with and potentially contribute code to?

I'm sure there is..

>
> Thanks for any reply.
>
> --Jed http://www.webstart.com/cc/jed-signature.html
>
>