From owner-freebsd-smp Sun Jul 1 2: 0:41 2001 Delivered-To: freebsd-smp@freebsd.org Received: from sneakerz.org (sneakerz.org [216.33.66.254]) by hub.freebsd.org (Postfix) with ESMTP id 4F23E37B401; Sun, 1 Jul 2001 02:00:38 -0700 (PDT) (envelope-from bright@sneakerz.org) Received: by sneakerz.org (Postfix, from userid 1092) id 7AD675D010; Sun, 1 Jul 2001 04:00:34 -0500 (CDT) Date: Sun, 1 Jul 2001 04:00:34 -0500 From: Alfred Perlstein To: "E.B. Dreger" Cc: "Michael C . Wu" , freebsd-smp@freebsd.org, freebsd-hackers@freebsd.org Subject: Re: Quick question: AIO / SMP / process-based threading Message-ID: <20010701040034.G84523@sneakerz.org> References: <20010630222829.E84523@sneakerz.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2i In-Reply-To: ; from eddy+public+spam@noc.everquick.net on Sun, Jul 01, 2001 at 03:51:10AM +0000 Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org * E.B. Dreger [010630 22:51] wrote: > > Date: Sat, 30 Jun 2001 22:28:29 -0500 > > From: Alfred Perlstein > > > > Can you point to some specific PRs about this or crashdumps before > > (or at least while) taking pot shots at the AIO implementation? > > In the mean time, until somebody can substantiate that claim... is AIO SMP > safe? I see that aiocb.aio_buf is declared as "volatile", so I would > presume so. > > I just want to be sure that, if an aio call runs on one CPU, another CPU > can access *aio_buf and be 100% certain that the data are coherent. > It should be, if you experience any problems please let us know. > aio_buf = mmap() using MAP_HASSEMAPHORE -- good idea, bad idea, pointless? Pointless I think. -- -Alfred Perlstein [alfred@freebsd.org] Ok, who wrote this damn function called '??'? And why do my programs keep crashing in it? 
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-smp" in the body of the message From owner-freebsd-smp Sun Jul 1 3:51:23 2001 Delivered-To: freebsd-smp@freebsd.org Received: from srv1.cosmo-project.de (srv1.cosmo-project.de [213.83.6.106]) by hub.freebsd.org (Postfix) with ESMTP id 3377E37B406; Sun, 1 Jul 2001 03:51:16 -0700 (PDT) (envelope-from ticso@mail.cicely.de) Received: from mail.cicely.de (cicely20 [10.1.1.22]) by srv1.cosmo-project.de (8.11.0/8.11.0) with ESMTP id f61Aow671409; Sun, 1 Jul 2001 12:51:01 +0200 (CEST) Received: (from ticso@localhost) by mail.cicely.de (8.11.0/8.11.0) id f61ApMm22390; Sun, 1 Jul 2001 12:51:22 +0200 (CEST) Date: Sun, 1 Jul 2001 12:51:16 +0200 From: Bernd Walter To: "E.B. Dreger" Cc: Alfred Perlstein , "Michael C . Wu" , freebsd-smp@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG Subject: Re: Quick question: AIO / SMP / process-based threading Message-ID: <20010701125116.B22242@cicely20.cicely.de> References: <20010630222829.E84523@sneakerz.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: ; from eddy+public+spam@noc.everquick.net on Sun, Jul 01, 2001 at 03:51:10AM +0000 Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org On Sun, Jul 01, 2001 at 03:51:10AM +0000, E.B. Dreger wrote: > > Date: Sat, 30 Jun 2001 22:28:29 -0500 > > From: Alfred Perlstein > > > > Can you point to some specific PRs about this or crashdumps before > > (or at least while) taking pot shots at the AIO implementation? > > In the mean time, until somebody can substantiate that claim... is AIO SMP > safe? I see that aiocb.aio_buf is declared as "volatile", so I would > presume so. 
volatile is not an inter-CPU mechanism. It only stops the compiler from
caching a value in a register; it does nothing about memory caching, which
happens outside the compiler's control. If you want inter-CPU coherency you
have to handle both.

> I just want to be sure that, if an aio call runs on one CPU, another CPU
> can access *aio_buf and be 100% certain that the data are coherent.

If you set up *aio_buf from the same execution context in which you start
the aio_ call, you will be safe. That means that if the kernel decides to do
the work behind the scenes on another CPU, it has to ensure coherency. While
the aio_ call is in progress you shouldn't modify *aio_buf anyway, so that's
not an issue.

-- 
B.Walter              COSMO-Project         http://www.cosmo-project.de
ticso@cicely.de       Usergroup             info@cosmo-project.de

From owner-freebsd-smp Sun Jul 1 5:55:25 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from segfault.kiev.ua (segfault.kiev.ua [193.193.193.4]) by hub.freebsd.org (Postfix) with ESMTP id 3AD8137B401; Sun, 1 Jul 2001 05:55:16 -0700 (PDT) (envelope-from netch@iv.nn.kiev.ua)
Received: (from uucp@localhost) by segfault.kiev.ua (8) with UUCP id PVB32996; Sun, 1 Jul 2001 15:54:46 +0300 (EEST) (envelope-from netch@iv.nn.kiev.ua)
Received: (from netch@localhost) by iv.nn.kiev.ua (8.11.4/8.11.4) id f61CnX801110; Sun, 1 Jul 2001 15:49:33 +0300 (EEST) (envelope-from netch)
Date: Sun, 1 Jul 2001 15:49:33 +0300
From: Valentin Nechayev
To: "E.B. Dreger"
Cc: Terry Lambert, Chris Costello, freebsd-smp@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG
Subject: Re: libc_r locking... why?
Message-ID: <20010701154933.B376@iv.nn.kiev.ua>
References: <3B3C3346.E5496485@mindspring.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: ; from eddy+public+spam@noc.everquick.net on Fri, Jun 29, 2001 at 03:19:47PM +0000
X-42: On
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Fri, Jun 29, 2001 at 15:19:47, eddy+public+spam (E.B. Dreger) wrote about "Re: libc_r locking... why?":

> Running processes on multiple CPUs is one goal.
>
> [ libc_r locks don't assert "lock", not MP-safe ]
>
> So the "lock" prefix is the only way to enforce cache coherency?
> Do you have handy a good reference on IPIs, other than the kernel
> APIC code (and, of course, Google and NorthernLight searches)?

You are talking about the i386 architecture, aren't you?
AFAIK all current and former processors of that family have strict write
ordering, and cache coherency is provided for all data. Once a lock is
obtained, all previous writes have completed and the caches of the other
processors have been notified of your changes.
Use locks and don't worry about cache coherency; it is already taken care of.

> Good to know, but, I'm not using libc_r... I was looking at
> existing code to help me double-check mine as I go. I'm
> synchronizing processes with a "giant lock" token that each
> process cooperatively passes to the next... to simplify:

Did you check your code against something proven to work between
processes, e.g. SysV IPC semaphores?
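For reference, a lock of that kind could look roughly like the following.
This is only a sketch under stated assumptions: the lock_*() names and
union lock_semun are made up for illustration, while semget(), semctl()
and semop() are the standard SysV calls. The semaphore would be created
before fork() so that every cooperating process inherits the same id.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <assert.h>

/*
 * Hypothetical helper type: some systems expect the caller to supply
 * the semctl() argument union, so the sketch declares its own.
 */
union lock_semun {
	int val;
	struct semid_ds *buf;
	unsigned short *array;
};

/* Create a one-semaphore set, initialised to 1 (unlocked). */
static int
lock_create(void)
{
	union lock_semun arg;
	int id;

	id = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
	if (id == -1)
		return (-1);
	arg.val = 1;
	if (semctl(id, 0, SETVAL, arg) == -1) {
		semctl(id, 0, IPC_RMID);
		return (-1);
	}
	return (id);
}

/* Apply one operation to semaphore 0 of the set. */
static int
lock_op(int id, short delta, short flags)
{
	struct sembuf op;

	assert(id >= 0);
	op.sem_num = 0;
	op.sem_op = delta;
	op.sem_flg = flags;
	return (semop(id, &op, 1));
}

/* SEM_UNDO lets the kernel back out the operation if the holder exits. */
static int lock_acquire(int id) { return (lock_op(id, -1, SEM_UNDO)); }
static int lock_try(int id) { return (lock_op(id, -1, SEM_UNDO | IPC_NOWAIT)); }
static int lock_release(int id) { return (lock_op(id, 1, SEM_UNDO)); }
static int lock_destroy(int id) { return (semctl(id, 0, IPC_RMID)); }
```

A process would call lock_acquire() before touching the shared pages and
lock_release() afterwards; lock_try() fails instead of sleeping when the
lock is already held. Running the home-grown token scheme and this one
side by side on the same workload is one way to cross-check the former.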
/netch

From owner-freebsd-smp Sun Jul 1 5:55:47 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from segfault.kiev.ua (segfault.kiev.ua [193.193.193.4]) by hub.freebsd.org (Postfix) with ESMTP id 6B96937B405; Sun, 1 Jul 2001 05:55:37 -0700 (PDT) (envelope-from netch@iv.nn.kiev.ua)
Received: (from uucp@localhost) by segfault.kiev.ua (8) with UUCP id PVH33005; Sun, 1 Jul 2001 15:55:21 +0300 (EEST) (envelope-from netch@iv.nn.kiev.ua)
Received: (from netch@localhost) by iv.nn.kiev.ua (8.11.4/8.11.4) id f61Cqun01162; Sun, 1 Jul 2001 15:52:56 +0300 (EEST) (envelope-from netch)
Date: Sun, 1 Jul 2001 15:52:56 +0300
From: Valentin Nechayev
To: "E.B. Dreger"
Cc: Bernd Walter, freebsd-smp@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG
Subject: Re: libc_r locking... why?
Message-ID: <20010701155256.C376@iv.nn.kiev.ua>
References: <20010629211818.A17309@cicely20.cicely.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: ; from eddy+public+spam@noc.everquick.net on Fri, Jun 29, 2001 at 07:56:40PM +0000
X-42: On
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

Fri, Jun 29, 2001 at 19:56:40, eddy+public+spam (E.B. Dreger) wrote about "Re: libc_r locking... why?":

> > A Token may not be enough because writes may be reordered.

AFAIK that is false for the i386 architecture. Please correct me if needed.

> Here is where I want to learn more about cache coherency, inter-processor
> interrupts, and APIC programming. I'd imagine that the latter two are
> lower-level than I'd be using, but I still want to know the "how and why"
> beneath the scenes.

Have you tried reading the MP chipset white papers?
/netch To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-smp" in the body of the message From owner-freebsd-smp Sun Jul 1 7: 7:39 2001 Delivered-To: freebsd-smp@freebsd.org Received: from srv1.cosmo-project.de (srv1.cosmo-project.de [213.83.6.106]) by hub.freebsd.org (Postfix) with ESMTP id 8605137B403; Sun, 1 Jul 2001 07:07:31 -0700 (PDT) (envelope-from ticso@mail.cicely.de) Received: from mail.cicely.de (cicely20 [10.1.1.22]) by srv1.cosmo-project.de (8.11.0/8.11.0) with ESMTP id f61E7H672239; Sun, 1 Jul 2001 16:07:19 +0200 (CEST) Received: (from ticso@localhost) by mail.cicely.de (8.11.0/8.11.0) id f61E7it22706; Sun, 1 Jul 2001 16:07:44 +0200 (CEST) Date: Sun, 1 Jul 2001 16:07:38 +0200 From: Bernd Walter To: Valentin Nechayev Cc: "E.B. Dreger" , Bernd Walter , freebsd-smp@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG Subject: Re: libc_r locking... why? Message-ID: <20010701160738.A22683@cicely20.cicely.de> References: <20010629211818.A17309@cicely20.cicely.de> <20010701155256.C376@iv.nn.kiev.ua> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20010701155256.C376@iv.nn.kiev.ua>; from netch@iv.nn.kiev.ua on Sun, Jul 01, 2001 at 03:52:56PM +0300 Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org On Sun, Jul 01, 2001 at 03:52:56PM +0300, Valentin Nechayev wrote: > Fri, Jun 29, 2001 at 19:56:40, eddy+public+spam (E.B. Dreger) wrote about "Re: libc_r locking... why?": > > > > A Token may not be enough because writes may be reordered. > > AFAIK it's false for i386 architecture. Please correct me if needed. In -currents NOTEs I found this: # CPU_DISABLE_5X86_LSSER disables load store serialize (i.e. enables # reorder). This option should not be used if you use memory mapped # I/O device(s). 
A good sign that it may at least be possible on some CPUs.
OK, that's not an MP-capable CPU.
What you need is an x86 guru, or to assume the worst, which is the best
approach anyway: otherwise you can't use the code on other machines, and
programs sometimes get very old.

I also don't know what the following is:
# CPU_WT_ALLOC enables write allocation on Cyrix 6x86/6x86MX and AMD
# K5/K6/K6-2 cpus.

> > Here is where I want to learn more about cache coherency, inter-processor
> > interrupts, and APIC programming. I'd imagine that the latter two are
> > lower-level than I'd be using, but I still want to know the "how and why"
> > beneath the scenes.
>
> Did you try to read MP chipsets white papers?

I can't say very much about coherency problems on x86, but I can
say for sure that you have to worry about this on every other MP
platform, including IA64.

-- 
B.Walter              COSMO-Project         http://www.cosmo-project.de
ticso@cicely.de       Usergroup             info@cosmo-project.de

From owner-freebsd-smp Sun Jul 1 8:53:20 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from a.mx.everquick.net (a.mx.everquick.net [216.89.137.3]) by hub.freebsd.org (Postfix) with ESMTP id BB32237B405; Sun, 1 Jul 2001 08:53:12 -0700 (PDT) (envelope-from eddy+public+spam@noc.everquick.net)
Received: from localhost (eddy@localhost) by a.mx.everquick.net (8.10.2/8.10.2) with ESMTP id f61FqlO01273; Sun, 1 Jul 2001 15:52:47 GMT
X-EverQuick-No-Abuse: Report any e-mail abuse to
Date: Sun, 1 Jul 2001 15:52:47 +0000 (GMT)
From: "E.B. Dreger"
To: Bernd Walter
Cc: Valentin Nechayev, freebsd-smp@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG
Subject: Re: libc_r locking... why?
In-Reply-To: <20010701160738.A22683@cicely20.cicely.de> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org > Date: Sun, 1 Jul 2001 16:07:38 +0200 > From: Bernd Walter > In -currents NOTEs I found this: > # CPU_DISABLE_5X86_LSSER disables load store serialize (i.e. enables > # reorder). This option should not be used if you use memory mapped > # I/O device(s). > > A good sign that it may be at least possible on some CPUs. > OK that's not an MP capable CPU. This is an encouraging starting point... at least the issue is similar. It's also in 4.3-R, so I can grep kernel source. > What you need is an x86 guru or asume worst which will be the best > thing anyway - otherwise you can't use it on other machines and > sometimes programms get very old. I thought that one had to assert "lock" to guarantee cache coherency... an ugly hack would be to movl (%pagebase,%index,1),%eax lock movl %eax,(%pagebase,%index,1) for every cache line in a page. Ugly and slow... I'd much rather find out if there's a way to tell the chipset "flush all pending writes in this block, and ensure that both CPUs have the same view". > I also don't know what the following is: > # CPU_WT_ALLOC enables write allocation on Cyrix 6x86/6x86MX and AMD > # K5/K6/K6-2 cpus. Hmmmm. Being concerned about x86 SMP, I've overlooked anything non-Intel. Might be worth checking out what's there... I've oft learned what I wanted via an indirect route. > > Did you try to read MP chipsets white papers? No. I guess that I can give that a shot. > I can't say very much about coherency problems on x86 but I can > say for shure that you have to worry about this on every other MP > platform including IA64. 
Even if it's a non-issue on x86, I'd rather use macros to insert proper
code on ia64, axp (if I ever port to that), and go to nothing on x86 (if
that is indeed the correct behavior).

Looks like I need to do some digging on bus snooping, cache coherency,
read/write reordering, MTRRs, and APICs. :-)

Eddy

---------------------------------------------------------------------------
Brotsman & Dreger, Inc.
EverQuick Internet Division

Phone: +1 (316) 794-8922 Wichita/(Inter)national
Phone: +1 (785) 865-5885 Lawrence
---------------------------------------------------------------------------

Date: Mon, 21 May 2001 11:23:58 +0000 (GMT)
From: A Trap
To: blacklist@brics.com
Subject: Please ignore this portion of my mail signature.

These last few lines are a trap for address-harvesting spambots. Do NOT
send mail to , or you are likely to be blocked.

From owner-freebsd-smp Sun Jul 1 10:16: 9 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from a.mx.everquick.net (a.mx.everquick.net [216.89.137.3]) by hub.freebsd.org (Postfix) with ESMTP id 0B7A837B407; Sun, 1 Jul 2001 10:16:00 -0700 (PDT) (envelope-from eddy+public+spam@noc.everquick.net)
Received: from localhost (eddy@localhost) by a.mx.everquick.net (8.10.2/8.10.2) with ESMTP id f61HFxm01978; Sun, 1 Jul 2001 17:15:59 GMT
X-EverQuick-No-Abuse: Report any e-mail abuse to
Date: Sun, 1 Jul 2001 17:15:59 +0000 (GMT)
From: "E.B. Dreger"
To: freebsd-smp@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG
Subject: Resolved: AIO on SMP / locking / libc_r
In-Reply-To:
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

(Responding to myself and merging two threads)

> Date: Sun, 1 Jul 2001 15:52:47 +0000 (GMT)
> From: E.B. Dreger
> > I can't say very much about coherency problems on x86 but I can
> > say for sure that you have to worry about this on every other MP
> > platform including IA64.
>
> Even if it's a non-issue on x86, I'd rather use macros to insert proper
> code on ia64, axp (if I ever port to that), and go to nothing on x86 (if
> that is indeed the correct behavior).
>
> Looks like I need to do some digging on bus snooping, cache coherency,
> read/write reordering, MTRRs, and APICs. :-)

Once pointed in the right direction, a little Google searching turned up a
message from three years ago today, by none other than Terry Lambert. SMP
synchronization is automatic, and Intel supports full MESI (modified,
exclusive, shared, and invalid) coherence.

Terry also mentioned that MAP_HASSEMAPHORE was useful on MEI architectures
to flag the need for coherency. MAP_HASSEMAPHORE is apparently a NOP flag
on the x86: pointless there, but mandatory on MEI architectures.

Netch clarified off-list (and others on-list?) that memory IO is strictly
in-order on the x86.

I think that I have a handle on things now. I'll try AIO on pages with
MAP_HASSEMAPHORE set (pointless now but no-cost insurance for other
architectures), and handle locking between processes as I have been. If
nothing else, it gives me a handy string for which to grep in my code. ;-)

Not yet sure how to handle reordering... If ia64 reorders, I guess that
I'll worry about that when FreeBSD runs on ia64. :-)

Thanks to all for putting up with this "SMP newbie". Hopefully this post
will help summarize everything for any lurkers who were scratching their
heads in the same way that I was.

Thanks again to everyone for your help,
Eddy

---------------------------------------------------------------------------
Brotsman & Dreger, Inc.
EverQuick Internet Division

Phone: +1 (316) 794-8922 Wichita/(Inter)national
Phone: +1 (785) 865-5885 Lawrence
---------------------------------------------------------------------------

From owner-freebsd-smp Sun Jul 1 22:32:45 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from sneakerz.org (sneakerz.org [216.33.66.254]) by hub.freebsd.org (Postfix) with ESMTP id 38D9937B405 for ; Sun, 1 Jul 2001 22:32:24 -0700 (PDT) (envelope-from bright@sneakerz.org)
Received: by sneakerz.org (Postfix, from userid 1092) id AD1305D010; Mon, 2 Jul 2001 00:32:13 -0500 (CDT)
Date: Mon, 2 Jul 2001 00:32:13 -0500
From: Alfred Perlstein
To: smp@freebsd.org
Subject: per cpu runqueues, cpu affinity and cpu binding.
Message-ID: <20010702003213.I84523@sneakerz.org>
Mime-Version: 1.0
Content-Type: multipart/mixed; boundary="2oS5YaxWCcQjTEyO"
Content-Disposition: inline
User-Agent: Mutt/1.2i
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
X-Loop: FreeBSD.org

--2oS5YaxWCcQjTEyO
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

I've been playing around with using per-cpu run queues to achieve
processor affinity, as well as to give the ability to bind a process to
a particular cpu. Attached to this mail you'll find the diff to do this,
along with a program that binds processes to a cpu.

There are a couple of things I want to note about this work.

) The cpu affinity seems to actually buy performance: I've seen
  seconds taken off user/sys time when doing kernel compiles with
  this. Of course, if people were to provide their own micro-benchmarks
  it would assist in determining the utility of this work.

) The binding is not very flexible. You can only bind to one cpu,
  not a group of cpus, nor can you prohibit a process from running
  on any particular cpu. Suggestions would be appreciated.

) It somewhat butchers the nice functional interface that Jake did
  because it accesses a global: the per-cpu queues are a global.
  I plan on fixing this.

) Input on how affinity/binding could be improved (along with code
  examples) would be appreciated. Please don't say "I would do it
  this way" unless your mail happens to contain an algorithm that
  clearly maps to some code. :)

As currently implemented, unbound processes are doubly linked: an
unbound process is on both the queue of the cpu it last ran on and the
global queue. A certain weight, currently 4 * RQ_PPQ (see the mod to
runq_choose()), tips the scales in favor of running a process that last
ran on a particular cpu. This could be adjusted in order to boost either
higher priority processes or a process that last ran on the cpu pulling
it off the runqueue.

Bound processes only exist on the per-cpu queue that they are bound to.

What I'd actually prefer is no global queue: when schedcpu() is called
it would balance out the processes amongst the per-cpu queues, or if a
particular cpu realized it was stuck with a lot of high or low priority
processes while another cpu was occupied with the opposite, it would
attempt to migrate or steal, depending on the type of imbalance going
on. Suggestions on how to do this would also be appreciated. :)

The attached bindcpu.c program needs sys/pioctl.h installed to
compile. Once it is compiled and the kernel is rebuilt (don't forget
modules, as the size of proc has changed), you can use it to bind
processes like so:

./bindcpu 1 # bind curproc/pid to cpu 1
./bindcpu -1 # unbind

have fun.
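The weighting in runq_choose() described above can be looked at in
isolation. What follows is a user-space sketch, not the kernel code:
pick_queue(), enum pick, CPU_BIAS and the RQ_PPQ value of 4 are
assumptions made for the example; only the comparison itself mirrors the
attached diff. Both arguments are run queue indexes as returned by
runq_findbit(), where a lower index means higher priority and -1 means
the queue is empty.

```c
#include <assert.h>

#define RQ_PPQ   4              /* assumed value: priorities per queue */
#define CPU_BIAS (4 * RQ_PPQ)   /* weight in favor of the local queue */

enum pick { PICK_IDLE, PICK_CPU, PICK_GLOBAL };

/*
 * Decide which queue the current cpu should pull from.
 * pricpu: best index on this cpu's queue; pri: best index on the
 * global queue.
 */
static enum pick
pick_queue(int pricpu, int pri)
{
	assert(pricpu >= -1 && pri >= -1);

	/* Prefer local work unless the global queue is better by more
	 * than CPU_BIAS slots. */
	if (pricpu != -1 && (pri == -1 || pricpu <= pri + CPU_BIAS))
		return (PICK_CPU);
	if (pri != -1)
		return (PICK_GLOBAL);
	/* Nothing runnable anywhere: run the idle process. */
	return (PICK_IDLE);
}
```

Raising CPU_BIAS favors affinity (a process keeps its cpu even when
better-priority work is waiting elsewhere); lowering it favors strict
priority order.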
-- -Alfred Perlstein [alfred@freebsd.org] Ok, who wrote this damn function called '??'? And why do my programs keep crashing in it? --2oS5YaxWCcQjTEyO Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="runq.diff" Index: fs/procfs/procfs_vnops.c =================================================================== RCS file: /home/ncvs/src/sys/fs/procfs/procfs_vnops.c,v retrieving revision 1.98 diff -u -r1.98 procfs_vnops.c --- fs/procfs/procfs_vnops.c 2001/05/25 16:59:04 1.98 +++ fs/procfs/procfs_vnops.c 2001/07/01 16:48:51 @@ -57,6 +57,7 @@ #include #include #include +#include #include #include #include @@ -68,6 +69,12 @@ #include +#ifdef SMP +#define NCPU_PRESENT mp_ncpus +#else +#define NCPU_PRESENT 1 +#endif + static int procfs_access __P((struct vop_access_args *)); static int procfs_badop __P((void)); static int procfs_close __P((struct vop_close_args *)); @@ -231,6 +238,7 @@ { struct pfsnode *pfs = VTOPFS(ap->a_vp); struct proc *procp, *p; + int cpu, srun; int error; int signo; struct procfs_status *psp; @@ -248,6 +256,32 @@ } switch (ap->a_command) { + case PIOCBIND: + cpu = *(int *)ap->a_data; + if (cpu < -1 || cpu >= NCPU_PRESENT) { + PROC_UNLOCK(procp); + return (EINVAL); + } + mtx_lock_spin(&sched_lock); + srun = (procp != curproc && +#ifdef SMP + procp->p_oncpu == NOCPU && /* idle */ +#endif + procp->p_stat == SRUN); + + if (srun) + remrunqueue(procp); + if (cpu == -1) { + procp->p_sflag &= ~PS_BOUND; + } else { + procp->p_sflag |= PS_BOUND; + procp->p_rqcpu = cpu; + } + if (srun) + setrunqueue(procp); + mtx_unlock_spin(&sched_lock); +printf("srun == %d, cpu == %d\n", srun, cpu); + break; case PIOCBIS: procp->p_stops |= *(unsigned int*)ap->a_data; break; Index: kern/kern_switch.c =================================================================== RCS file: /home/ncvs/src/sys/kern/kern_switch.c,v retrieving revision 1.15 diff -u -r1.15 kern_switch.c --- kern/kern_switch.c 2001/03/28 09:17:54 1.15 +++ kern/kern_switch.c 
2001/07/01 16:48:51 @@ -32,14 +32,18 @@ #include #include #include +#include #include #include +#include /* * Global run queue. */ static struct runq runq; -SYSINIT(runq, SI_SUB_RUN_QUEUE, SI_ORDER_FIRST, runq_init, &runq) +static struct runq *runqcpu; +SYSINIT(runq, SI_SUB_RUN_QUEUE, SI_ORDER_FIRST, runq_init, &runq); +SYSINIT(runqcpu, SI_SUB_RUN_QUEUE, SI_ORDER_FIRST, runqcpu_init, &runqcpu); /* * Wrappers which implement old interface; act on global run queue. @@ -54,12 +58,14 @@ int procrunnable(void) { - return runq_check(&runq); + + return (runq_check(&runqcpu[PCPU_GET(cpuid)]) || runq_check(&runq)); } void remrunqueue(struct proc *p) { + runq_remove(&runq, p); } @@ -154,7 +160,7 @@ runq_add(struct runq *rq, struct proc *p) { struct rqhead *rqh; - int pri; + int pri, cpu; mtx_assert(&sched_lock, MA_OWNED); KASSERT(p->p_stat == SRUN, ("runq_add: proc %p (%s) not SRUN", @@ -163,11 +169,33 @@ ("runq_add: proc %p (%s) already in run queue", p, p->p_comm)); pri = p->p_pri.pri_level / RQ_PPQ; p->p_rqindex = pri; + + if ((p->p_sflag & PS_BOUND) == 0) { + cpu = p->p_lastcpu; + if (cpu < 0 || cpu >= mp_ncpus) + cpu = PCPU_GET(cpuid); + p->p_rqcpu = cpu; + runq_setbit(rq, pri); + rqh = &rq->rq_queues[pri]; + CTR4(KTR_RUNQ, "runq_add: p=%p pri=%d %d rqh=%p", + p, p->p_pri.pri_level, pri, rqh); + TAILQ_INSERT_TAIL(rqh, p, p_procq); + } else { + CTR2(KTR_RUNQ, "runq_add: proc %p bound to cpu %d", + p, (int)p->p_rqcpu); + cpu = p->p_rqcpu; + } + + rq = &runqcpu[cpu]; + KASSERT(runq_find(rq, p) == 0, + ("runq_add: proc %p (%s) already in cpu (%d) run queue", + p, p->p_comm, cpu)); runq_setbit(rq, pri); + rqh = &rq->rq_queues[pri]; - CTR4(KTR_RUNQ, "runq_add: p=%p pri=%d %d rqh=%p", - p, p->p_pri.pri_level, pri, rqh); - TAILQ_INSERT_TAIL(rqh, p, p_procq); + CTR5(KTR_RUNQ, "runq_cpu_add: p=%p pri=%d %d rqh=%p cpu=%d", + p, p->p_pri.pri_level, pri, rqh, cpu); + TAILQ_INSERT_TAIL(rqh, p, p_proccpuq); } /* @@ -203,29 +231,53 @@ { struct rqhead *rqh; struct proc *p; - int pri; + 
int pri, pricpu, cpu; mtx_assert(&sched_lock, MA_OWNED); - if ((pri = runq_findbit(rq)) != -1) { + cpu = PCPU_GET(cpuid); + pricpu = runq_findbit(&runqcpu[cpu]); + pri = runq_findbit(rq); + CTR2(KTR_RUNQ, "runq_choose: pri=%d cpupri=%d", pri, pricpu); + if (pricpu != -1 && (pricpu <= pri + 4 * RQ_PPQ || pri == -1)) { + pri = pricpu; + rqh = &runqcpu[cpu].rq_queues[pri]; + } else if (pri != -1) { + rqh = &rq->rq_queues[pri]; + } else { + CTR1(KTR_RUNQ, "runq_choose: idleproc pri=%d", pri); + return (PCPU_GET(idleproc)); + } + p = TAILQ_FIRST(rqh); + KASSERT(p != NULL, ("runq_choose: no proc on busy queue")); + KASSERT(p->p_stat == SRUN, + ("runq_chose: process %d(%s) in state %d", p->p_pid, + p->p_comm, p->p_stat)); + CTR3(KTR_RUNQ, "runq_choose: pri=%d p=%p rqh=%p", pri, p, rqh); + + if ((p->p_sflag & PS_BOUND) == 0) { rqh = &rq->rq_queues[pri]; - p = TAILQ_FIRST(rqh); - KASSERT(p != NULL, ("runq_choose: no proc on busy queue")); - KASSERT(p->p_stat == SRUN, - ("runq_chose: process %d(%s) in state %d", p->p_pid, - p->p_comm, p->p_stat)); - CTR3(KTR_RUNQ, "runq_choose: pri=%d p=%p rqh=%p", pri, p, rqh); TAILQ_REMOVE(rqh, p, p_procq); if (TAILQ_EMPTY(rqh)) { CTR0(KTR_RUNQ, "runq_choose: empty"); runq_clrbit(rq, pri); } - return (p); + } else { + CTR2(KTR_RUNQ, "runq_choose: proc %p bound to cpu %d", + p, (int)p->p_rqcpu); } - CTR1(KTR_RUNQ, "runq_choose: idleproc pri=%d", pri); - - return (PCPU_GET(idleproc)); + cpu = p->p_rqcpu; + rq = &runqcpu[cpu]; + rqh = &rq->rq_queues[pri]; + TAILQ_REMOVE(rqh, p, p_proccpuq); + if (TAILQ_EMPTY(rqh)) { + CTR0(KTR_RUNQ, "runq_choose: cpu empty"); + runq_clrbit(rq, pri); + } + return (p); } +MALLOC_DEFINE(M_RUNQ, "runqueues", "Run queues"); + /* * Initialize a run structure. 
*/ @@ -239,6 +291,19 @@ TAILQ_INIT(&rq->rq_queues[i]); } +void +runqcpu_init(struct runq **rqp) +{ + struct runq *rq; + int i; + + MALLOC(rq, struct runq *, sizeof(*runqcpu) * mp_ncpus, M_RUNQ, + M_WAITOK); + *rqp = rq; + for (i = 0; i < mp_ncpus; i++) + runq_init(&rq[i]); +} + /* * Remove the process from the queue specified by its priority, and clear the * corresponding status bit if the queue becomes empty. @@ -247,10 +312,25 @@ runq_remove(struct runq *rq, struct proc *p) { struct rqhead *rqh; + struct runq *rqcpu; int pri; mtx_assert(&sched_lock, MA_OWNED); pri = p->p_rqindex; + rqcpu = &runqcpu[p->p_rqcpu]; + rqh = &rqcpu->rq_queues[pri]; + CTR4(KTR_RUNQ, "runq_cpu_remove: p=%p pri=%d %d rqh=%p", + p, p->p_pri.pri_level, pri, rqh); + TAILQ_REMOVE(rqh, p, p_proccpuq); + if (TAILQ_EMPTY(rqh)) { + CTR0(KTR_RUNQ, "runq_cpu_remove: empty"); + runq_clrbit(rqcpu, pri); + } + if ((p->p_sflag & PS_BOUND) != 0) { + CTR2(KTR_RUNQ, "runq_cpu_remove: bound p=%p cpu=%d", + p, p->p_rqcpu); + return; + } rqh = &rq->rq_queues[pri]; CTR4(KTR_RUNQ, "runq_remove: p=%p pri=%d %d rqh=%p", p, p->p_pri.pri_level, pri, rqh); Index: sys/pioctl.h =================================================================== RCS file: /home/ncvs/src/sys/sys/pioctl.h,v retrieving revision 1.8 diff -u -r1.8 pioctl.h --- sys/pioctl.h 1999/08/28 00:51:55 1.8 +++ sys/pioctl.h 2001/07/01 16:48:51 @@ -58,6 +58,7 @@ /* Get proc status */ # define PIOCSTATUS _IOR('p', 6, struct procfs_status) # define PIOCGFL _IOR('p', 7, unsigned int) /* Get flags */ +# define PIOCBIND _IOC(IOC_IN, 'p', 8, 0) /* Bind cpu */ # define S_EXEC 0x00000001 /* stop-on-exec */ # define S_SIG 0x00000002 /* stop-on-signal */ Index: sys/proc.h =================================================================== RCS file: /home/ncvs/src/sys/sys/proc.h,v retrieving revision 1.166 diff -u -r1.166 proc.h --- sys/proc.h 2001/06/11 23:00:35 1.166 +++ sys/proc.h 2001/07/01 16:48:51 @@ -152,6 +152,7 @@ struct proc { TAILQ_ENTRY(proc) 
p_procq; /* (j) Run/mutex queue. */ + TAILQ_ENTRY(proc) p_proccpuq; /* (j) Run/mutex queue (per-cpu). */ TAILQ_ENTRY(proc) p_slpq; /* (j) Sleep queue. */ LIST_ENTRY(proc) p_list; /* (d) List of all processes. */ @@ -218,6 +219,7 @@ char p_lock; /* (c) Process lock (prevent swap) count. */ u_char p_oncpu; /* (j) Which cpu we are on. */ u_char p_lastcpu; /* (j) Last cpu we were on. */ + u_char p_rqcpu; /* (j) Cpu run queue we are on. */ char p_rqindex; /* (j) Run queue index. */ short p_locks; /* (*) DEBUG: lockmgr count of held locks */ @@ -329,6 +331,7 @@ #define PS_SWAPPING 0x00200 /* Process is being swapped. */ #define PS_ASTPENDING 0x00400 /* Process has a pending ast. */ #define PS_NEEDRESCHED 0x00800 /* Process needs to yield. */ +#define PS_BOUND 0x01000 /* Process is bound to a cpu */ #define P_MAGIC 0xbeefface Index: sys/runq.h =================================================================== RCS file: /home/ncvs/src/sys/sys/runq.h,v retrieving revision 1.1 diff -u -r1.1 runq.h --- sys/runq.h 2001/02/12 00:20:07 1.1 +++ sys/runq.h 2001/07/01 16:48:51 @@ -75,6 +75,7 @@ int runq_check(struct runq *); struct proc *runq_choose(struct runq *); void runq_init(struct runq *); +void runqcpu_init(struct runq **); void runq_remove(struct runq *, struct proc *); #endif --2oS5YaxWCcQjTEyO Content-Type: text/plain; charset=us-ascii Content-Disposition: attachment; filename="bindcpu.c" /* * Copyright 1997 Sean Eric Fagan * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. 
All advertising materials mentioning features or use of this software * must display the following acknowledgement: * This product includes software developed by Sean Eric Fagan * 4. Neither the name of the author may be used to endorse or promote * products derived from this software without specific prior written * permission. * * THIS SOFTWARE IS PROVIDED BY THE AUTHOR AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHOR OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * */ #ifndef lint static const char rcsid[] = "$FreeBSD: src/usr.sbin/procctl/procctl.c,v 1.6 2000/02/21 10:22:39 ru Exp $"; #endif /* not lint */ /* * procctl -- clear the event mask, and continue, any specified processes. * This is largely an example of how to use the procfs interface; however, * for now, it is also sometimes necessary, as a stopped process will not * otherwise continue. (This will be fixed in a later version of the * procfs code, almost certainly; however, this program will still be useful * for some annoying circumstances.) 
*/ #include #include #include #include #include #include #include #include #include #include int main(int ac, char **av) { int fd; int i, cpu; char buf[32]; snprintf(buf, sizeof(buf), "/proc/%s/mem", av[1]); fd = open(buf, O_RDWR); cpu = atoi(av[2]); if (fd == -1) { warn("cannot open pid %s", av[1]); exit(1); } fprintf(stderr, "binding process %s to cpu %d\n", av[1], cpu); if (ioctl(fd, PIOCBIND, cpu) == -1) { warn("cannot bind process %s to cpu %d", av[1], cpu); exit(1); } close(fd); for (;;) ; return 0; } --2oS5YaxWCcQjTEyO-- To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-smp" in the body of the message From owner-freebsd-smp Mon Jul 2 7:36:45 2001 Delivered-To: freebsd-smp@freebsd.org Received: from peorth.iteration.net (peorth.iteration.net [208.190.180.178]) by hub.freebsd.org (Postfix) with ESMTP id 1D32A37B403 for ; Mon, 2 Jul 2001 07:36:39 -0700 (PDT) (envelope-from keichii@iteration.net) Received: by peorth.iteration.net (Postfix, from userid 1001) id 6F56159229; Mon, 2 Jul 2001 09:36:38 -0500 (CDT) Date: Mon, 2 Jul 2001 09:36:38 -0500 From: "Michael C . Wu" To: Alfred Perlstein Cc: smp@freebsd.org Subject: Re: per cpu runqueues, cpu affinity and cpu binding. Message-ID: <20010702093638.B96996@peorth.iteration.net> Reply-To: "Michael C . Wu" References: <20010702003213.I84523@sneakerz.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20010702003213.I84523@sneakerz.org>; from bright@sneakerz.org on Mon, Jul 02, 2001 at 12:32:13AM -0500 X-PGP-Fingerprint: 5025 F691 F943 8128 48A8 5025 77CE 29C5 8FA1 2E20 X-PGP-Key-ID: 0x8FA12E20 Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org Hi Alfred, First of all, we have two different types of processor affinity. 1. user specified CPU attachment, as you have implemented. 2. 
system-wide transparent processor affinity, transparent
   to all users, for which I see some work below.

In SMPng, IMHO, if we can do (2) well, a lot of the problems in performance can be solved.

Another problem is the widely varied applications that we have. For example, on a system with many many PCI devices, (2)'s implementation will be very different from a system that is intended to run an Oracle database or an HTTP server.

I don't think doing per-thread affinity is a good idea, because we want to keep threads lightweight.

You may want to take a look at this url about processor affinity: :)
http://www.isi.edu/lsam/tools/autosearch/load_balancing/19970804.html

On Mon, Jul 02, 2001 at 12:32:13AM -0500, Alfred Perlstein scribbled:
| ) The cpu affinity seems to actually buy performance, I've seen
| seconds taken off user/sys time when doing kernel compiles with
| this. Of course if people were to provide their own micro-benchmarks
| it would assist in determining the utility of this work.
|
| ) The binding is not very flexible. You can only bind to one cpu,
| not a group of cpus, nor can you prohibit a process from running
| on any particular cpu. Suggestions would be appreciated.

How many CPUs do we want to scale to? Or how many can we? Binding to more than 3 or 4 CPUs defeats the whole purpose of affinity unless we have a mega >32 CPU machine.

We want to hit the L2 and L3 cache, and hopefully the L1. On an AMD, PPC, or Alpha, the L1 is sufficiently big to possibly retain some cache on the previous proc. On IA-64/Pentium III/Pentium IV, the L1 is so small that worrying about the L1 makes no sense. Hence I suggest worrying about the affinity at the L2/L3 level, keeping the L1 slightly in mind.

Doug Rabson and I had several lengthy conversations regarding this. Perhaps he and the others can provide some input too.

| ) It somewhat butchers the nice functional interface that Jake did
| because it accesses a global, namely the per-cpu queues are a
| global. I plan on fixing this.
| | ) Input on how affinity/binding could be improved (along with code | examples) would be appreciated. Please don't say "I would do it | this way" unless your mail happens to contain an algorithm that | clearly maps to some code. :) Lots are available. Please see the URL above. When I get back to Austin and get settled, I will search for one that has worked well, since I have researched the topic in April for another reason. | The current way it is implemented is that for unbound processes | there is a double linkage, basically an unbound process will be on | both the cpu it last ran on and the global queue. A certain weight | is assigned to tip the scales in favor of running a process that's | last ran on a particular cpu, basically 4 * RQ_PPQ (see the mod to Is there a special reason for choosing 4 * RQ_PPQ? | runq_choose()), this could be adjusted in order to give either | higher priority processes a boost, or a process that last ran on | the cpu pulling it off the runqueue a boost. | | Bound processes only exist on the per-cpu queue that they are bound | to. | | What I'd actually prefer is no global queue, when schedcpu() is | called it would balance out the processes amongst the per-cpu | queues, or if a particular cpu realized it was stuck with a lot of | high or low priority processes while another cpu is occupied with | the opposite it would attempt to migrate or steal depending on the | type of imbalance going on. Suggestions on how to do this would | also be appreciated. :) An actual empirical measurement is required in this case. When can we justify the cache performance loss to switch to another CPU? In addition, once this process is switched to another CPU, we want to keep it there. 
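
The selection rule described in the quoted paragraph (prefer the per-cpu queue unless the global queue holds something better by more than 4 * RQ_PPQ) can be sketched in isolation as below. This is a minimal userland model, not the patch itself: choose_source(), enum rq_source, AFFINITY_BIAS and PRI_NONE are hypothetical names, RQ_PPQ is assumed to be 4, and lower numbers are assumed to mean higher priority.

```c
#include <assert.h>

#define RQ_PPQ        4              /* priorities per queue (assumed value) */
#define AFFINITY_BIAS (4 * RQ_PPQ)   /* the weight quoted in the thread */
#define PRI_NONE      (-1)           /* marker for "queue is empty" */

/* Hypothetical result type: where the next process should come from. */
enum rq_source { RQ_IDLE, RQ_GLOBAL, RQ_PERCPU };

/*
 * Decide which queue to pull from, given the best index found on the
 * global queue and on this cpu's queue (PRI_NONE if a queue is empty).
 * The per-cpu entry wins as long as it is no more than AFFINITY_BIAS
 * worse than the best global entry.
 */
static enum rq_source
choose_source(int pri_global, int pri_cpu)
{
	assert(pri_global >= PRI_NONE && pri_cpu >= PRI_NONE);

	if (pri_cpu != PRI_NONE &&
	    (pri_global == PRI_NONE || pri_cpu <= pri_global + AFFINITY_BIAS))
		return (RQ_PERCPU);
	if (pri_global != PRI_NONE)
		return (RQ_GLOBAL);
	return (RQ_IDLE);	/* nothing runnable: run the idle process */
}
```

Under these assumptions a global entry at 10 loses to a per-cpu entry at 26 but wins against one at 27, which makes it easier to reason about how large the bias should be.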
| The attached bindcpu.c program will need sys/pioctl.h installed to | compile, once compiled and the kernel is rebuilt (don't forget | modules as the size of proc has changed) you can use it to bind | processes like so: | | ./bindcpu 1 # bind curproc/pid to cpu 1 | ./bindcpu -1 # unbind This interface may not be the best to do. We can figure this out later. | Index: fs/procfs/procfs_vnops.c | =================================================================== | RCS file: /home/ncvs/src/sys/fs/procfs/procfs_vnops.c,v | retrieving revision 1.98 | diff -u -r1.98 procfs_vnops.c | --- fs/procfs/procfs_vnops.c 2001/05/25 16:59:04 1.98 | +++ fs/procfs/procfs_vnops.c 2001/07/01 16:48:51 | + | + if ((p->p_sflag & PS_BOUND) == 0) { | + cpu = p->p_lastcpu; | + if (cpu < 0 || cpu >= mp_ncpus) | + cpu = PCPU_GET(cpuid); | + p->p_rqcpu = cpu; | + runq_setbit(rq, pri); | + rqh = &rq->rq_queues[pri]; | + CTR4(KTR_RUNQ, "runq_add: p=%p pri=%d %d rqh=%p", | + p, p->p_pri.pri_level, pri, rqh); | + TAILQ_INSERT_TAIL(rqh, p, p_procq); | + } else { | + CTR2(KTR_RUNQ, "runq_add: proc %p bound to cpu %d", | + p, (int)p->p_rqcpu); | + cpu = p->p_rqcpu; | + } I recall a better algorithm in the almighty TAOCP. Will look it up when I get back. | + cpu = PCPU_GET(cpuid); | + pricpu = runq_findbit(&runqcpu[cpu]); | + pri = runq_findbit(rq); | + CTR2(KTR_RUNQ, "runq_choose: pri=%d cpupri=%d", pri, pricpu); | + if (pricpu != -1 && (pricpu <= pri + 4 * RQ_PPQ || pri == -1)) { | + pri = pricpu; | + rqh = &runqcpu[cpu].rq_queues[pri]; | + } else if (pri != -1) { | + rqh = &rq->rq_queues[pri]; | + } else { | + CTR1(KTR_RUNQ, "runq_choose: idleproc pri=%d", pri); | + return (PCPU_GET(idleproc)); | + } Do you intend the algorithm to be this simple? Or are you going to change it in the future? Thank you, Michael -- +-----------------------------------------------------------+ | keichii@iteration.net | keichii@freebsd.org | | http://iteration.net/~keichii | Yes, BSD is a conspiracy. 
| +-----------------------------------------------------------+

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message

From owner-freebsd-smp Mon Jul 2 9:13: 6 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from sneakerz.org (sneakerz.org [216.33.66.254]) by hub.freebsd.org (Postfix) with ESMTP id D35A337B405 for ; Mon, 2 Jul 2001 09:13:01 -0700 (PDT) (envelope-from bright@sneakerz.org)
Received: by sneakerz.org (Postfix, from userid 1092) id 5FA5A5D010; Mon, 2 Jul 2001 11:12:51 -0500 (CDT)
Date: Mon, 2 Jul 2001 11:12:51 -0500
From: Alfred Perlstein
To: "Michael C . Wu"
Cc: smp@freebsd.org
Subject: Re: per cpu runqueues, cpu affinity and cpu binding.
Message-ID: <20010702111251.L84523@sneakerz.org>
References: <20010702003213.I84523@sneakerz.org> <20010702093638.B96996@peorth.iteration.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2i
In-Reply-To: <20010702093638.B96996@peorth.iteration.net>; from keichii@iteration.net on Mon, Jul 02, 2001 at 09:36:38AM -0500
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

* Michael C . Wu [010702 09:36] wrote:
>
> I don't think doing per-thread affinity is a good idea. Because
> we want to keep threads lightweight.

That's true, one can also optimize certain operations such as TLB shootdown when all threads of a process are on a single cpu.

> You may want to take a look at this url about processor affinity: :)
> http://www.isi.edu/lsam/tools/autosearch/load_balancing/19970804.html

Sorry, there's a boatload of URLs on that page. I've chosen a couple of them and the info isn't very useful (old archived emails). Perhaps you can give a more direct link to this information?
> | The current way it is implemented is that for unbound processes
> | there is a double linkage, basically an unbound process will be on
> | both the cpu it last ran on and the global queue. A certain weight
> | is assigned to tip the scales in favor of running a process that's
> | last ran on a particular cpu, basically 4 * RQ_PPQ (see the mod to
>
> Is there a special reason for choosing 4 * RQ_PPQ?

Yes, I thought it was a good value. :)

> | runq_choose()), this could be adjusted in order to give either
> | higher priority processes a boost, or a process that last ran on
> | the cpu pulling it off the runqueue a boost.
> |
> | Bound processes only exist on the per-cpu queue that they are bound
> | to.
> |
> | What I'd actually prefer is no global queue, when schedcpu() is
> | called it would balance out the processes amongst the per-cpu
> | queues, or if a particular cpu realized it was stuck with a lot of
> | high or low priority processes while another cpu is occupied with
> | the opposite it would attempt to migrate or steal depending on the
> | type of imbalance going on. Suggestions on how to do this would
> | also be appreciated. :)
>
> An actual empirical measurement is required in this case.
> When can we justify the cache performance loss to switch to another
> CPU? In addition, once this process is switched to another CPU,
> we want to keep it there.

That's the intention of the redistribution code: if it only happens every N times schedcpu() is called, then for the time between calls we will have hard affinity and possibly lose that once, twice, HZ times a second? Anyhow, I really would like to hear about stuff that maps to code for this one.
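
One way the periodic redistribution discussed above might map to code is sketched here. It is only an illustration under stated assumptions: struct cpu_load, pick_migration(), rebalance_tick() and REBALANCE_PERIOD are hypothetical names, load is modelled as a bare count of unbound runnable processes per cpu, and nothing here is taken from the patch or from schedcpu() itself.

```c
#include <assert.h>
#include <stddef.h>

#define REBALANCE_PERIOD 8	/* hypothetical: rebalance on every 8th tick */

/* Hypothetical per-cpu bookkeeping: unbound runnable processes queued. */
struct cpu_load {
	int runnable;
};

/*
 * Find the busiest and least busy cpu.  Returns 1 and fills in from/to
 * if moving one process would reduce the imbalance, 0 otherwise.
 */
static int
pick_migration(const struct cpu_load load[], size_t ncpu,
    size_t *from, size_t *to)
{
	size_t i, busiest = 0, idlest = 0;

	assert(ncpu > 0);
	for (i = 1; i < ncpu; i++) {
		if (load[i].runnable > load[busiest].runnable)
			busiest = i;
		if (load[i].runnable < load[idlest].runnable)
			idlest = i;
	}
	/* A gap of one cannot be improved by moving a process. */
	if (load[busiest].runnable - load[idlest].runnable < 2)
		return (0);
	*from = busiest;
	*to = idlest;
	return (1);
}

/*
 * Called once per schedcpu()-style tick.  Moves at most one process
 * every REBALANCE_PERIOD ticks; returns 1 if a process was moved.
 */
static int
rebalance_tick(struct cpu_load load[], size_t ncpu, unsigned *tick)
{
	size_t from, to;

	if (++*tick % REBALANCE_PERIOD != 0)
		return (0);	/* keep hard affinity between periods */
	if (!pick_migration(load, ncpu, &from, &to))
		return (0);
	load[from].runnable--;
	load[to].runnable++;
	return (1);
}
```

Between rebalance ticks a process stays where it is (hard affinity); at most one process moves per period, and only when the gap between the busiest and least busy queue is two or more, since moving a process across a gap of one just swaps the imbalance. Whether the cache loss is worth it at any given period is exactly the empirical question raised above.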
> | The attached bindcpu.c program will need sys/pioctl.h installed to > | compile, once compiled and the kernel is rebuilt (don't forget > | modules as the size of proc has changed) you can use it to bind > | processes like so: > | > | ./bindcpu 1 # bind curproc/pid to cpu 1 > | ./bindcpu -1 # unbind > > This interface may not be the best to do. We can figure this out later. I think it's sorta cute, Mike Smith suggested something along this line. > | Index: fs/procfs/procfs_vnops.c > | =================================================================== > | RCS file: /home/ncvs/src/sys/fs/procfs/procfs_vnops.c,v > | retrieving revision 1.98 > | diff -u -r1.98 procfs_vnops.c > | --- fs/procfs/procfs_vnops.c 2001/05/25 16:59:04 1.98 > | +++ fs/procfs/procfs_vnops.c 2001/07/01 16:48:51 > > | + > | + if ((p->p_sflag & PS_BOUND) == 0) { > | + cpu = p->p_lastcpu; > | + if (cpu < 0 || cpu >= mp_ncpus) > | + cpu = PCPU_GET(cpuid); > | + p->p_rqcpu = cpu; > | + runq_setbit(rq, pri); > | + rqh = &rq->rq_queues[pri]; > | + CTR4(KTR_RUNQ, "runq_add: p=%p pri=%d %d rqh=%p", > | + p, p->p_pri.pri_level, pri, rqh); > | + TAILQ_INSERT_TAIL(rqh, p, p_procq); > | + } else { > | + CTR2(KTR_RUNQ, "runq_add: proc %p bound to cpu %d", > | + p, (int)p->p_rqcpu); > | + cpu = p->p_rqcpu; > | + } > > I recall a better algorithm in the almighty TAOCP. Will look > it up when I get back. > > | + cpu = PCPU_GET(cpuid); > | + pricpu = runq_findbit(&runqcpu[cpu]); > | + pri = runq_findbit(rq); > | + CTR2(KTR_RUNQ, "runq_choose: pri=%d cpupri=%d", pri, pricpu); > | + if (pricpu != -1 && (pricpu <= pri + 4 * RQ_PPQ || pri == -1)) { > | + pri = pricpu; > | + rqh = &runqcpu[cpu].rq_queues[pri]; > | + } else if (pri != -1) { > | + rqh = &rq->rq_queues[pri]; > | + } else { > | + CTR1(KTR_RUNQ, "runq_choose: idleproc pri=%d", pri); > | + return (PCPU_GET(idleproc)); > | + } > > Do you intend the algorithm to be this simple? Or are you going to > change it in the future? Ah! 
This is important. I would like it to be more complex in the future if it buys performance, but you should know that the KTR stuff, when compiled in, causes a significant performance degradation, and the KTR stuff is (or should be) pretty fast. Basically, we don't have much room here for a very complex algorithm, which is why I wanted to move it to schedcpu().

--
-Alfred Perlstein [alfred@freebsd.org]
Ok, who wrote this damn function called '??'?
And why do my programs keep crashing in it?

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message

From owner-freebsd-smp Mon Jul 2 9:43: 7 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from a.mx.everquick.net (a.mx.everquick.net [216.89.137.3]) by hub.freebsd.org (Postfix) with ESMTP id 2843F37B401 for ; Mon, 2 Jul 2001 09:43:02 -0700 (PDT) (envelope-from eddy+public+spam@noc.everquick.net)
Received: from localhost (eddy@localhost) by a.mx.everquick.net (8.10.2/8.10.2) with ESMTP id f62Ggqc14662; Mon, 2 Jul 2001 16:42:52 GMT
X-EverQuick-No-Abuse: Report any e-mail abuse to
Date: Mon, 2 Jul 2001 16:42:51 +0000 (GMT)
From: "E.B. Dreger"
To: "Michael C . Wu"
Cc: Alfred Perlstein , smp@FreeBSD.ORG
Subject: Re: per cpu runqueues, cpu affinity and cpu binding.
In-Reply-To: <20010702093638.B96996@peorth.iteration.net>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

(thoughts from the sidelines)

> Date: Mon, 2 Jul 2001 09:36:38 -0500
> From: Michael C . Wu

> First of all, we have two different types of processor affinity.
> 1. user specified CPU attachment, as you have implemented.
> 2. system-wide transparent processor affinity, transparent
> to all users, which I see some work below.
> > In SMPng, IMHO, if we can do (2) well, a lot of the problems > in performance can be solved. Not just keeping a given process on the same CPU... but what about a "process type"? i.e., if different processes have the same ELF header, run them _all_ on the CPU _unless_ it leaves another CPU excessively idle. Why waste [code] cache on multiple processors when you can keep things on one? > Another problem is the widely varied application that we have. > For example, on a system with many many PCI devices, (2)'s implementation > will be very different from a system that is intended to run > an Oracle database or a HTTP server. Could you please elaborate? > I don't think doing per-thread affinity is a good idea. Because > we want to keep threads lightweight. !!! > You may want to take a look at this url about processor affinity: :) > http://www.isi.edu/lsam/tools/autosearch/load_balancing/19970804.html So many of those links are 404. :-( > An actual empirical measurement is required in this case. > When can we justify the cache performance loss to switch to another > CPU? In addition, once this process is switched to another CPU, > we want to keep it there. Unless two processes are running on CPU #1, and CPU #2 becomes idle. Then switching a process to CPU #2 makes sense... unless the process getting switched is "close" to completion. I'll probably get flamed for suggesting something so ugly, but should we assume that non-daemon processes are short-running, and be more resistant to switching CPUs on those? Eddy --------------------------------------------------------------------------- Brotsman & Dreger, Inc. EverQuick Internet Division Phone: +1 (316) 794-8922 Wichita/(Inter)national Phone: +1 (785) 865-5885 Lawrence --------------------------------------------------------------------------- Date: Mon, 21 May 2001 11:23:58 +0000 (GMT) From: A Trap To: blacklist@brics.com Subject: Please ignore this portion of my mail signature. 
These last few lines are a trap for address-harvesting spambots. Do NOT send mail to , or you are likely to be blocked. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-smp" in the body of the message From owner-freebsd-smp Mon Jul 2 9:50:48 2001 Delivered-To: freebsd-smp@freebsd.org Received: from peorth.iteration.net (peorth.iteration.net [208.190.180.178]) by hub.freebsd.org (Postfix) with ESMTP id D196E37B403 for ; Mon, 2 Jul 2001 09:50:44 -0700 (PDT) (envelope-from keichii@iteration.net) Received: by peorth.iteration.net (Postfix, from userid 1001) id 58F6A59229; Mon, 2 Jul 2001 11:50:44 -0500 (CDT) Date: Mon, 2 Jul 2001 11:50:44 -0500 From: "Michael C . Wu" To: "E.B. Dreger" Cc: Alfred Perlstein , smp@FreeBSD.ORG Subject: Re: per cpu runqueues, cpu affinity and cpu binding. Message-ID: <20010702115044.C99436@peorth.iteration.net> Reply-To: "Michael C . Wu" References: <20010702093638.B96996@peorth.iteration.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: ; from eddy+public+spam@noc.everquick.net on Mon, Jul 02, 2001 at 04:42:51PM +0000 X-PGP-Fingerprint: 5025 F691 F943 8128 48A8 5025 77CE 29C5 8FA1 2E20 X-PGP-Key-ID: 0x8FA12E20 Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org On Mon, Jul 02, 2001 at 04:42:51PM +0000, E.B. Dreger scribbled: | (thoughts from the sidelines) | | > Date: Mon, 2 Jul 2001 09:36:38 -0500 | > From: Michael C . Wu | | > First of all, we have two different types of processor affinity. | > 1. user specified CPU attachment, as you have implemented. | > 2. system-wide transparent processor affinity, transparent | > to all users, which I see some work below. | > | > In SMPng, IMHO, if we can do (2) well, a lot of the problems | > in performance can be solved. 
|
| Not just keeping a given process on the same CPU... but what about a
| "process type"? i.e., if different processes have the same ELF header,
| run them _all_ on the CPU _unless_ it leaves another CPU excessively idle.
|
| Why waste [code] cache on multiple processors when you can keep things on
| one?

Because it is very difficult to worry about these things. And the performance gain would probably be less than the overhead of comparing the headers.

| > Another problem is the widely varied application that we have.
| > For example, on a system with many many PCI devices, (2)'s implementation
| > will be very different from a system that is intended to run
| > an Oracle database or a HTTP server.
|
| Could you please elaborate?

Different situations require completely different things. Sometimes a router will have many interrupts for ether device management. And sometimes we have single-purpose servers that only do one thing.

| > I don't think doing per-thread affinity is a good idea. Because
| > we want to keep threads lightweight.
|
| !!!

Please elaborate. I don't understand what three exclamation marks are supposed to mean.

| > You may want to take a look at this url about processor affinity: :)
| > http://www.isi.edu/lsam/tools/autosearch/load_balancing/19970804.html
|
| So many of those links are 404. :-(
|
| > An actual empirical measurement is required in this case.
| > When can we justify the cache performance loss to switch to another
| > CPU? In addition, once this process is switched to another CPU,
| > we want to keep it there.
|
| Unless two processes are running on CPU #1, and CPU #2 becomes idle.
| Then switching a process to CPU #2 makes sense... unless the process
| getting switched is "close" to completion.

Please read my post again; I think I explained the idea that L1 will be busted very quickly.
-- +-----------------------------------------------------------+ | keichii@iteration.net | keichii@freebsd.org | | http://iteration.net/~keichii | Yes, BSD is a conspiracy. | +-----------------------------------------------------------+ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-smp" in the body of the message From owner-freebsd-smp Mon Jul 2 10:13:54 2001 Delivered-To: freebsd-smp@freebsd.org Received: from a.mx.everquick.net (a.mx.everquick.net [216.89.137.3]) by hub.freebsd.org (Postfix) with ESMTP id 5480337B408 for ; Mon, 2 Jul 2001 10:13:50 -0700 (PDT) (envelope-from eddy+public+spam@noc.everquick.net) Received: from localhost (eddy@localhost) by a.mx.everquick.net (8.10.2/8.10.2) with ESMTP id f62HDmG15128; Mon, 2 Jul 2001 17:13:48 GMT X-EverQuick-No-Abuse: Report any e-mail abuse to Date: Mon, 2 Jul 2001 17:13:47 +0000 (GMT) From: "E.B. Dreger" To: "Michael C . Wu" Cc: Alfred Perlstein , smp@FreeBSD.ORG Subject: Re: per cpu runqueues, cpu affinity and cpu binding. In-Reply-To: <20010702115044.C99436@peorth.iteration.net> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org > Date: Mon, 2 Jul 2001 11:50:44 -0500 > From: Michael C . Wu > | Not just keeping a given process on the same CPU... but what about a > | "process type"? i.e., if different processes have the same ELF > | header, run them _all_ on the CPU _unless_ it leaves another CPU > | excessively idle. > | > | Why waste [code] cache on multiple processors when you can keep things > | on one? > > Because it is very difficult to worry about these things. And the > performance gain might probably be less than the overhead of comparing > the headers. ELF headers was just one example; I'd have to look at the format to get a more specific idea. 
There are probably other ways to compute a quick, "unique enough" hash. > Different situations require completely different things. > Sometimes a router will have many interrupts for ether device > management. And sometimes we have single purpose servers that only does > one thing. Of course. But the principles are more or less the same... we have multiple processes that we need to distribute on multiple CPUs. What changes is how hard we should resist switching. > | > I don't think doing per-thread affinity is a good idea. Because > | > we want to keep threads lightweight. > | > | !!! > > Please elaborate. I don't understand what three exclamation marks > are supposed to mean. Definitely want lightweight threads. > | Unless two processes are running on CPU #1, and CPU #2 becomes idle. > | Then switching a process to CPU #2 makes sense... unless the process > | getting switched is "close" to completion. > > Please read my post again, I think I explained the idea that > L1 will be busted very quickly. Yes, it will. My [oversimplified as it was] point was that there are times where it's better to wipe out even the L2 cache than it is to have an underutilized processor. Eddy --------------------------------------------------------------------------- Brotsman & Dreger, Inc. EverQuick Internet Division Phone: +1 (316) 794-8922 Wichita/(Inter)national Phone: +1 (785) 865-5885 Lawrence --------------------------------------------------------------------------- Date: Mon, 21 May 2001 11:23:58 +0000 (GMT) From: A Trap To: blacklist@brics.com Subject: Please ignore this portion of my mail signature. These last few lines are a trap for address-harvesting spambots. Do NOT send mail to , or you are likely to be blocked. 
To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message

From owner-freebsd-smp Mon Jul 2 11:38:47 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from InterJet.elischer.org (c421509-a.pinol1.sfba.home.com [24.7.86.9]) by hub.freebsd.org (Postfix) with ESMTP id CAA3537B403 for ; Mon, 2 Jul 2001 11:38:43 -0700 (PDT) (envelope-from julian@elischer.org)
Received: from InterJet.elischer.org (InterJet.elischer.org [192.168.1.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id NAA13383; Mon, 2 Jul 2001 13:19:06 -0700 (PDT)
Date: Mon, 2 Jul 2001 13:19:04 -0700 (PDT)
From: Julian Elischer
To: "Michael C . Wu"
Cc: "E.B. Dreger" , Alfred Perlstein , smp@FreeBSD.ORG
Subject: Re: per cpu runqueues, cpu affinity and cpu binding.
In-Reply-To: <20010702115044.C99436@peorth.iteration.net>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

On Mon, 2 Jul 2001, Michael C . Wu wrote:

> On Mon, Jul 02, 2001 at 04:42:51PM +0000, E.B. Dreger scribbled:
> |
> | Not just keeping a given process on the same CPU... but what about a
> | "process type"? i.e., if different processes have the same ELF header,
> | run them _all_ on the CPU _unless_ it leaves another CPU excessively idle.
> |
> | Why waste [code] cache on multiple processors when you can keep things on
> | one?
>
> Because it is very difficult to worry about these things. And the performance
> gain might probably be less than the overhead of comparing the headers.

One could note a preferred processor in the vnode of the executable. exec() could take note of this and switch processors if the load balancing algorithm showed that the new processor would not become overly stressed at the idea. (do all /bin/sh run on the same processor?)
or maybe the vnode can note which processors it is already on (more than one).

>
> | > I don't think doing per-thread affinity is a good idea. Because
> | > we want to keep threads lightweight.
> |
> | !!!
>
> Please elaborate. I don't understand what three exclamation marks
> are supposed to mean.

At USENIX we decided to proceed with the KSE work.

I have already re-implemented the proc-splitting patches from January and have split the proc structure into parts to support threads. In this case the processor affinity stuff that Alfred has done is already on a per-thread (per-KSE) basis. Individual threads may migrate between KSEs, but if the program acts to implement KSEs (thread carriers) on multiple processors then they will try to STAY on particular processors.

As a side issue I plan on NOT ALLOWING multiple KSEs (thread carriers?) from the same thread group in the same process to be on the same processor. SO load balancing and processor affinity will not apply to the thread-carrying entities (KSEs). Of course the userland thread scheduler has the ultimate say as to which processor a thread is scheduled on.

>

julian

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message

From owner-freebsd-smp Mon Jul 2 12:11:16 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from sneakerz.org (sneakerz.org [216.33.66.254]) by hub.freebsd.org (Postfix) with ESMTP id A21C537B406 for ; Mon, 2 Jul 2001 12:11:13 -0700 (PDT) (envelope-from bright@sneakerz.org)
Received: by sneakerz.org (Postfix, from userid 1092) id 5000E5D010; Mon, 2 Jul 2001 14:11:13 -0500 (CDT)
Date: Mon, 2 Jul 2001 14:11:13 -0500
From: Alfred Perlstein
To: Julian Elischer
Cc: "Michael C . Wu" , "E.B. Dreger" , smp@FreeBSD.ORG
Subject: Re: per cpu runqueues, cpu affinity and cpu binding.
Message-ID: <20010702141113.Q84523@sneakerz.org>
References: <20010702115044.C99436@peorth.iteration.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2i
In-Reply-To: ; from julian@elischer.org on Mon, Jul 02, 2001 at 01:19:04PM -0700
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

* Julian Elischer [010702 13:38] wrote:
>
> At USENIX we decided to proceed with the KSE work.

yay!

> I have already re-implemented the proc-splitting patches from January and
> have split the proc structure into parts to support threads. In this case
> the processor affinity stuff that alfred has done are already in a
> per-thread (per kse) basis. Individual threads may migrate between KSEs
> but if the program acts to implement KSEs (thread carriers) on multiple
> processors then they will try STAY on particular processors.
>
> As a side issue I plan on NOT ALLOWING multiple KSEs (thread carriers?)
> from the same thread group in the same process to be on the same
> processor. SO load balancing and processor affinity will not
> apply to the thread-carrying entities (KSEs). Of course the userland
> thread scheduler has the ultimate say as to which processor
> a thread is scheduled on.

Actually, this may cause some performance problems. When you have a shared address space you can avoid TLB shootdowns when a process's address space changes, and you also share the cache. Lastly, there's some rumor about a new CPU architecture that runs multiple threads on the same CPU at the same time. Just food for thought.

--
-Alfred Perlstein [alfred@freebsd.org]
Ok, who wrote this damn function called '??'?
And why do my programs keep crashing in it?
To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-smp" in the body of the message

From owner-freebsd-smp Mon Jul 2 12:58:48 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from InterJet.elischer.org (c421509-a.pinol1.sfba.home.com [24.7.86.9]) by hub.freebsd.org (Postfix) with ESMTP id 61A5037B406 for ; Mon, 2 Jul 2001 12:58:41 -0700 (PDT) (envelope-from julian@elischer.org)
Received: from InterJet.elischer.org (InterJet.elischer.org [192.168.1.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id OAA13624; Mon, 2 Jul 2001 14:24:18 -0700 (PDT)
Date: Mon, 2 Jul 2001 14:24:17 -0700 (PDT)
From: Julian Elischer
To: Alfred Perlstein
Cc: "Michael C . Wu" , "E.B. Dreger" , smp@FreeBSD.ORG
Subject: Re: per cpu runqueues, cpu affinity and cpu binding.
In-Reply-To: <20010702141113.Q84523@sneakerz.org>
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

On Mon, 2 Jul 2001, Alfred Perlstein wrote:

> * Julian Elischer [010702 13:38] wrote:
> >
> > At USENIX we decided to proceed with the KSE work.
>
> yay!
>
> >
> > As a side issue I plan on NOT ALLOWING multiple KSEs (thread carriers?)
> > from the same thread group in the same process to be on the same
> > processor. SO load balancing and processor affinity will not
> > apply to the thread-carrying entities (KSEs). Of course the userland
> > thread scheduler has the ultimate say as to which processor
> > a thread is scheduled on.
>
> Actually, this may cause some performance problems, when you have
> a shared address space you can avoid tlb shootdowns when a process's
> address space changes, you also share the cache, lastly there's
> some rumor about a new CPU architecture that runs multiple threads
> on the same CPU at the same time. Just food for thought.
If you choose to run 2 thread carriers (see other mail on nomenclature) (KSEs) then you have specifically asked for 2 processors' worth of concurrency, so we ASSUME you know what you are doing. If you want to run all the threads on a single processor to get better cache activity, then you shouldn't ASK to run on 2 (or more) processors. :-) From owner-freebsd-smp Mon Jul 2 13: 2:33 2001 Delivered-To: freebsd-smp@freebsd.org Received: from sneakerz.org (sneakerz.org [216.33.66.254]) by hub.freebsd.org (Postfix) with ESMTP id 11A3A37B405 for ; Mon, 2 Jul 2001 13:02:29 -0700 (PDT) (envelope-from bright@sneakerz.org) Received: by sneakerz.org (Postfix, from userid 1092) id 84CFE5D01F; Mon, 2 Jul 2001 15:02:28 -0500 (CDT) Date: Mon, 2 Jul 2001 15:02:28 -0500 From: Alfred Perlstein To: Julian Elischer Cc: "Michael C . Wu" , "E.B. Dreger" , smp@FreeBSD.ORG Subject: Re: per cpu runqueues, cpu affinity and cpu binding. Message-ID: <20010702150228.S84523@sneakerz.org> References: <20010702141113.Q84523@sneakerz.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2i In-Reply-To: ; from julian@elischer.org on Mon, Jul 02, 2001 at 02:24:17PM -0700 Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org * Julian Elischer [010702 14:58] wrote: > > > On Mon, 2 Jul 2001, Alfred Perlstein wrote: > > > * Julian Elischer [010702 13:38] wrote: > > > > > > At USENIX we decided to proceed with the KSE work. > > > > yay! > > > > > > > > As a side issue I plan on NOT ALLOWING multiple KSEs (thread carriers?) > > > from the same thread group in the same process to be on the same > > > processor. SO load balancing and processor affinity will not > > > apply to the thread-carrying entities (KSEs).
Of course the userland > > > thread scheduler has the ultimate say as to which processor > > > a thread is scheduled on. > > > > Actually, this may cause some performance problems, when you have > > a shared address space you can avoid tlb shootdowns when a process's > > address space changes, you also share the cache, lastly there's > > some rumor about a new CPU archetecture that runs multple threads > > on the same CPU at the same time. Just food for thought. > > If you select to run 2 thread carriers (see other mail on nomenclature)> > (KSEs) then you have specifically asked for 2 processors worth of > concurrency so we ASSUME you know what you are doing.. If you want to run > all the threads on a single processor to get better cache activity, then > you should't ASK to run on 2 (or more) processors. Agreed, however don't forget about the multiple thread execution units that may become available, meaning that as long as you share an address space you can run two (or more) threads in parrallel on a single processor. You wouldn't want to preclude us of taking advantage of that if it becomes available. -- -Alfred Perlstein [alfred@freebsd.org] Ok, who wrote this damn function called '??'? And why do my programs keep crashing in it? To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-smp" in the body of the message From owner-freebsd-smp Mon Jul 2 13: 8: 9 2001 Delivered-To: freebsd-smp@freebsd.org Received: from a.mx.everquick.net (a.mx.everquick.net [216.89.137.3]) by hub.freebsd.org (Postfix) with ESMTP id 0781637B401 for ; Mon, 2 Jul 2001 13:08:04 -0700 (PDT) (envelope-from eddy+public+spam@noc.everquick.net) Received: from localhost (eddy@localhost) by a.mx.everquick.net (8.10.2/8.10.2) with ESMTP id f62K7xJ18274; Mon, 2 Jul 2001 20:07:59 GMT X-EverQuick-No-Abuse: Report any e-mail abuse to Date: Mon, 2 Jul 2001 20:07:58 +0000 (GMT) From: "E.B. Dreger" To: Alfred Perlstein Cc: Julian Elischer , "Michael C . 
Wu" , smp@FreeBSD.ORG Subject: Re: per cpu runqueues, cpu affinity and cpu binding. In-Reply-To: <20010702141113.Q84523@sneakerz.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org > Date: Mon, 2 Jul 2001 14:11:13 -0500 > From: Alfred Perlstein > > As a side issue I plan on NOT ALLOWING multiple KSEs (thread > > carriers?) from the same thread group in the same process to be on the > > same processor. SO load balancing and processor affinity will not > > apply to the thread-carrying entities (KSEs). Of course the userland Why force things? Again, going back to affinity hinting... if the hint is a composite hash including a process-specified value, that would allow a process to say, "Hey, please run these on different CPUs". Likewise, a process could say, "Please run these on the same CPU" for different threads that share much code. > > thread scheduler has the ultimate say as to which processor Not sure that I like this. It would have to be runtime-tunable or modulo real number of processors. Then everything wants CPU #0... ick. > > a thread is scheduled on. > > Actually, this may cause some performance problems, when you have > a shared address space you can avoid tlb shootdowns when a process's > address space changes, you also share the cache, lastly there's Example: task #1 Main program, serving Web requests, processing mail, handling database queries, whatever. Needs something {en|de}crypted, so it initiates an AIO read for the {de|en}crypted data. task #2 Performs the {en|de}cryption, then sends process #1 a pointer to shared memory containing the result. In this case, one would want the ability to flag that the processes should run on different CPUs. Different critical paths with different code. 
With some of the puny L2 caches nowadays, a task switch on a single processor might wipe out L2... > some rumor about a new CPU archetecture that runs multple threads > on the same CPU at the same time. Just food for thought. You mean ia64's "explicit parallelism" (EPIC)? http://www.utc.edu/~jdumas/cs460/papers2000/itanium/Itanium.htm I just found this via Google, and have only skimmed it... i.e., I can't comment on the new instruction set, but thought that I'd throw that in. I _do_ know, however, that the P6 family could be much faster with better decode and execute units. (Anybody who has ever tuned assembler for the family knows what I mean...) Eddy --------------------------------------------------------------------------- Brotsman & Dreger, Inc. EverQuick Internet Division Phone: +1 (316) 794-8922 Wichita/(Inter)national Phone: +1 (785) 865-5885 Lawrence --------------------------------------------------------------------------- Date: Mon, 21 May 2001 11:23:58 +0000 (GMT) From: A Trap To: blacklist@brics.com Subject: Please ignore this portion of my mail signature. These last few lines are a trap for address-harvesting spambots. Do NOT send mail to , or you are likely to be blocked. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-smp" in the body of the message From owner-freebsd-smp Mon Jul 2 13: 9:18 2001 Delivered-To: freebsd-smp@freebsd.org Received: from klima.physik.uni-mainz.de (klima.Physik.Uni-Mainz.DE [134.93.180.162]) by hub.freebsd.org (Postfix) with ESMTP id 24FB837B406; Mon, 2 Jul 2001 13:09:08 -0700 (PDT) (envelope-from ohartman@klima.physik.uni-mainz.de) Received: from klima.Physik.Uni-Mainz.DE (Sturm@klima.Physik.Uni-Mainz.DE [134.93.180.162]) by klima.physik.uni-mainz.de (8.11.4/8.11.4) with ESMTP id f62K97U00663; Mon, 2 Jul 2001 22:09:07 +0200 (CEST) (envelope-from ohartman@klima.physik.uni-mainz.de) Date: Mon, 2 Jul 2001 22:09:07 +0200 (CEST) From: "Hartmann, O." 
To: Cc: Subject: FBSD 4.3-STABLE freezes on SMP boxes!!!!! Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org As I posted before, and as I have read in several earlier messages, FreeBSD 4.3-STABLE as cvsupped since Friday 29th June freezes up and hangs!!! This has happened three times since Friday. I read about a solution involving increasing MAXDSIZ="(512*1024*1024)" and DFLDSIZ="(512*1024*1024)" in the kernel up to the real installed amount of memory, but this does not solve the problem. I increased it on our main system up to 2048 and on another up to 1024, as this reflects the installed memory. The main NFS server locked up again while the other machine still works. I decreased this value on all machines to 512. I do not know what's going on but it is really serious! All machines with an SMP kernel ran stably before Friday, and with the last cvsup both SMP machines locked up in the night. Only pushing the reset button brought them back to life. Another phenomenon occurred at the same time, regarding the Linuxulator. We successfully use Lahey Fortran F95 for Linux here. Since today's cvsup, on one of the SMP machines it core dumps with signal 11 (jwd_fort). On the other SMP machine it runs cleanly. Only the above-mentioned kernel options have been changed ... -- MfG O.
Hartmann ohartman@klima.physik.uni-mainz.de ---------------------------------------------------------------- IT-Administration des Institut fuer Physik der Atmosphaere (IPA) ---------------------------------------------------------------- Johannes Gutenberg Universitaet Mainz Becherweg 21 55099 Mainz Tel: +496131/3924662 (Maschinenraum) Tel: +496131/3924144 FAX: +496131/3923532 To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-smp" in the body of the message From owner-freebsd-smp Mon Jul 2 13:18:54 2001 Delivered-To: freebsd-smp@freebsd.org Received: from InterJet.elischer.org (c421509-a.pinol1.sfba.home.com [24.7.86.9]) by hub.freebsd.org (Postfix) with ESMTP id 419A237B406 for ; Mon, 2 Jul 2001 13:18:47 -0700 (PDT) (envelope-from julian@elischer.org) Received: from InterJet.elischer.org (InterJet.elischer.org [192.168.1.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id OAA13760; Mon, 2 Jul 2001 14:51:53 -0700 (PDT) Date: Mon, 2 Jul 2001 14:51:52 -0700 (PDT) From: Julian Elischer To: Alfred Perlstein Cc: "Michael C . Wu" , "E.B. Dreger" , smp@FreeBSD.ORG Subject: Re: per cpu runqueues, cpu affinity and cpu binding. In-Reply-To: <20010702150228.S84523@sneakerz.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org On Mon, 2 Jul 2001, Alfred Perlstein wrote: > * Julian Elischer [010702 14:58] wrote: > > > > If you select to run 2 thread carriers (see other mail on nomenclature)> > > (KSEs) then you have specifically asked for 2 processors worth of > > concurrency so we ASSUME you know what you are doing.. If you want to run > > all the threads on a single processor to get better cache activity, then > > you should't ASK to run on 2 (or more) processors. 
> > Agreed, however don't forget about the multiple thread execution > units that may become available, meaning that as long as you share > an address space you can run two (or more) threads in parallel on > a single processor. You wouldn't want to preclude us from taking > advantage of that if it becomes available. If that architecture takes off (I have my doubts.. ALPHA was the only one trying that), then we can change the rules about only allowing one thread container per processor (and limit it to the number of thread execution units). From owner-freebsd-smp Mon Jul 2 13:33:54 2001 Delivered-To: freebsd-smp@freebsd.org Received: from grace.speakeasy.org (grace.speakeasy.org [216.254.0.2]) by hub.freebsd.org (Postfix) with SMTP id 94A3937B403 for ; Mon, 2 Jul 2001 13:33:51 -0700 (PDT) (envelope-from seanj@speakeasy.org) Received: (qmail 24960 invoked by uid 6969); 2 Jul 2001 20:32:26 -0000 Received: from localhost (sendmail-bs@127.0.0.1) by localhost with SMTP; 2 Jul 2001 20:32:26 -0000 Date: Mon, 2 Jul 2001 13:32:26 -0700 (PDT) From: seanj To: Julian Elischer Cc: Alfred Perlstein , "Michael C . Wu" , "E.B. Dreger" , "smp@FreeBSD.ORG" Subject: Re: per cpu runqueues, cpu affinity and cpu binding. In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org http://www.cs.washington.edu/research/smt/ http://www.cs.washington.edu/research/smt/papers/ieee_micro.pdf http://www.bearcave.com/software/java/comp_links.html Supposedly an upcoming Xeon processor will have SMT / ILP. Not to prognosticate, but I think having TEUs (thread execution units) will be a very good idea (tm). What about architectures where the CPUs might share the same L2/L3 cache?
Multichip modules or multiple on die cpus? IBM Power4? Two procs, one L2. http://www.eetimes.com/story/OEG19990804S0023 http://arstechnica.com/cpu/4q99/majc/majc-1.html This is very probably most likely sorta in our near future. Sean. On Mon, 2 Jul 2001, Julian Elischer wrote: > > > On Mon, 2 Jul 2001, Alfred Perlstein wrote: > > > * Julian Elischer [010702 14:58] wrote: > > > > > > If you select to run 2 thread carriers (see other mail on nomenclature)> > > > (KSEs) then you have specifically asked for 2 processors worth of > > > concurrency so we ASSUME you know what you are doing.. If you want to run > > > all the threads on a single processor to get better cache activity, then > > > you should't ASK to run on 2 (or more) processors. > > > > Agreed, however don't forget about the multiple thread execution > > units that may become available, meaning that as long as you share > > an address space you can run two (or more) threads in parrallel on > > a single processor. You wouldn't want to preclude us of taking > > advantage of that if it becomes available. > > If that architecture takes off (I have my doubts.. ALPHA was the only one > trying that), then we can change the rules about only allowing one thread > container per processor (and limit it to the number of thread execution > units). 
> > > > > To Unsubscribe: send mail to majordomo@FreeBSD.org > with "unsubscribe freebsd-smp" in the body of the message > To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-smp" in the body of the message From owner-freebsd-smp Mon Jul 2 13:38:26 2001 Delivered-To: freebsd-smp@freebsd.org Received: from mail.fpsn.net (mail.fpsn.net [63.224.69.57]) by hub.freebsd.org (Postfix) with ESMTP id A400B37B401; Mon, 2 Jul 2001 13:38:18 -0700 (PDT) (envelope-from cfaber@fpsn.net) Received: from fpsn.net (control.fpsn.net [63.224.69.60]) by mail.fpsn.net (8.9.3/8.9.3) with ESMTP id OAA49623; Mon, 2 Jul 2001 14:38:11 -0600 (MDT) (envelope-from cfaber@fpsn.net) Message-ID: <3B40DB91.C8DEFF92@fpsn.net> Date: Mon, 02 Jul 2001 14:37:37 -0600 From: Colin Faber Reply-To: cfaber@fpsn.net Organization: fpsn.net, Inc. X-Mailer: Mozilla 4.75 [en] (Windows NT 5.0; U) X-Accept-Language: en MIME-Version: 1.0 To: "Hartmann, O." Cc: freebsd-smp@FreeBSD.ORG, freebsd-stable@FreeBSD.ORG Subject: Re: FBSD 4.3-STABLE freezes on SMP boxes!!!!! References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org No problems here on my Compaq proliant 800 with dual 500MHz pentium III's "Hartmann, O." wrote: > > As I posted before and read in several messages before > FreeBSD 4.3-STABLE with cvsupdate since Friday 29th June freezes up > and hungs!!! > > This happened since Friday now for three times. > I read about a solution in increasing MAXDSIZ="(512*1024*1024)" > and DFLDSIZ="(512*1024*1024)" in the kernel up to the real installed > amount of memory, but this does not solve the problem. I increased > it on our main system up to 2048 and on another up to 1024 as this > reflects the installed memory. The main NFS server locked up again while the > other machine still works. 
I decreased this value on all machines to 512. > > I do not know what's going on but it is really serious! All machines with > SMP kernel run stable before Friday and with the last cvsupdate both SMP > machines locked up in the night. Only pushing the reset button brought > them back into live. > > Another phenomenon occured the same time regards the Linuxator. We use here > Lahey Fortran F95 for Linux successfuly. Since today's cvsupdate on one > of the SMP machines it core dumps with signal 11 (jwd_fort). On the other > SMP machine it runs clear. Only the above mentioned kernel options has been > changed ... > > -- > MfG > O. Hartmann > > ohartman@klima.physik.uni-mainz.de > ---------------------------------------------------------------- > IT-Administration des Institut fuer Physik der Atmosphaere (IPA) > ---------------------------------------------------------------- > Johannes Gutenberg Universitaet Mainz > Becherweg 21 > 55099 Mainz > > Tel: +496131/3924662 (Maschinenraum) > Tel: +496131/3924144 > FAX: +496131/3923532 > > To Unsubscribe: send mail to majordomo@FreeBSD.org > with "unsubscribe freebsd-smp" in the body of the message To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-smp" in the body of the message From owner-freebsd-smp Mon Jul 2 13:38:42 2001 Delivered-To: freebsd-smp@freebsd.org Received: from InterJet.elischer.org (c421509-a.pinol1.sfba.home.com [24.7.86.9]) by hub.freebsd.org (Postfix) with ESMTP id F3A5537B406 for ; Mon, 2 Jul 2001 13:38:35 -0700 (PDT) (envelope-from julian@elischer.org) Received: from InterJet.elischer.org (InterJet.elischer.org [192.168.1.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id PAA13853; Mon, 2 Jul 2001 15:03:21 -0700 (PDT) Date: Mon, 2 Jul 2001 15:03:20 -0700 (PDT) From: Julian Elischer To: "E.B. Dreger" Cc: Alfred Perlstein , "Michael C . Wu" , smp@FreeBSD.ORG Subject: Re: per cpu runqueues, cpu affinity and cpu binding. 
In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org On Mon, 2 Jul 2001, E.B. Dreger wrote: > > Date: Mon, 2 Jul 2001 14:11:13 -0500 > > From: Alfred Perlstein > > > > As a side issue I plan on NOT ALLOWING multiple KSEs (thread > > > carriers?) from the same thread group in the same process to be on the > > > same processor. SO load balancing and processor affinity will not > > > apply to the thread-carrying entities (KSEs). Of course the userland > > Why force things? Again, going back to affinity hinting... if the hint > is a composite hash including a process-specified value, that would allow > a process to say, "Hey, please run these on different CPUs". There is ABSOLUTELY NO GAIN in allowing it. It only makes things more difficult to track. They would steal each other's quantum, meaning that you would not get more CPU time, and since the system calls are all async in this system you don't get any more concurrency either. There are different mechanisms (struct #2) to allow you to compete in the scheduler with greater weight. > > Likewise, a process could say, "Please run these on the same CPU" for > different threads that share much code. Well then, ask the thread scheduler to run them on the same CPU (#3) > > > > thread scheduler has the ultimate say as to which processor > > Not sure that I like this. It would have to be runtime-tunable or modulo > real number of processors. Then everything wants CPU #0... ick. No, everything runs on VIRTUAL CPUs. The mapping between virtual and real CPUs is by default not fixed. Your CPU #0 may not be the same as MY CPU #0. > > > > a thread is scheduled on.
> > > > Actually, this may cause some performance problems, when you have > > a shared address space you can avoid tlb shootdowns when a process's > > address space changes, you also share the cache, lastly there's > > Example: > > task #1 Main program, serving Web requests, processing mail, handling > database queries, whatever. Needs something {en|de}crypted, so > it initiates an AIO read for the {de|en}crypted data. > > task #2 Performs the {en|de}cryption, then sends process #1 a pointer to > shared memory containing the result. > > In this case, one would want the ability to flag that the processes should > run on different CPUs. Different critical paths with different code. > With some of the puny L2 caches nowadays, a task switch on a single > processor might wipe out L2... 1/ All reads that block will be async (AIO by default). 2/ Which processor #2 runs on depends on a) where it ran last, and b) whether there is a free processor when it gets scheduled, and where it is. > > > some rumor about a new CPU architecture that runs multiple threads > > on the same CPU at the same time. Just food for thought. > > You mean ia64's "explicit parallelism" (EPIC)? No. "Multiple thread execution units": separate register files, shared ALUs, etc... Alpha was going to try to do it.. From owner-freebsd-smp Mon Jul 2 14: 9:45 2001 Delivered-To: freebsd-smp@freebsd.org Received: from mail.rpi.edu (mail.rpi.edu [128.113.22.40]) by hub.freebsd.org (Postfix) with ESMTP id 7B92337B407; Mon, 2 Jul 2001 14:09:36 -0700 (PDT) (envelope-from drosih@rpi.edu) Received: from [128.113.24.47] (gilead.acs.rpi.edu [128.113.24.47]) by mail.rpi.edu (8.11.3/8.11.3) with ESMTP id f62L9YY37664; Mon, 2 Jul 2001 17:09:34 -0400 Mime-Version: 1.0 X-Sender: drosih@mail.rpi.edu Message-Id: In-Reply-To: References: Date: Mon, 2 Jul 2001 17:09:32 -0400 To: "Hartmann, O."
From: Garance A Drosihn Subject: Re: FBSD 4.3-STABLE freezes on SMP boxes!!!!! Cc: Content-Type: text/plain; charset="us-ascii" ; format="flowed" Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org At 10:09 PM +0200 7/2/01, Hartmann, O. wrote: >As I posted before and read in several messages before >FreeBSD 4.3-STABLE with cvsupdate since Friday 29th June >freezes up and hungs!!! I think we need to collect more information before we talk as if it breaks all SMP boxes. I am sure that more than four people with SMP boxes have cvsup'ed to stable since last week. So, we need to find out the difference between the people with problems and the ones without. >I do not know what's going on but it is really serious! I agree it seems serious. I also agree that we don't really know what is going on yet. If we do not know what is going on, then we have to find out what is going on before presenting conclusions. I also think we only need to discuss this on one mailing list (-stable), instead of posting it to more and more mailing lists. -Stable is the right mailing list for this sort of problem, IMO. 
-- Garance Alistair Drosehn = gad@eclipse.acs.rpi.edu Senior Systems Programmer or gad@freebsd.org Rensselaer Polytechnic Institute or drosih@rpi.edu To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-smp" in the body of the message From owner-freebsd-smp Mon Jul 2 14:14:35 2001 Delivered-To: freebsd-smp@freebsd.org Received: from netbank.com.br (garrincha.netbank.com.br [200.203.199.88]) by hub.freebsd.org (Postfix) with ESMTP id 6071037B401 for ; Mon, 2 Jul 2001 14:14:30 -0700 (PDT) (envelope-from riel@conectiva.com.br) Received: from surriel.ddts.net (1-020.ctame701-1.telepar.net.br [200.181.137.20]) by netbank.com.br (Postfix) with ESMTP id BEA2A4680D; Mon, 2 Jul 2001 18:09:20 -0300 (BRST) Received: from localhost (sntraq@localhost [127.0.0.1]) by surriel.ddts.net (8.11.4/8.11.2) with ESMTP id f62LEJ425400; Mon, 2 Jul 2001 18:14:20 -0300 Date: Mon, 2 Jul 2001 18:14:19 -0300 (BRST) From: Rik van Riel X-X-Sender: To: "E.B. Dreger" Cc: Alfred Perlstein , Julian Elischer , "Michael C . Wu" , Subject: Re: per cpu runqueues, cpu affinity and cpu binding. In-Reply-To: Message-ID: X-spambait: aardvark@kernelnewbies.org X-spammeplease: aardvark@nl.linux.org MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org On Mon, 2 Jul 2001, E.B. Dreger wrote: > > Date: Mon, 2 Jul 2001 14:11:13 -0500 > > From: Alfred Perlstein > > > > As a side issue I plan on NOT ALLOWING multiple KSEs (thread > > > carriers?) from the same thread group in the same process to be on the > > > same processor. SO load balancing and processor affinity will not > > > apply to the thread-carrying entities (KSEs). Of course the userland > > Why force things? Again, going back to affinity hinting... IMHO affinity hinting should be just that. 
Anything more is likely to be a solution in search of a problem ;) [yes, there are a few special cases where it may help, but it would be a bit early in the SMPng project to start worrying about those when there are more serious issues to fix ... such as locks which are known to give contention ;)] regards, Rik -- Virtual memory is like a game you can't win; However, without VM there's truly nothing to lose... http://www.surriel.com/ http://distro.conectiva.com/ Send all your spam to aardvark@nl.linux.org (spam digging piggy) To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-smp" in the body of the message From owner-freebsd-smp Tue Jul 3 2:41: 4 2001 Delivered-To: freebsd-smp@freebsd.org Received: from swan.mail.pas.earthlink.net (swan.mail.pas.earthlink.net [207.217.120.123]) by hub.freebsd.org (Postfix) with ESMTP id 1DF1437B403; Tue, 3 Jul 2001 02:40:57 -0700 (PDT) (envelope-from tlambert2@mindspring.com) Received: from mindspring.com (dialup-209.247.139.34.Dial1.SanJose1.Level3.net [209.247.139.34]) by swan.mail.pas.earthlink.net (EL-8_9_3_3/8.9.3) with ESMTP id CAA13395; Tue, 3 Jul 2001 02:38:09 -0700 (PDT) Message-ID: <3B419265.A6A1316D@mindspring.com> Date: Tue, 03 Jul 2001 02:37:41 -0700 From: Terry Lambert Reply-To: tlambert2@mindspring.com X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: "E.B. Dreger" Cc: Bernd Walter , Peter Pentchev , Chris Costello , freebsd-smp@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG Subject: Re: libc_r locking... why? References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org "E.B. Dreger" wrote: > Any good references on MP standard? Is the lock prefix > the only way to force cache coherency? 
Cache coherency is managed by the MESI (modified, exclusive, shared, invalid) protocol, in hardware. The basic issue that the lock addresses is provision of a barrier instruction, so that two processes (the original one, and the result of the rfork) don't try to enter a critical section at the same time (for example, a race to lock an fd or muck with signals), and the data cache is forced to be invalidated, even if the data is already in the pipeline (that's what the barrier instruction buys you). Generally, you will use a LOCK CMPXCHG to implement MP safe mutexes in user space. If you look at the kernel SMP locking, you can probably just take that code and use it, unmodified. The reason you need to do this is that the locking in the libc_r which you are using as a basis is not MP safe: it won't prevent one processor and the other from causing a race condition in user space as a result of two or more processors being in user space in the same VM at the same time. You can download the multiprocessing section of the Pentium programmers guide from the Intel web site; it has all the information on the APIC and other guts that make SMP possible. You can also download the Intel Multiprocessing specification, version 1.4, from their web site: http://developer.intel.com/design/PentiumII/manuals/24319002.PDF http://developer.intel.com/design/PentiumII/manuals/24319202.pdf http://developer.intel.com/design/pro/datashts/24201606.pdf The last one is the MP Spec, version 1.4. Note: These are some hulking big files, so don't try to download them over a slow link, unless you are willing to wait a very long time (e.g. hours, for all of them). 
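A minimal sketch of the kind of MP-safe user-space mutex described above, assuming a C11 compiler that provides <stdatomic.h>. The names spin_mutex, spin_mutex_trylock and friends are hypothetical and are not taken from libc_r or from the FreeBSD kernel locking code. On x86, compilers typically lower the compare-exchange below to a LOCK CMPXCHG instruction; the acquire and release orderings stand in for the barrier behaviour discussed above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical illustration only; not the libc_r or kernel implementation. */
typedef struct {
    atomic_int locked;              /* 0 = free, 1 = held */
} spin_mutex;

static void spin_mutex_init(spin_mutex *m)
{
    atomic_init(&m->locked, 0);
}

/* One atomic compare-and-swap: succeeds only if the lock was free. */
static bool spin_mutex_trylock(spin_mutex *m)
{
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &m->locked, &expected, 1,
        memory_order_acquire,       /* on success: later accesses stay after the lock */
        memory_order_relaxed);      /* on failure: nothing was acquired */
}

static void spin_mutex_lock(spin_mutex *m)
{
    while (!spin_mutex_trylock(m)) {
        /* Wait with plain loads so the cache line is not hammered
         * with locked read-modify-write cycles while the lock is held. */
        while (atomic_load_explicit(&m->locked, memory_order_relaxed) != 0)
            ;
    }
}

static void spin_mutex_unlock(spin_mutex *m)
{
    /* Release ordering: writes made inside the critical section
     * become visible before the lock is seen as free. */
    atomic_store_explicit(&m->locked, 0, memory_order_release);
}
```

A caller would bracket each critical section with spin_mutex_lock() and spin_mutex_unlock() on a spin_mutex that lives in memory shared by the rfork'ed processes. A pure spinlock like this burns CPU while waiting, so it only suits very short critical sections.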
-- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-smp" in the body of the message From owner-freebsd-smp Tue Jul 3 3: 6:29 2001 Delivered-To: freebsd-smp@freebsd.org Received: from avocet.mail.pas.earthlink.net (avocet.mail.pas.earthlink.net [207.217.121.50]) by hub.freebsd.org (Postfix) with ESMTP id 303B837B401; Tue, 3 Jul 2001 03:06:19 -0700 (PDT) (envelope-from tlambert2@mindspring.com) Received: from mindspring.com (dialup-209.247.139.34.Dial1.SanJose1.Level3.net [209.247.139.34]) by avocet.mail.pas.earthlink.net (EL-8_9_3_3/8.9.3) with ESMTP id DAA12079; Tue, 3 Jul 2001 03:05:45 -0700 (PDT) Message-ID: <3B419910.BF346FB4@mindspring.com> Date: Tue, 03 Jul 2001 03:06:08 -0700 From: Terry Lambert Reply-To: tlambert2@mindspring.com X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Idea Receiver Cc: "E.B. Dreger" , Chris Costello , freebsd-smp@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG Subject: Re: libc_r locking... why? References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org Idea Receiver wrote: > On Fri, 29 Jun 2001, Terry Lambert wrote: > > If you "need" kernel threads, look at the Linux kernel > > threads in the ports collection (it's a kernel module > > that builds and installs as a package). You probably > > don't, since performance of kernel threads is really only > > about a 20% increment, if you implement them the SVR4 or > > Solaris (pre-2.7) or Linux way. It's probably better to > > implement with FreeBSD threads as they currently exist, > > and get massive SMP scalability when KSE's are brought > > into the source tree. > > > > just a quick question... > I konw KSE will brought in after SMPng. 
> but it will be really helpful to know when it will first appear > in the source tree? They went over the design at Usenix in Boston last week; there was a big FreeBSD BOF. I believe the design is now frozen. > or what other OS can help with SMP vs pthread problem? Solaris 8 & 9 have pretty good code in this area, but are limited on scaling due to a lot of bus contention; most SVR4 derivatives claim that 4 CPUs in one box are the point of diminishing returns. Big iron from Sun is actually semi-NUMA architecture, i.e. they have loosely coupled clusters of hardware, with only a small number of shared memory multiprocessors in the same contention domain, thus avoiding the scaling issue. A couple of good starting points for looking at NUMA are: http://citeseer.nj.nec.com/12857.html http://www.ibm.com/servers/eserver/xseries/numa/index.html The last one there is the former Sequent, which was bought out by IBM. Historically, Sequent has been at the forefront of SMP systems; they were able to scale to 32 processors; they had a special bus, and ran without Intel APICs (you can only have 32 APICs, since that's all the ID space can handle, and at least one of them has to be an I/O APIC, which means that using Intel's approach, you are maxed out at 31 processors and one I/O APIC; usually 30/2, actually). Sequent had a BSD-based OS called Dynix, which had a lot of smart things in it, including per processor resource pools, which is what enabled it to scale so large: it removed everything it could from the inter-CPU contention domain. FreeBSD is trying to take much of that approach.
Unfortunately, they went to System V (SVR3), which then introduced a big giant lock on SMP-unsafe subsystems; in particular, only one processor was allowed into the VFS at a time, which sucked -- if you started two "ls -R" processes on two processors, then one would complete, and then the other -- but the second one wouldn't start until the lock was let go, so they were effectively being serialized, while one CPU was idle. It really ruined the usefulness of the machine. Other OSs have their own problems: VxWorks, the NetApp OS, and NetWare are all pretty allergic to SMP, since they use voluntary cooperative multitasking, where you either have to call an explicit yield, or you have to run to completion; this is very hard to SMP-ize, since you end up having to add in all of the locking that you left out to get the lightweight multitasking, and they generally do not implement a separate protection domain at all, which makes it hard to have more than one processor running at once in any case. In FreeBSD, you can have multiple CPUs in user space, but only one CPU in the kernel at a time. The -current branch tries to change this, but it's really rough going. Frankly, I predict that a fork is likely; I expect that many people will not move off the 4.x branch to 5.0, when it becomes available.
-- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-smp" in the body of the message From owner-freebsd-smp Tue Jul 3 3:17:20 2001 Delivered-To: freebsd-smp@freebsd.org Received: from avocet.mail.pas.earthlink.net (avocet.mail.pas.earthlink.net [207.217.121.50]) by hub.freebsd.org (Postfix) with ESMTP id 467B937B403; Tue, 3 Jul 2001 03:17:13 -0700 (PDT) (envelope-from tlambert2@mindspring.com) Received: from mindspring.com (dialup-209.247.139.34.Dial1.SanJose1.Level3.net [209.247.139.34]) by avocet.mail.pas.earthlink.net (EL-8_9_3_3/8.9.3) with ESMTP id DAA03857; Tue, 3 Jul 2001 03:16:46 -0700 (PDT) Message-ID: <3B419BA8.3D93EB5A@mindspring.com> Date: Tue, 03 Jul 2001 03:17:12 -0700 From: Terry Lambert Reply-To: tlambert2@mindspring.com X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: "E.B. Dreger" Cc: Chris Costello , freebsd-smp@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG Subject: Re: libc_r locking... why? References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org "E.B. Dreger" wrote: > [ libc_r locks don't assert "lock", not MP-safe ] > > So the "lock" prefix is the only way to enforce cache coherency? > Do you have handy a good reference on IPIs, other than the kernel > APIC code (and, of course, Google and NorthernLight searches)? See other posting. > Good to know, but, I'm not using libc_r... I was looking at > existing code to help me double-check mine as I go. I'm > synchronizing processes with a "giant lock" token that each > process cooperatively passes to the next... to simplify: > > who_has_lock++ ; > who_has_lock %= process_count ; Your unsimplified assembly is not happy, and neither is this. 
You want to use a LOCK CMPXCHG to implement your mutexes; the LOCK prefix makes it a barrier instruction, which is needed to ensure that two processors don't operate on their L1 cache contents, and then both attempt to write back, where one wins the race, but both think they own the lock. > Each processes' critical path first checks to see if it holds > the token; if so, it performs the tasks that require it, such as > locking a finer-grained lock or mutex. It then passes the token, > and continues through its critical path. You aren't going to be able to safely hand this off if they are running on two different processors in user space. You _must_ implement an MP safe mutex. > If a thread has nothing to do, I nanosleep(2) to prevent the critical > path from degenerating to an extended spin. I'm considering using > socketpair(2)s, with a process blocking indefinitely on read(2) until > another process write(2)s to awaken it... This would work, but it will destroy your SMP scaling that you want to achieve, since you will effectively serialize your processes running. > > If you "need" kernel threads, look at the Linux kernel > > threads in the ports collection (it's a kernel module > > that builds and installs as a package). You probably > > don't, since performance of kernel threads is really only > > Correct me if I'm wrong, but the only place in my model that really > might benefit from kthreads would be the scheduling? i.e., rather > than screwing around with nanosleep(2) or socket calls, I could cut > the cruft and interact more directly with the scheduler via kthread > mechanisms? Not really. That's the problem with Linux threads: you don't get thread-group affinity, so if you are running anything on your system besides your threaded application, you tend to take full heavy-weight context switches. 
Some work was done on the Linux scheduler to try and get this affinity, but you really can't do that sanely in the scheduler: it's the wrong place to attack the problem. The planned FreeBSD approach can fix this, if it's implemented correctly, since as long as you have user space threads that are ready to run, you will run out your entire quantum, and do light weight switches from one thread to another within the same process. > > about a 20% increment, if you implement them the SVR4 or > > Solaris (pre-2.7) or Linux way. It's probably better to > > implement with FreeBSD threads as they currently exist, > > and get massive SMP scalability when KSE's are brought > > into the source tree. > > KSEs... where can I read up? http://people.freebsd.org/~jasone/kse/ -- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-smp" in the body of the message From owner-freebsd-smp Tue Jul 3 4: 0:45 2001 Delivered-To: freebsd-smp@freebsd.org Received: from swan.mail.pas.earthlink.net (swan.mail.pas.earthlink.net [207.217.120.123]) by hub.freebsd.org (Postfix) with ESMTP id 05A6637B401; Tue, 3 Jul 2001 04:00:34 -0700 (PDT) (envelope-from tlambert2@mindspring.com) Received: from mindspring.com (dialup-209.247.139.34.Dial1.SanJose1.Level3.net [209.247.139.34]) by swan.mail.pas.earthlink.net (EL-8_9_3_3/8.9.3) with ESMTP id EAA10518; Tue, 3 Jul 2001 04:00:05 -0700 (PDT) Message-ID: <3B41A5CD.7F5FF288@mindspring.com> Date: Tue, 03 Jul 2001 04:00:29 -0700 From: Terry Lambert Reply-To: tlambert2@mindspring.com X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: "E.B. Dreger" Cc: "Michael C . 
Wu" , Matthew Rogers , freebsd-smp@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG Subject: Re: CPU affinity hinting References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org "E.B. Dreger" wrote: > > > Date: Fri, 29 Jun 2001 21:44:43 -0500 > > From: Michael C . Wu > > > > The issue is a lot more complicated than what you think. > > How so? I know that idleproc and the new ipending / threaded INTs > enter the picture... and, after seeing the "HLT benchmark" page, it > would appear that simply doing nothing is sometimes better than > doing something, although I'm still scratching my head over that... HLT'ing reduces the overall temperature and power consumption. The current SMP-aware scheduler can't really HLT because the processors have to spin on the acquisition of the lock. > > This actually is a big issue in our future SMP implementation. > > I presumed as much; the examples I gave were trivial. > > I also assume that memory allocation is a major issue... to > not waste time with inter-CPU locking, I'd assume that memory > would be split into pools, a la Hoard. Maybe start with > approx. NPROC count equally-sized pools, which are roughly > earmarked per hypothetical process. Yes, though my personal view of that Hoard allocator is that it's not nice, and I don't want to see "garbage collection" in the kernel. The mbuf allocator that has been bandied around is a specialization of the allocator that Alfred has been playing with, which is intended to address this issue.
The problem with the implementations as they currently exist is that they end up locking a lot, in what I consider to be unnecessary overhead, to permit one CPU to free back to another's pool ("buckets"); this is actually much better handled by having a "dead pool" on a per CPU basis, which only gets linked onto when the free crosses a domain boundary. The actual idea for per-CPU resource pools comes from Dynix; it's described in their Usenix paper (1991), and in Vahalia's book, in chapter 12 (I actually disagreed with his preference for the SLAB allocator, when I was doing the technical review on the book for Prentice-Hall, prior to its publication, because of this issue; on most of the rest of the book we agreed, and it was just minor nits about language, additional references, etc.). So there's a lot of prior art by a lot of smart people that FreeBSD can and has drawn upon. > I'm assuming that memory allocations are 1:1 mappable wrt > processes. Yes, I know that's faulty and oversimplified, > particularly for things like buffers and filesystem cache. FreeBSD has a unified VM and buffer cache. VM _is_ FS cache _is_ buffers. But actually your assumption is really wrong. That's because if you have a single process with multiple threads, then the threads want negaffinity -- they want to try to ensure that they are not running on the same CPU, so that they can optimize the amount of simultaneous compute resources. > > There are two types of processor affinity: user-configurable > > and system automated. We have no implementation of the former, > > Again, why not "hash(sys_auto, user_config) % NCPU"? Identical > processes would be on same CPU unless perturbed by user_config. > Collisions from identical user_config values in unrelated > processes would be less likely because of the sys_auto pertubation. > > Granted: It Is Always More Complicated. (TM) But for a first pass...
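For readers following the quoted "hash(sys_auto, user_config) % NCPU" suggestion, a minimal sketch of what such a first-pass placement rule could look like in C is below. Everything here is an assumption for illustration: `pick_cpu`, `mix32`, and `NCPU` are hypothetical names, `sys_auto` stands for whatever key the system would derive for a process, `user_config` for a user-supplied hint, and the mixing function is a generic integer scrambler, not anything taken from FreeBSD.

```c
#include <stdint.h>

#define NCPU 4  /* hypothetical number of processors */

/* Generic 32-bit integer mixer, used only to spread the input bits. */
static uint32_t mix32(uint32_t x)
{
    x ^= x >> 16;
    x *= 0x45d9f3bU;
    x ^= x >> 16;
    x *= 0x45d9f3bU;
    x ^= x >> 16;
    return x;
}

/*
 * Map a (system key, user hint) pair to a CPU index in [0, NCPU).
 * Identical inputs always land on the same CPU; changing either
 * input perturbs the result.
 */
static unsigned pick_cpu(uint32_t sys_auto, uint32_t user_config)
{
    return mix32(sys_auto ^ mix32(user_config)) % NCPU;
}
```

A rule like this is static: it takes no account of how loaded each CPU currently is, so it only decides an initial placement.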
The correct way to handle this is to have per CPU run queues, and only migrate processes between the queues under extraordinary circumstances (e.g. intentionally, for load balancing). Thus KSEs tend to stay put on the CPU they are run on. You also want negaffinity, as noted above. In the simple case, this can be achieved by having a 32 bit value (since you can have at most 32 processors because of the APIC ID limitation) in the proc struct; you start new KSEs on the processors whose bits are still set in the value; when a process is started initially, a bitmap of the existing CPUs is copied in as part of the startup. Bits are cleared as a process gets KSEs on each separate CPU. Migration tries to keep KSEs on different CPUs. Each CPU has an input queue as well, which lets another CPU "hand off" processes to it, based on load. The input queue is locked for a handoff, and for a read, if the queue head is non-null, on entry to the per CPU copy of the scheduler. Thus under normal circumstances, when there is nothing in the queue, there are zero locks to deal with. Doing it this way also lets us put the HLT back into the scheduler idle loop, without losing on interrupts, since the HLT was only taken out in order to cause the CPU that didn't currently have access to the scheduler to spin on the lock until the other CPU went to user space to do work. A final piece of the puzzle is a figure of merit for gauging the CPU load for a given processor, to decide when to migrate. This can be an unlocked read-only value for other processors to decide whether to shed load to your processor, or not, based on their load being much higher than yours. To avoid barrier instructions, it's probably worth putting this information in a per CPU data page that can be seen by other CPUs, which also contains the queue head for the handoff queue (the input queue, above); barriers are avoided by marking these pages as non-cacheable. > > and alfred-vm has a semblance of the latter.
Please wait > > patiently..... > > Or, if impatient, would one continue to brainstorm, not expect a > response (i.e., not get disappointed when something basic is posted), > and track -current after the destabilization? :-) I've had a number of conversations with Alfred on the ideas outlined briefly, above, and on his thoughts on the subject (he and I work at the same place). Alfred has experimental code which does per CPU run queues, as described above, and he has some other code which lets him "lock" a process onto a particular CPU (I personally don't think that's terrifically useful, in the grand scheme of things, but you can get the same effect by having a "don't migrate this process" bit, and simply not shedding it to another CPU, regardless of load). -- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-smp" in the body of the message From owner-freebsd-smp Tue Jul 3 4:20:47 2001 Delivered-To: freebsd-smp@freebsd.org Received: from swan.mail.pas.earthlink.net (swan.mail.pas.earthlink.net [207.217.120.123]) by hub.freebsd.org (Postfix) with ESMTP id D183B37B401; Tue, 3 Jul 2001 04:20:43 -0700 (PDT) (envelope-from tlambert2@mindspring.com) Received: from mindspring.com (dialup-209.247.139.34.Dial1.SanJose1.Level3.net [209.247.139.34]) by swan.mail.pas.earthlink.net (EL-8_9_3_3/8.9.3) with ESMTP id EAA14242; Tue, 3 Jul 2001 04:20:39 -0700 (PDT) Message-ID: <3B41AAAA.3EC17263@mindspring.com> Date: Tue, 03 Jul 2001 04:21:14 -0700 From: Terry Lambert Reply-To: tlambert2@mindspring.com X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: "Michael C . Wu" Cc: "E.B.
Dreger" , freebsd-smp@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG Subject: Re: Quick question: AIO / SMP / process-based threading References: <20010630005749.A72545@peorth.iteration.net> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org "Michael C . Wu" wrote: > > On Sat, Jun 30, 2001 at 05:47:49AM +0000, E.B. Dreger scribbled: > | 1. Is AIO SMP-safe? > > AIO is not safe, SMP or not. Are you maybe confusing AIO (a POSIX mandated API) with async mounts? AIO works fine, I think, and is happy with SMP. -- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-smp" in the body of the message From owner-freebsd-smp Tue Jul 3 12:54:52 2001 Delivered-To: freebsd-smp@freebsd.org Received: from a.mx.everquick.net (a.mx.everquick.net [216.89.137.3]) by hub.freebsd.org (Postfix) with ESMTP id 7E97437B405; Tue, 3 Jul 2001 12:54:43 -0700 (PDT) (envelope-from eddy+public+spam@noc.everquick.net) Received: from localhost (eddy@localhost) by a.mx.everquick.net (8.10.2/8.10.2) with ESMTP id f63JsGf02843; Tue, 3 Jul 2001 19:54:16 GMT X-EverQuick-No-Abuse: Report any e-mail abuse to Date: Tue, 3 Jul 2001 19:54:15 +0000 (GMT) From: "E.B. Dreger" To: Terry Lambert Cc: Chris Costello , freebsd-smp@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG Subject: Re: libc_r locking... why? In-Reply-To: <3B419BA8.3D93EB5A@mindspring.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org > Date: Tue, 03 Jul 2001 03:17:12 -0700 > From: Terry Lambert > > who_has_lock++ ; > > who_has_lock %= process_count ; > > Your unsimplified assembly is not happy, and neither is > this.
You want to use a LOCK CMPXCHG to implement your > mutexes; the LOCK prefix makes it a barrier instruction, > which is needed to ensure that two processors don't operate > on their L1 cache contents, and then both attempt to write > back, where one wins the race, but both think they own the > lock. So I see now. Brainstorming, something like:

        ;; eax = my id to match with token
        movl    $my_id,%eax

        ;; ecx = next process = (my_id + 1) % process_count
        xorl    %edx,%edx
        leal    1(%eax,1),%ecx
        lock cmpl $process_count,%ecx
        movzl   %edx,%ecx

        ;; edx = my id, for use after cmpxchg
        ;; if ( who_has_lock == my_id ) who_has_lock = ecx
        movl    %eax,%edx
        lock cmpxchg %ecx,$who_has_lock

        ;; see what happened
        cmpl    %edx,%eax
        jnz     *we_didnt_have_the_token

I'll look at the kernel code, compare with the above, and run with it. Thanks for beating on my head with a bigger hammer until things clicked. :-) I've bookmarked a page of "dangerous" instructions that require the lock prefix. > > If a thread has nothing to do, I nanosleep(2) to prevent the critical > > path from degenerating to an extended spin. I'm considering using > > socketpair(2)s, with a process blocking indefinitely on read(2) until > > another process write(2)s to awaken it... > > This would work, but it will destroy your SMP scaling > that you want to achieve, since you will effectively > serialize your processes running. Typo on my part. If a _process_ has nothing to do, I put the thing to sleep. I presume that it's [at least sometimes] better to have a sleeping process than to have to launch a new process. Eddy --------------------------------------------------------------------------- Brotsman & Dreger, Inc. - EverQuick Internet Division Phone: +1 (316) 794-8922 Wichita/(Inter)national Phone: +1 (785) 865-5885 Lawrence --------------------------------------------------------------------------- Date: Mon, 21 May 2001 11:23:58 +0000 (GMT) From: A Trap To: blacklist@brics.com Subject: Please ignore this portion of my mail signature.
These last few lines are a trap for address-harvesting spambots. Do NOT send mail to , or you are likely to be blocked. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-smp" in the body of the message From owner-freebsd-smp Tue Jul 3 16:36:51 2001 Delivered-To: freebsd-smp@freebsd.org Received: from peorth.iteration.net (peorth.iteration.net [208.190.180.178]) by hub.freebsd.org (Postfix) with ESMTP id EFD4D37B401 for ; Tue, 3 Jul 2001 16:36:48 -0700 (PDT) (envelope-from keichii@iteration.net) Received: by peorth.iteration.net (Postfix, from userid 1001) id 5EBBB59229; Tue, 3 Jul 2001 18:36:48 -0500 (CDT) Date: Tue, 3 Jul 2001 18:36:48 -0500 From: "Michael C . Wu" To: Terry Lambert Cc: freebsd-smp@FreeBSD.ORG Subject: Re: libc_r locking... why? Message-ID: <20010703183648.A14640@peorth.iteration.net> Reply-To: "Michael C . Wu" Mail-Followup-To: "Michael C . Wu" , Terry Lambert , freebsd-smp@FreeBSD.ORG References: <3B419910.BF346FB4@mindspring.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <3B419910.BF346FB4@mindspring.com>; from tlambert2@mindspring.com on Tue, Jul 03, 2001 at 03:06:08AM -0700 X-PGP-Fingerprint: 5025 F691 F943 8128 48A8 5025 77CE 29C5 8FA1 2E20 X-PGP-Key-ID: 0x8FA12E20 Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org On Tue, Jul 03, 2001 at 03:06:08AM -0700, Terry Lambert scribbled: | Idea Receiver wrote: | > On Fri, 29 Jun 2001, Terry Lambert wrote: | Solaris 8 & 9 have pretty good code in this area, but Hi Terry, Erhm, I don't quite understand. Would you mind telling me what Solaris "9" is? :) Can we use it? Michael -- +-----------------------------------------------------------+ | keichii@iteration.net | keichii@freebsd.org | | http://iteration.net/~keichii | Yes, BSD is a conspiracy. 
| +-----------------------------------------------------------+ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-smp" in the body of the message From owner-freebsd-smp Tue Jul 3 20:21:27 2001 Delivered-To: freebsd-smp@freebsd.org Received: from server3.safepages.com (server3.safepages.com [216.127.146.5]) by hub.freebsd.org (Postfix) with ESMTP id 4828037B401 for ; Tue, 3 Jul 2001 20:21:24 -0700 (PDT) (envelope-from hank@black-hole.com) Received: from daphne.bogus (04-167.034.popsite.net [64.24.28.167]) by server3.safepages.com (Postfix) with ESMTP id 0E6915DDB; Wed, 4 Jul 2001 03:19:55 +0000 (GMT) Date: Tue, 3 Jul 2001 22:19:49 -0500 (CDT) From: Henry Miller X-Sender: hank@daphne.bogus To: "E.B. Dreger" Cc: smp@FreeBSD.ORG Subject: Re: per cpu runqueues, cpu affinity and cpu binding. In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org On Mon, 2 Jul 2001, E.B. Dreger wrote: > > First of all, we have two different types of processor affinity. > > 1. user specified CPU attachment, as you have implemented. > > 2. system-wide transparent processor affinity, transparent > > to all users, which I see some work below. > > Unless two processes are running on CPU #1, and CPU #2 becomes idle. > Then switching a process to CPU #2 makes sense... unless the process > getting switched is "close" to completion. > > I'll probably get flamed for suggesting something so ugly, but should we > assume that non-daemon processes are short-running, and be more resistant > to switching CPUs on those? Actually some OS theory says that the longer a process runs the lower priority it should get. A simple extension says that if two processes are running "a lot" and are on the same CPU, and there is an idle CPU, then we should switch one process to the other CPU.
Small tasks that can complete in just a few time slices should be run quickly. Even with a load of 1000 on a single CPU machine we should note that those other processes have been running for a while and schedule the new task more often for a few rounds, and drop the priority if it doesn't complete "quickly". (This obviously doesn't apply for time/safety critical threads.) If we are even half way intelligent about scheduling the initial CPU, then there is no need to bother switching CPUs for the short lived programs as they will exit before any benefit from switching CPUs would show up. FreeBSD may already do some of that, I've not checked. Being a daemon isn't a good enough test for everyone: on some servers it will be good, but on others it is the wrong thing. Povray for example is typically not run as a daemon, and it typically will run long enough that intelligent CPU switching will decrease the overall runtime. There are others. Of course I'm not offering to do the work, so whoever is going to gets to decide if the above is worth the bother. I can think of situations where it won't matter and situations where it will. Whether either is more than a rarely encountered end case is an exercise left to the reader.
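The two heuristics described in this message (priority that sinks as a process accumulates run time, and moving one of two long-running processes off a shared CPU when another CPU is idle) can be sketched in C as follows. This is only an illustration under stated assumptions, not FreeBSD scheduler code: `struct proc` here is a hypothetical two-field record, `LONG_RUNNING_TICKS` and the divide-by-10 decay are arbitrary, and the caller is assumed to have already established that `idle_cpu` is idle.

```c
#include <stddef.h>

#define NCPU               2     /* hypothetical number of processors */
#define LONG_RUNNING_TICKS 100u  /* hypothetical threshold for running "a lot" */

struct proc {
    int      cpu;         /* CPU this process currently runs on */
    unsigned ticks_used;  /* CPU time consumed so far, in ticks */
};

/*
 * Lower value means "schedule sooner". A new task starts at 0 and
 * sinks in priority as it accumulates run time.
 */
static unsigned priority_of(const struct proc *p)
{
    return p->ticks_used / 10u;
}

/*
 * Return the index of a process worth moving to idle_cpu, or -1.
 * A process qualifies only if it is long-running and shares its CPU
 * with at least one other long-running process; short-lived tasks
 * are never moved.
 */
static int pick_migration(const struct proc *procs, size_t n, int idle_cpu)
{
    for (int c = 0; c < NCPU; c++) {
        if (c == idle_cpu)
            continue;
        int first = -1;  /* first long-running process seen on CPU c */
        for (size_t i = 0; i < n; i++) {
            if (procs[i].cpu != c ||
                procs[i].ticks_used < LONG_RUNNING_TICKS)
                continue;
            if (first < 0)
                first = (int)i;
            else
                return (int)i;  /* second one found: migrate it */
        }
    }
    return -1;
}
```

The threshold is where the "is it worth the bother" question above ends up: set it too low and short-lived programs get bounced between CPUs for no gain.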
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-smp" in the body of the message From owner-freebsd-smp Wed Jul 4 9:38:19 2001 Delivered-To: freebsd-smp@freebsd.org Received: from earth.backplane.com (earth-nat-cw.backplane.com [208.161.114.67]) by hub.freebsd.org (Postfix) with ESMTP id 6D43837B401 for ; Wed, 4 Jul 2001 09:38:17 -0700 (PDT) (envelope-from dillon@earth.backplane.com) Received: (from dillon@localhost) by earth.backplane.com (8.11.3/8.11.2) id f64GcH844850; Wed, 4 Jul 2001 09:38:17 -0700 (PDT) (envelope-from dillon) Date: Wed, 4 Jul 2001 09:38:17 -0700 (PDT) From: Matt Dillon Message-Id: <200107041638.f64GcH844850@earth.backplane.com> To: freebsd-smp@freebsd.org Subject: VM Commits / GIANT_ macros Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org Hello everyone! Ok, after talking with John and others at USENIX and doing a couple of back and forths with Alfred, I am officially taking over the main-line machine-independent VM system in -current. I will also be working on i386 pmap, vm_object, vm_map, and the buffer cache (in regards to mutexes & Giant). Finally, I have begun instrumenting pieces of -current with GIANT_ macros to formalize and assert Giant conditions. These macros will also allow us to remove Giant via sysctl in a piecemeal fashion, so when someone believes they have made a section giant-free, they can use the GIANT_DEPRECIATED macro along with a sysctl to allow other developers to conditionally turn Giant off there (e.g. so your systems don't crash when you are just trying to boot or install a new kernel). Basic documentation on the GIANT_ macros is available at: http://apollo.backplane.com/FreeBSDSmp/ You do not have to lift a finger re: the GIANT_ macros if you do not want to, but I will be instrumenting them throughout the codebase as I go along.
These macros only do stuff if INVARIANTS is turned on (all -current developers should obviously have INVARIANTS turned on). -- The VM mutex backout has been committed and tested w/ buildworld on IA32/UP. I tried to clean up the other platforms as well but obviously couldn't test them. My next task is further instrumentation and the implementation of fine-grained vm_page, vm_map, and vm_object mutexes, and I will also use GIANT_DEPRECIATED to attempt to move certain syscalls (not as many as Alfred tried to move) out from under Giant. I expect this to take a number of man-days but since my time is mostly limited to weekends and evenings this could translate to a month or more in realtime. -Matt To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-smp" in the body of the message From owner-freebsd-smp Wed Jul 4 15:41:11 2001 Delivered-To: freebsd-smp@freebsd.org Received: from avocet.mail.pas.earthlink.net (avocet.mail.pas.earthlink.net [207.217.121.50]) by hub.freebsd.org (Postfix) with ESMTP id 0C5FF37B406 for ; Wed, 4 Jul 2001 15:41:07 -0700 (PDT) (envelope-from tlambert2@mindspring.com) Received: from mindspring.com (dialup-209.247.142.206.Dial1.SanJose1.Level3.net [209.247.142.206]) by avocet.mail.pas.earthlink.net (EL-8_9_3_3/8.9.3) with ESMTP id PAA07913; Wed, 4 Jul 2001 15:41:02 -0700 (PDT) Message-ID: <3B439B90.AECECBC0@mindspring.com> Date: Wed, 04 Jul 2001 15:41:20 -0700 From: Terry Lambert Reply-To: tlambert2@mindspring.com X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: "Michael C . Wu" Cc: freebsd-smp@FreeBSD.ORG Subject: Re: libc_r locking... why?
References: <3B419910.BF346FB4@mindspring.com> <20010703183648.A14640@peorth.iteration.net> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org "Michael C . Wu" wrote: > > On Tue, Jul 03, 2001 at 03:06:08AM -0700, Terry Lambert scribbled: > | Idea Receiver wrote: > | > On Fri, 29 Jun 2001, Terry Lambert wrote: > | Solaris 8 & 9 have pretty good code in this area, but > > Hi Terry, > > Erhm, I don't quite understand. Would you mind telling me > what Solaris "9" is? :) Can we use it? Next version of Solaris, of course. -- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-smp" in the body of the message From owner-freebsd-smp Wed Jul 4 17:22:39 2001 Delivered-To: freebsd-smp@freebsd.org Received: from mail.jeamland.net (rafe.jeamland.net [203.18.243.114]) by hub.freebsd.org (Postfix) with ESMTP id F384E37B403 for ; Wed, 4 Jul 2001 17:22:36 -0700 (PDT) (envelope-from benno@FreeBSD.org) Received: by mail.jeamland.net (Postfix, from userid 1000) id CE2FA70606; Thu, 5 Jul 2001 10:22:35 +1000 (EST) Date: Thu, 5 Jul 2001 10:22:35 +1000 From: Benno Rice To: Matt Dillon Cc: freebsd-smp@freebsd.org Subject: Re: VM Commits / GIANT_ macros Message-ID: <20010705102235.B71563@rafe.jeamland.net> References: <200107041638.f64GcH844850@earth.backplane.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-md5; protocol="application/pgp-signature"; boundary="yNb1oOkm5a9FJOVX" Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <200107041638.f64GcH844850@earth.backplane.com>; from dillon@earth.backplane.com on Wed, Jul 04, 2001 at 09:38:17AM -0700 Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org 
--yNb1oOkm5a9FJOVX Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Jul 04, 2001 at 09:38:17AM -0700, Matt Dillon wrote: > Hello everyone! Ok, after talking with John and others at USENIX > and a doing a couple of back and forths with Alfred, I am officially > taking over the main-line machine-independant VM system in -current. > > I will also be working on i386 pmap, vm_object, vm_map, and the buffer > cache (in regards to mutexes & Giant). Could you keep me posted wrt what locking is needed in pmap? This should allow me to keep PowerPC in sync. Don't have any plans to hit SMP on powerpc any time soon, but it'd be nice to have the infrastructure in place. =) -- Benno Rice benno@FreeBSD.org --yNb1oOkm5a9FJOVX Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.6 (FreeBSD) Comment: For info see http://www.gnupg.org iEYEARECAAYFAjtDs0sACgkQbQx7xhW+Eg606gCdGdHCtDfLZYdldyYasg1bG3ax eQkAoOIa+3lDQ/r0C+2sBAseoIov5PTg =RteI -----END PGP SIGNATURE----- --yNb1oOkm5a9FJOVX-- To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-smp" in the body of the message From owner-freebsd-smp Thu Jul 5 8:37:35 2001 Delivered-To: freebsd-smp@freebsd.org Received: from mailhost.iprg.nokia.com (mailhost.iprg.nokia.com [205.226.5.12]) by hub.freebsd.org (Postfix) with ESMTP id 83BA937B403 for ; Thu, 5 Jul 2001 08:37:31 -0700 (PDT) (envelope-from michaelw@iprg.nokia.com) Received: from darkstar.iprg.nokia.com (darkstar.iprg.nokia.com [205.226.5.69]) by mailhost.iprg.nokia.com (8.9.3/8.9.3-GLGS) with ESMTP id IAA24225; Thu, 5 Jul 2001 08:37:30 -0700 (PDT) Received: (from root@localhost) by darkstar.iprg.nokia.com (8.11.0/8.11.0-DARKSTAR) id f65FbUa00817; Thu, 5 Jul 2001 08:37:30 -0700 X-mProtect: Thu, 5 Jul 2001 08:37:30 -0700 Nokia Silicon Valley Messaging Protection Received: from UNKNOWN (205.226.7.105, claiming to be
"iprg.nokia.com") by darkstar.iprg.nokia.com(P1.5 smtpd2SaBOu; Thu, 05 Jul 2001 08:37:28 PDT Message-ID: <3B44894F.BD90890F@iprg.nokia.com> Date: Thu, 05 Jul 2001 08:35:43 -0700 From: Michael Williams Organization: Nokia X-Mailer: Mozilla 4.7 [en] (Win98; U) X-Accept-Language: en,pdf MIME-Version: 1.0 To: Henry Miller Cc: "E.B. Dreger" , smp@FreeBSD.ORG Subject: Re: per cpu runqueues, cpu affinity and cpu binding. References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-smp@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.org Yes, how about leaving the decisions on scheduling algorithms (including affinity) until later when the cost of thread/CPU switching is known (i.e. do the work to compute the cycle count.) Then iterate between development phases of improving the time+cache burden, and modifying the thread/process scheduling criterion. From Henry's point I glean that once switching gets to be a "significant" factor in the overall system load, the meaning of "fairness" could change. Regarding daemons, I think there are different classes of daemons. Because they are long lived doesn't really say much about their workload, working set, needs for scheduling, and priority versus other processes and daemons. Michael Henry Miller wrote: > On Mon, 2 Jul 2001, E.B. Dreger wrote: > > > > First of all, we have two different types of processor affinity. > > > 1. user specified CPU attachment, as you have implemented. > > > 2. system-wide transparent processor affinity, transparent > > > to all users, which I see some work below. > > > > Unless two processes are running on CPU #1, and CPU #2 becomes idle. > > Then switching a process to CPU #2 makes sense... unless the process > > getting switched is "close" to completion. 
> >
> > I'll probably get flamed for suggesting something so ugly, but should we
> > assume that non-daemon processes are short-running, and be more resistant
> > to switching CPUs on those?
>
> Actually some OS theory says that the longer a process runs the lower
> priority it should get.  A simple extension says that if two processes are
> running "a lot" and are on the same CPU, and there is an idle CPU, then we
> should switch one process to the other CPU.
>
> Small tasks that can complete in just a few time slices should be run
> quickly.  Even with a load of 1000 on a single CPU machine we should note
> that those other processes have been running for a while and schedule the
> new task more often for a few rounds, and drop the priority if it doesn't
> complete "quickly".  (This obviously doesn't apply for time/safety
> critical threads.)  If we are even half way intelligent about scheduling
> the initial CPU, then there is no need to bother switching CPUs for the
> short-lived programs as they will exit before any benefit from switching
> CPUs would show up.
>
> FreeBSD may already do some of that, I've not checked.
>
> A daemon isn't enough for everyone, on some servers it will be good, but
> on others it is the wrong thing.  Povray for example is typically not run
> as a daemon, and it typically will run long enough that intelligent CPU
> switching will decrease the overall runtime.  There are others.
>
> Of course I'm not offering to do the work, so whoever is going to gets to
> decide if the above is worth the bother.  I can think of situations where
> it won't matter and situations where it will.  Whether either is more than
> a rarely encountered end case is an exercise left to the reader.
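
The heuristic Henry describes in the quoted text can be written down as a small toy model. The Python below is only a sketch under assumptions that are not in the thread: the `Proc` record, the `LONG_RUNNING_TICKS` threshold and the `pick_migration` function are hypothetical names, "running a lot" is reduced to a single tick count, and the choice to move the longest-running process is arbitrary. It is not how any FreeBSD scheduler is implemented.

```python
from dataclasses import dataclass

# Hypothetical threshold: how many ticks count as running "a lot".
LONG_RUNNING_TICKS = 50


@dataclass
class Proc:
    name: str
    cpu: int        # CPU the process is currently queued on
    ticks_run: int  # CPU time consumed so far, in scheduler ticks


def pick_migration(procs, ncpus):
    """Return (proc, target_cpu) for one suggested move, or None.

    A move is suggested only when some CPU is idle and another CPU
    holds at least two long-running processes.  Short-lived processes
    are never moved: they are expected to exit before a move pays off.
    """
    by_cpu = {cpu: [] for cpu in range(ncpus)}
    for p in procs:
        by_cpu[p.cpu].append(p)

    idle = [cpu for cpu, queue in by_cpu.items() if not queue]
    if not idle:
        return None

    for queue in by_cpu.values():
        long_runners = [p for p in queue if p.ticks_run >= LONG_RUNNING_TICKS]
        if len(long_runners) >= 2:
            # Assumption: move the process that has run the longest.
            victim = max(long_runners, key=lambda p: p.ticks_run)
            return victim, idle[0]
    return None
```

With two long runners and one short job on CPU 0 and CPU 1 idle, the model suggests moving one long runner and leaves the short job alone, which is the behaviour argued for above.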

From owner-freebsd-smp Thu Jul 5 11: 3:38 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from srv1.cosmo-project.de (srv1.cosmo-project.de [213.83.6.106]) by hub.freebsd.org (Postfix) with ESMTP id A04CC37B405; Thu, 5 Jul 2001 11:03:32 -0700 (PDT) (envelope-from ticso@mail.cicely.de)
Received: from mail.cicely.de (cicely20 [10.1.1.22]) by srv1.cosmo-project.de (8.11.0/8.11.0) with ESMTP id f65I3TV06694; Thu, 5 Jul 2001 20:03:29 +0200 (CEST)
Received: (from ticso@localhost) by mail.cicely.de (8.11.0/8.11.0) id f65I4Ex09271; Thu, 5 Jul 2001 20:04:14 +0200 (CEST)
Date: Thu, 5 Jul 2001 20:04:14 +0200
From: Bernd Walter
To: Terry Lambert
Cc: "Michael C . Wu" , "E.B. Dreger" , freebsd-smp@FreeBSD.ORG, freebsd-hackers@FreeBSD.ORG
Subject: Re: Quick question: AIO / SMP / process-based threading
Message-ID: <20010705200414.F8794@cicely20.cicely.de>
References: <20010630005749.A72545@peorth.iteration.net> <3B41AAAA.3EC17263@mindspring.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <3B41AAAA.3EC17263@mindspring.com>; from tlambert2@mindspring.com on Tue, Jul 03, 2001 at 04:21:14AM -0700
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

On Tue, Jul 03, 2001 at 04:21:14AM -0700, Terry Lambert wrote:
> "Michael C . Wu" wrote:
> >
> > On Sat, Jun 30, 2001 at 05:47:49AM +0000, E.B. Dreger scribbled:
> > | 1. Is AIO SMP-safe?
> >
> > AIO is not safe, SMP or not.
>
> Are you maybe confusing AIO (a POSIX mandated API) with
> async mounts?
>
> AIO works fine, I think, and is happy with SMP.

At least there is still a warning for VFS_AIO in -current's NOTES.
But I assume they are OK for devices and sockets, which is much more
interesting in the usual case.

--
B.Walter              COSMO-Project         http://www.cosmo-project.de
ticso@cicely.de       Usergroup             info@cosmo-project.de

From owner-freebsd-smp Thu Jul 5 12:47:41 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from beppo.feral.com (beppo.feral.com [192.67.166.79]) by hub.freebsd.org (Postfix) with ESMTP id 0606E37B406; Thu, 5 Jul 2001 12:47:36 -0700 (PDT) (envelope-from mjacob@feral.com)
Received: from wonky.feral.com (wonky.feral.com [192.67.166.7]) by beppo.feral.com (8.11.3/8.11.3) with ESMTP id f65JlZS35026; Thu, 5 Jul 2001 12:47:35 -0700 (PDT) (envelope-from mjacob@feral.com)
Date: Thu, 5 Jul 2001 12:47:27 -0700 (PDT)
From: Matthew Jacob
Reply-To:
To: John Baldwin
Cc:
Subject: cvs commit: src/sys/dev/isp isp_freebsd.c [ Giant ]
In-Reply-To:
Message-ID: <20010705124154.B37950-100000@wonky.feral.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

[ moving to smp, which the other discussion should have done as well ]

On Thu, 5 Jul 2001, John Baldwin wrote:

>
> On 05-Jul-01 Matt Jacob wrote:
> > mjacob 2001/07/05 10:14:57 PDT
> >
> > Modified files:
> > sys/dev/isp isp_freebsd.c
> > Log:
> > Things have become cinched down more tightly about assertions for Giant.
> > This uncovered some missing spots where I trade off between isp's lock
> > and Giant as I enter CAM.
>
> I would prefer that you just leave Giant held rather than releasing it until
> all the code "under" the unlock/lock pair doesn't need Giant at all.

I don't enter with Giant held. This is a driver marked 'safe' and has its
own lock.
It needs to transition from CAM's lock (which is Giant now) to its own
on entry from CAM and swap back on return. It needs to transition from its
own lock to CAM's lock when it calls CAM from interrupt context.

There are reasons why I'm doing this now rather than later, the most
important of which is that it keeps you honest.

Another reason for doing this is that, lacking as yet any real design for
locking within CAM, at least identifying all of the handoff places Is A Good
Thing (tm).

The APIs and the definitions of how SMP is supposed to work in FreeBSD are
there. You'll have to have a really damned good reason to convince me that I
should *not* use them if they're there.

-matt

From owner-freebsd-smp Thu Jul 5 19:48:17 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from earth.backplane.com (earth-nat-cw.backplane.com [208.161.114.67]) by hub.freebsd.org (Postfix) with ESMTP id 2C23D37B401; Thu, 5 Jul 2001 19:48:09 -0700 (PDT) (envelope-from dillon@earth.backplane.com)
Received: (from dillon@localhost) by earth.backplane.com (8.11.3/8.11.2) id f662m9w62000; Thu, 5 Jul 2001 19:48:09 -0700 (PDT) (envelope-from dillon)
Date: Thu, 5 Jul 2001 19:48:09 -0700 (PDT)
From: Matt Dillon
Message-Id: <200107060248.f662m9w62000@earth.backplane.com>
To: Benno Rice
Cc: freebsd-smp@FreeBSD.org
Subject: Re: VM Commits / GIANT_ macros
References: <200107041638.f64GcH844850@earth.backplane.com> <20010705102235.B71563@rafe.jeamland.net>
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

:
:On Wed, Jul 04, 2001 at 09:38:17AM -0700, Matt Dillon wrote:
:> Hello everyone!
:> Ok, after talking with John and others at USENIX
:> and doing a couple of back and forths with Alfred, I am officially
:> taking over the main-line machine-independent VM system in -current.
:>
:> I will also be working on i386 pmap, vm_object, vm_map, and the buffer
:> cache (in regards to mutexes & Giant).
:
:Could you keep me posted wrt what locking is needed in pmap?  This should
:allow me to keep PowerPC in sync.  Don't have any plans to hit SMP on powerpc
:any time soon, but it'd be nice to have the infrastructure in place. =)
:
:--
:Benno Rice
:benno@FreeBSD.org

Sure, I'll post updates to freebsd-smp.  Here's the first update:

I spent a good deal of Wednesday cleaning up the VM source files,
breaking them up into manageable pieces and moving vm_page_zero_idle()
from MD files to a new MI file.

I spent about four hours experimenting with various fine-grained VM
mutex models, e.g. simply by starting to code it and noting where I
would bog down.  I believe I have come up with one that is usable for
vm_page_t manipulation.

The issue we have with vm_page_t is that various entities currently
depend on the atomic_* ops or Giant to do things like lookup a page and
then busy it.  This previously occurred under splvm() in order to
guarantee that nobody else would be able to busy the page while we were
trying to.  Now it occurs under Giant.  The goal is to be able to do
these sorts of operations without Giant.

This same dependence is used to do things like add or remove a vm_page_t
from its page queue, and add or remove a vm_page_t from the (object, index)
hash table, and move vm_page_t's between page queues.

This is the solution as I envision it.  It is a considerable amount of
work, which I will be doing in stages.

* We will have a mutex for each (PQ_XXX) page queue.  The appropriate
  page queue mutex will be obtained to add or remove a vm_page_t to
  that page queue (happens a lot), and to scan the queue (contigmalloc
  and the pageout daemon scan the page queues).

* We will have a small shared array of mutexes to lock the
  (object, index) hash chains.  For example, let's say you are in
  vm_fault and do a vm_page_lookup() to lookup a page, and not finding
  it you decide to vm_page_alloc() a new page.  In order to protect this
  sequence of events vm_page_lookup() will obtain the appropriate
  hash chain mutex and leave it held on return (whether or not the page
  is found).  The caller will do whatever it needs to do (non-blocking),
  and then release the hash chain mutex.

  This allows callers to safely add or remove pages from hash chains.

* Many routines now lookup a page, then busy it, then release it back
  onto a page queue (e.g. deactivate it, free it, activate it, cache it).
  Examples are vm_fault, the pageout daemon, and many other interactions
  with the system.

  These interactions currently operate under Giant (used to operate under
  spl) and do not bother to 'own' the page to execute the action.  These
  interactions, however, do check that the page is not owned by someone
  else (aka that the page is not PG_BUSY or PG_BUSY/vm_page->busy).

  To allow callers to safely lookup and then manipulate pages, for
  example to manipulate vm_page->flags, I intend to change the API such
  that when you nominally get a page, it will be BUSY'd for you.  For
  example, when selecting a free or cache page from the page queues, the
  page would be returned already BUSY'd, allowing you to manipulate the
  page and then release it back to a queue (or initiate I/O, or whatever).
  In many cases the caller intends to busy the page anyway, so this is
  not much of a leap.

  This only works if the page is not already busy, of course, but nearly
  all users of the existing API skip or sleep/loop if the returned page
  is busy, so we can fail gracefully and allow the caller to do whatever
  needs to be done there.

Finally we have issues with how to set PG_BUSY in the first place.
Currently setting PG_BUSY uses atomic_*.
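
The hash chain mutex array and the "returned already BUSY'd" convention described above can be modelled in user space. The Python below is only a sketch under assumptions that are not in the message: `PageTable`, `Page`, `NCHAINS`, `lookup_or_alloc_busy()` and `release()` are hypothetical names, `threading.Lock` stands in for a kernel mutex, and lookup, allocation and busying are folded into one call for brevity, whereas the message has vm_page_lookup() return with the chain mutex still held and leaves the release to the caller. Taking the chain lock again to clear the busy flag is likewise an assumption of the sketch.

```python
import threading

NCHAINS = 8  # hypothetical size of the small shared mutex array


class Page:
    def __init__(self, obj, index):
        self.obj = obj
        self.index = index
        self.busy = False  # stands in for PG_BUSY


class PageTable:
    def __init__(self):
        # One lock per hash chain, not one lock per page.
        self._locks = [threading.Lock() for _ in range(NCHAINS)]
        self._chains = [dict() for _ in range(NCHAINS)]

    def _slot(self, obj, index):
        return hash((obj, index)) % NCHAINS

    def lookup_or_alloc_busy(self, obj, index):
        """Look up (obj, index), allocating the page on a miss.

        The chain lock is held across the lookup and the insert, so no
        other thread can add the same page in between.  The page comes
        back already busied, or None if someone else holds it busy.
        """
        slot = self._slot(obj, index)
        with self._locks[slot]:
            page = self._chains[slot].get((obj, index))
            if page is None:
                page = Page(obj, index)
                self._chains[slot][(obj, index)] = page
            elif page.busy:
                return None  # fail gracefully; caller sleeps or retries
            page.busy = True
            return page

    def release(self, page):
        """Unbusy a page previously returned by lookup_or_alloc_busy()."""
        with self._locks[self._slot(page.obj, page.index)]:
            page.busy = False
```

A caller that gets `None` back is in the same position as existing code that finds a busy page: it skips the page or sleeps and retries.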
It turns out that the solution is easy and does not require the use of
any additional mutex operations.

* When we are looking up a page that is on the free queue, aka in
  vm_page_alloc(), simply holding the appropriate page queue mutex
  (which we *ALREADY* hold in most cases) is sufficient to allow us
  to manipulate the free pages in that queue without worrying about
  other threads messing with those pages.  Thus we can set PG_BUSY,
  remove the page from the free queue, and then release the page queue
  mutex before returning the newly allocated page.

* When we are looking up a page that is on the cache queue, or is not
  associated with a queue, we simply acquire (or already hold in most
  cases) the appropriate hash chain mutex.  Then if the page is not
  already PG_BUSY, we know we can safely manipulate its flags (set
  PG_BUSY).  If the page is already PG_BUSY we need to sleep/loop
  anyway, so we can fail gracefully and let the parent
  sleep/loop/do-whatever.

* We will need to find a better way to sleep/wait for a busy page to
  become available.  The current mechanism sets a PG_WANTED flag in
  vm_page->flags, which doesn't work under the new scheme.  I expect I
  will transfer this sleep/wakeup mechanism to an array of wanted flags
  in parallel with the VM hash chain mutex array.

Ok, so what am I going to start with?  Well, I'm actually going to start
with #3 ... changing the VM API to return pages that are PG_BUSY'd rather
than making the caller busy them, and changing the various page queue ops
(e.g. vm_page_cache(), vm_page_deactivate(), etc...) to unbusy the page
automatically (some like vm_page_free() already work this way).

In most cases this allows existing code to operate as it used to with
only minimal changes...
for example, if the existing code assumes protection by Giant (original
code by splvm() and Giant) in order to retrieve, manipulate, and put back
a page, the new code will be able to assume protection by the fact that
it will be given a PG_BUSY'd page, which it can manipulate and put back.

This preliminary work can be done without introducing VM mutexes just
yet (i.e. I will do this work under Giant).  But once complete, this
preliminary work will allow me to then add the VM mutexes described above
with very little effort and take a good chunk of the VM interface out
from under Giant.

--

I am not going to start work on the other major interfaces... pmap,
vm_object's, buffer cache, and so forth, until I complete the work on the
vm_page interface.  These other interfaces work on a much more granular
level that will allow us to, for example, give each vm_object a mutex
(something we cannot and do not want to do for each vm_page).

And that is where I stand at the moment.

-Matt

From owner-freebsd-smp Fri Jul 6 3:43:28 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from wiproecmx1.wipro.com (wiproecmx1.wipro.com [164.164.31.5]) by hub.freebsd.org (Postfix) with ESMTP id 3F8DB37B408 for ; Fri, 6 Jul 2001 03:43:18 -0700 (PDT) (envelope-from sumanth.vidyadhara@wipro.com)
Received: from ecvwall11.wipro.com (ecvwall1.wipro.com [192.168.181.23]) by wiproecmx1.wipro.com (8.11.3/8.11.3) with SMTP id f66L2JX27164 for ; Fri, 6 Jul 2001 16:02:21 -0500 (GMT)
Received: from ecvwall11.wipro.com ([192.168.181.23]) by ecmail.mail.wipro.com (Netscape Messaging Server 4.15) with SMTP id GG1SD500.92N for ; Fri, 6 Jul 2001 16:11:29 +0530
Received: from sumanth ([192.168.205.201]) by platinum.mail.wipro.com (Netscape Messaging Server 4.15) with ESMTP id GG1SEG00.2OO for ; Fri, 6 Jul 2001 16:12:16 +0530
Message-ID: <060b01c10609$acd72de0$c9cda8c0@sumanth>
From: "sumanth vidyadhara"
To:
Subject: Locks in ethernet drivers
Date: Fri, 6 Jul 2001 16:21:56 +0530
Organization: Wipro Global R&D
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="------------InterScan_NT_MIME_Boundary"
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4522.1200
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

This is a multi-part message in MIME format.

--------------InterScan_NT_MIME_Boundary
Content-Type: multipart/alternative; boundary="----=_NextPart_000_0608_01C10637.C6554710"

------=_NextPart_000_0608_01C10637.C6554710
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

Hi All,
What is the call to initialise, acquire and release a lock in the
ethernet drivers, since all the sample code did not implement any
locks? The version of FreeBSD is 4.3.

Regards,
Sumanth

------=_NextPart_000_0608_01C10637.C6554710--

--------------InterScan_NT_MIME_Boundary--

From owner-freebsd-smp Fri Jul 6 7: 3:42 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from beppo.feral.com (beppo.feral.com [192.67.166.79]) by hub.freebsd.org (Postfix) with ESMTP id 115CF37B401 for ; Fri, 6 Jul 2001 07:03:40 -0700 (PDT) (envelope-from mjacob@feral.com)
Received: from wonky.feral.com (wonky.feral.com [192.67.166.7]) by beppo.feral.com (8.11.3/8.11.3) with ESMTP id f66E3RS44935; Fri, 6 Jul 2001 07:03:27 -0700 (PDT) (envelope-from mjacob@feral.com)
Date: Fri, 6 Jul 2001 07:03:17 -0700 (PDT)
From: Matthew Jacob
Reply-To:
To: sumanth vidyadhara
Cc:
Subject: Re: Locks in ethernet drivers
In-Reply-To: <060b01c10609$acd72de0$c9cda8c0@sumanth>
Message-ID: <20010706070154.Q46987-100000@wonky.feral.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

On Fri, 6 Jul 2001, sumanth vidyadhara wrote:

> Hi All,
> What is the call to initialise, acquire and release a lock in the
> ethernet drivers, since all the sample code did not implement any
> locks? The version of FreeBSD is 4.3.

In FreeBSD 4.X there is no SMP locking that drivers need to know about,
and splN/splx has its traditional meaning.

-matt

From owner-freebsd-smp Fri Jul 6 7:41:14 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from wiproecmx1.wipro.com (wiproecmx1.wipro.com [164.164.31.5]) by hub.freebsd.org (Postfix) with ESMTP id D22A237B401 for ; Fri, 6 Jul 2001 07:41:06 -0700 (PDT) (envelope-from sumanth.vidyadhara@wipro.com)
Received: from ecvwall11.wipro.com (ecvwall1.wipro.com [192.168.181.23]) by wiproecmx1.wipro.com (8.11.3/8.11.3) with SMTP id f6710BX09113 for ; Fri, 6 Jul 2001 20:00:11 -0500 (GMT)
Received: from ecvwall11.wipro.com ([192.168.181.23]) by ecmail.mail.wipro.com (Netscape Messaging Server 4.15) with SMTP id GG23DL00.0FY for ; Fri, 6 Jul 2001 20:09:21 +0530
Received: from sumanth ([192.168.205.201]) by platinum.mail.wipro.com (Netscape Messaging Server 4.15) with ESMTP id GG23EW00.GQ1; Fri, 6 Jul 2001 20:10:08 +0530
Message-ID: <064701c1062a$e7eee230$c9cda8c0@sumanth>
From: "sumanth vidyadhara"
To:
Cc:
References: <20010706070154.Q46987-100000@wonky.feral.com>
Subject: Re: Locks in ethernet drivers
Date: Fri, 6 Jul 2001 20:19:48 +0530
Organization: Wipro Global R&D
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4522.1200
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

Hi Matthew,
How does splxxx work for more than one cpu? splxxx is for that particular
cpu, isn't it?
It raises the priority of that particular cpu, not for all cpus.
Then how is the critical code handled for the other cpus if they want to
access it simultaneously?

Regards,
Sumanth

----- Original Message -----
From: "Matthew Jacob"
To: "sumanth vidyadhara"
Cc:
Sent: Friday, July 06, 2001 7:33 PM
Subject: Re: Locks in ethernet drivers

>
>
> On Fri, 6 Jul 2001, sumanth vidyadhara wrote:
>
> > Hi All,
> > What is the call to initialise, acquire and release a lock in the
> > ethernet drivers, since all the sample code did not implement any
> > locks? The version of FreeBSD is 4.3.
>
> In FreeBSD 4.X there is no SMP locking that drivers need to know about,
> and splN/splx has its traditional meaning.
>
> -matt
>

From owner-freebsd-smp Fri Jul 6 14:40:33 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from mass.dis.org (dhcp44-21.dis.org [216.240.44.21]) by hub.freebsd.org (Postfix) with ESMTP id 231E237B406 for ; Fri, 6 Jul 2001 14:40:31 -0700 (PDT) (envelope-from msmith@mass.dis.org)
Received: from mass.dis.org (localhost [127.0.0.1]) by mass.dis.org (8.11.4/8.11.3) with ESMTP id f66Lrsa01793; Fri, 6 Jul 2001 14:53:58 -0700 (PDT) (envelope-from msmith@mass.dis.org)
Message-Id: <200107062153.f66Lrsa01793@mass.dis.org>
X-Mailer: exmh version 2.1.1 10/15/1999
To: "sumanth vidyadhara"
Cc: freebsd-smp@FreeBSD.ORG
Subject: Re: Locks in ethernet drivers
In-reply-to: Your message of "Fri, 06 Jul 2001 20:19:48 +0530." <064701c1062a$e7eee230$c9cda8c0@sumanth>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Date: Fri, 06 Jul 2001 14:53:54 -0700
From: Mike Smith
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

> Hi Matthew,
> How does splxxx work for more than one cpu? splxxx is for that particular
> cpu, isn't it?
> It raises the priority of that particular cpu, not for all cpus.
> Then how is the critical code handled for the other cpus if they want to
> access it simultaneously?

If you know how spl() works, why are you asking this question?

And since you've already been told the answer, why are you arguing?

spl() is implemented correctly for multiple processors.  Use it.

--
... every activity meets with opposition, everyone who acts has his rivals
and unfortunately opponents also.  But not because people want to be
opponents, rather because the tasks and relationships force people to
take different points of view.  [Dr. Fritz Todt]
           V I C T O R Y   N O T   V E N G E A N C E

From owner-freebsd-smp Fri Jul 6 15:27:44 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from mail6.speakeasy.net (mail6.speakeasy.net [216.254.0.206]) by hub.freebsd.org (Postfix) with SMTP id 8902B37B401 for ; Fri, 6 Jul 2001 15:27:37 -0700 (PDT) (envelope-from jhb@FreeBSD.org)
Received: (qmail 43528 invoked from network); 6 Jul 2001 22:27:36 -0000
Received: from unknown (HELO laptop.baldwin.cx) ([64.81.54.73]) (envelope-sender ) by mail6.speakeasy.net (qmail-ldap-1.03) with SMTP for ; 6 Jul 2001 22:27:36 -0000
Message-ID:
X-Mailer: XFMail 1.4.0 on FreeBSD
X-Priority: 3 (Normal)
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
MIME-Version: 1.0
In-Reply-To: <064701c1062a$e7eee230$c9cda8c0@sumanth>
Date: Fri, 06 Jul 2001 15:27:33 -0700 (PDT)
From: John Baldwin
To: sumanth vidyadhara
Subject: Re: Locks in ethernet drivers
Cc: freebsd-smp@FreeBSD.ORG, mjacob@feral.com
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

On 06-Jul-01 sumanth vidyadhara wrote:
> Hi Matthew,
> How does splxxx work for more than one cpu? splxxx is for that particular
> cpu, isn't it?
> It raises the priority of that particular cpu, not for all cpus.
> Then how is the critical code handled for the other cpus if they want to
> access it simultaneously?

On 4.x the entire kernel is protected by a giant spinlock.

--

John Baldwin -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/

From owner-freebsd-smp Sat Jul 7 14: 9:22 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from ms39.hinet.net (ms39.hinet.net [168.95.4.39]) by hub.freebsd.org (Postfix) with ESMTP id 2727137B40E for ; Sat, 7 Jul 2001 14:08:50 -0700 (PDT) (envelope-from jonahk@ms39.hinet.net)
Received: from profe (61-216-140-183.HINET-IP.hinet.net [61.216.140.183]) by ms39.hinet.net (8.8.8/8.8.8) with SMTP id FAA00214 for ; Sun, 8 Jul 2001 05:08:43 +0800 (CST)
Message-ID: <006301c10728$f9707e70$2e53a8c0@profe>
From: "Jonah Kuo"
To:
Subject: mp is slower than sp kernel.
Date: Sun, 8 Jul 2001 05:08:30 +0800
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_NextPart_000_0060_01C1076C.073C9DC0"
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.00.2919.6700
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

This is a multi-part message in MIME format.

------=_NextPart_000_0060_01C1076C.073C9DC0
Content-Type: text/plain; charset="big5"
Content-Transfer-Encoding: quoted-printable

Hello,

If this has been answered, I'm sorry!

This is what I observed and I'm really not sure how this would happen,
but I'm sure my mp kernel was faster several months ago.
So asking you=20 professionals would be my first try, thanks!=20 =20 The following is my tail'ed output of make world for sp and mp, = respectively: bash-2.04$ cat 0707-sp.j16 >>> Rebuilding man page indices -------------------------------------------------------------- cd /usr/src/share/man; make makedb makewhatis /usr/share/man makewhatis /usr/share/perl/man rm -rf /tmp/install.11298 -------------------------------------------------------------- >>> elf make world completed on Sat Jul 7 20:11:54 CST 2001 (started Sat Jul 7 16:28:15 CST 2001) -------------------------------------------------------------- bash-2.04$ cat 0708-mp.j16 >>> Rebuilding man page indices -------------------------------------------------------------- cd /usr/src/share/man; make makedb makewhatis /usr/share/man makewhatis /usr/share/perl/man rm -rf /tmp/install.49914 -------------------------------------------------------------- >>> elf make world completed on Sun Jul 8 03:05:11 CST 2001 (started Sat Jul 7 21:10:07 CST 2001) -------------------------------------------------------------- This two 'make world' are using same copy of anything: * smp box. (pentium200-mmx * 2) asus mb. It's only a test box, nobody uses it except me. * /etc/make.conf * kernel config=20 * source, which I cvsup'ed at Jul 7 16:06=20 * I always do a 'chflags -R noschg and rm -rf /usr/obj/*' before making world. And related info is attached: bash-2.04$ cat /var/run/dmesg.boot Copyright (c) 1992-2001 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights = reserved. 
FreeBSD 4.3-STABLE #0: Sat Jul 7 20:29:24 CST 2001 root@vita:/usr/obj/usr/src/sys/VITA4 Timecounter "i8254" frequency 1193182 Hz CPU: Pentium/P55C (199.43-MHz 586-class CPU) Origin =3D "GenuineIntel" Id =3D 0x543 Stepping =3D 3 Features=3D0x8003bf real memory =3D 167772160 (163840K bytes) avail memory =3D 159690752 (155948K bytes) Programming 24 pins in IOAPIC #0 IOAPIC #0 intpin 2 -> irq 0 IOAPIC #0 intpin 18 -> irq 11 IOAPIC #0 intpin 19 -> irq 10 FreeBSD/SMP: Multiprocessor motherboard cpu0 (BSP): apic id: 0, version: 0x00030010, at 0xfee00000 cpu1 (AP): apic id: 1, version: 0x00030010, at 0xfee00000 io0 (APIC): apic id: 2, version: 0x00170011, at 0xfec00000 Preloaded elf kernel "kernel" at 0xc036b000. Preloaded elf module "linux.ko" at 0xc036b09c. Preloaded elf module "ipl.ko" at 0xc036b13c. Preloaded elf module "netgraph.ko" at 0xc036b1d8. Intel Pentium detected, installing workaround for F00F bug npx0: on motherboard npx0: INT 16 interface pcib0: on motherboard pci0: on pcib0 isab0: at device 1.0 on pci0 isa0: on isab0 atapci0: port 0xe800-0xe80f at device 1.1 = on pci0 ata0: at 0x1f0 irq 14 on atapci0 ata1: at 0x170 irq 15 on atapci0 vr0: port 0xe000-0xe07f mem = 0xe3000000-0xe3000 07f irq 11 at device 10.0 on pci0 vr0: Ethernet address: 00:80:c8:92:0c:35 miibus0: on vr0 amphy0: on miibus0 amphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto pci0: <3Dfx Voodoo Banshee graphics accelerator> at 12.0 orm0:
Hello,
 
If this has been answered, I'm=20 sorry!
 
This is what I observed and I'm = really not=20 sure how would this happen,
but I'm sure my mp kernel is = faster several=20 months ago. So asking you =
professionals would be my first = try,=20 thanks! 
 
The following is my tail'ed = output of make=20 world for sp and mp, respectively:
 
bash-2.04$ cat = 0707-sp.j16
>>>=20 Rebuilding man page=20 indices
--------------------------------------------------------------=
cd=20 /usr/src/share/man; make makedb
makewhatis = /usr/share/man
makewhatis=20 /usr/share/perl/man
rm -rf=20 /tmp/install.11298
---------------------------------------------------= -----------
>>>=20 elf make world completed on Sat Jul  7 20:11:54 CST=20 2001
           = ;            = =20 (started Sat Jul  7 16:28:15 CST=20 2001)
--------------------------------------------------------------
 
bash-2.04$ cat = 0708-mp.j16
>>>=20 Rebuilding man page=20 indices
--------------------------------------------------------------=
cd=20 /usr/src/share/man; make makedb
makewhatis = /usr/share/man
makewhatis=20 /usr/share/perl/man
rm -rf=20 /tmp/install.49914
---------------------------------------------------= -----------
>>>=20 elf make world completed on Sun Jul  8 03:05:11 CST=20 2001
           = ;            = =20 (started Sat Jul  7 21:10:07 CST=20 2001)
--------------------------------------------------------------
 
This two 'make world' are using = same copy=20 of anything:
 
 * smp box. = (pentium200-mmx * 2) asus=20 mb. It's only a test box, nobody
   uses it except=20 me.
 * = /etc/make.conf
 * kernel config =
 * source, which I = cvsup'ed at=20 Jul 7 16:06 
 * I always do a 'chflags = -R noschg=20 and rm -rf /usr/obj/*' before
   making = world.
 
 
And related info is attached:

bash-2.04$ cat /var/run/dmesg.boot
Copyright (c) 1992-2001 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.
FreeBSD 4.3-STABLE #0: Sat Jul  7 20:29:24 CST 2001
    root@vita:/usr/obj/usr/src/sys/VITA4
Timecounter "i8254"  frequency 1193182 Hz
CPU: Pentium/P55C (199.43-MHz 586-class CPU)
  Origin = "GenuineIntel"  Id = 0x543  Stepping = 3
  Features=0x8003bf<FPU,VME,DE,PSE,TSC,MSR,MCE,CX8,APIC,MMX>
real memory  = 167772160 (163840K bytes)
avail memory = 159690752 (155948K bytes)
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 -> irq 0
IOAPIC #0 intpin 18 -> irq 11
IOAPIC #0 intpin 19 -> irq 10
FreeBSD/SMP: Multiprocessor motherboard
 cpu0 (BSP): apic id:  0, version: 0x00030010, at 0xfee00000
 cpu1 (AP):  apic id:  1, version: 0x00030010, at 0xfee00000
 io0 (APIC): apic id:  2, version: 0x00170011, at 0xfec00000
Preloaded elf kernel "kernel" at 0xc036b000.
Preloaded elf module "linux.ko" at 0xc036b09c.
Preloaded elf module "ipl.ko" at 0xc036b13c.
Preloaded elf module "netgraph.ko" at 0xc036b1d8.
Intel Pentium detected, installing workaround for F00F bug
npx0: <math processor> on motherboard
npx0: INT 16 interface
pcib0: <Host to PCI bridge> on motherboard
pci0: <PCI bus> on pcib0
isab0: <Intel 82371SB PCI to ISA bridge> at device 1.0 on pci0
isa0: <ISA bus> on isab0
atapci0: <Intel PIIX3 ATA controller> port 0xe800-0xe80f at device 1.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
vr0: <VIA VT3043 Rhine I 10/100BaseTX> port 0xe000-0xe07f mem 0xe3000000-0xe300007f irq 11 at device 10.0 on pci0
vr0: Ethernet address: 00:80:c8:92:0c:35
miibus0: <MII bus> on vr0
amphy0: <DM9101 10/100 media interface> on miibus0
amphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
pci0: <3Dfx Voodoo Banshee graphics accelerator> at 12.0
orm0: <Option ROM> at iomem 0xc0000-0xc7fff on isa0
fdc0: <NEC 72065B or clone> at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: <1440-KB 3.5" drive> on fdc0 drive 0
atkbdc0: <Keyboard controller (i8042)> at port 0x60,0x64 on isa0
atkbd0: <AT Keyboard> irq 1 on atkbdc0
vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
sc0: <System console> on isa0
sc0: VGA <16 virtual consoles, flags=0x200>
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1 at port 0x2f8-0x2ff irq 3 on isa0
sio1: type 16550A
ppc0: <Parallel port> at port 0x378-0x37f irq 7 on isa0
ppc0: SMC-like chipset (ECP/EPP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/16 bytes threshold
lpt0: <Printer> on ppbus0
lpt0: Interrupt-driven port
sbc0: <Creative ViBRA16C> at port 0x220-0x22f,0x330-0x331,0x388-0x38b irq 5 drq 1,5 on isa0
pcm0: <SB16 DSP 4.13> on sbc0
APIC_IO: Testing 8254 interrupt delivery
APIC_IO: routing 8254 via IOAPIC #0 intpin 2
IPsec: Initialized Security Association Processing.
IP Filter: v3.4.16 initialized.  Default = pass all, Logging = enabled
SMP: AP CPU #1 Launched!
ad0: 32253MB <IBM-DTLA-307045> [65531/16/63] at ata0-master WDMA2
Mounting root from ufs:/dev/ad0s1a
 
bash-2.04$ mptable

===============================================================================

MPTable, version 2.0.15

-------------------------------------------------------------------------------

MP Floating Pointer Structure:

  location:                     BIOS
  physical address:             0x000f6960
  signature:                    '_MP_'
  length:                       16 bytes
  version:                      1.1
  checksum:                     0xdb
  mode:                         Virtual Wire

-------------------------------------------------------------------------------

MP Config Table Header:

  physical address:             0x000f6554
  signature:                    'PCMP'
  base table length:            244
  version:                      1.1
  checksum:                     0x42
  OEM ID:                       'OEM00000'
  Product ID:                   'PROD00000000'
  OEM table pointer:            0x00000000
  OEM table size:               0
  entry count:                  22
  local APIC address:           0xfee00000
  extended table length:        0
  extended table checksum:      0

-------------------------------------------------------------------------------

MP Config Base Table Entries:

--
Processors:     APIC ID Version State           Family  Model   Step    Flags
                 0       0x10    BSP, usable     5       4       3       0x8003bf
                 1       0x10    AP, usable      5       4       3       0x8003bf
--
Bus:            Bus ID  Type
                 0       PCI
                 1       ISA
--
I/O APICs:      APIC ID Version State           Address
                 2       0x11    usable          0xfec00000
--
I/O Ints:       Type    Polarity    Trigger     Bus ID   IRQ    APIC ID PIN#
                ExtINT   conforms    conforms        1     0          2    0
                INT      conforms    conforms        1     1          2    1
                INT      conforms    conforms        1     0          2    2
                INT      conforms    conforms        1     3          2    3
                INT      conforms    conforms        1     4          2    4
                INT      conforms    conforms        1     5          2    5
                INT      conforms    conforms        1     6          2    6
                INT      conforms    conforms        1     7          2    7
                INT      conforms    conforms        1     8          2    8
                INT      conforms    conforms        1     9          2    9
                INT      conforms    conforms        1    12          2   12
                INT      conforms    conforms        1    14          2   14
                INT      conforms    conforms        1    15          2   15
                INT     active-lo       level        1    11          2   18
                INT     active-lo       level        1    10          2   19
--
Local Ints:     Type    Polarity    Trigger     Bus ID   IRQ    APIC ID PIN#
                ExtINT  active-hi        edge        1     0        255    0
                NMI     active-hi        edge        1     0        255    1

===============================================================================
From owner-freebsd-smp Sat Jul 7 14:36:59 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from sneakerz.org (sneakerz.org [216.33.66.254]) by hub.freebsd.org (Postfix) with ESMTP id 915F437B406 for ; Sat, 7 Jul 2001 14:36:56 -0700 (PDT) (envelope-from bright@sneakerz.org)
Received: by sneakerz.org (Postfix, from userid 1092) id 07D235D01F; Sat, 7 Jul 2001 16:36:46 -0500 (CDT)
Date: Sat, 7 Jul 2001 16:36:45 -0500
From: Alfred Perlstein
To: Jonah Kuo
Cc: freebsd-smp@freebsd.org
Subject: Re: mp is slower than sp kernel.
Message-ID: <20010707163645.B88962@sneakerz.org>
References: <006301c10728$f9707e70$2e53a8c0@profe>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2i
In-Reply-To: <006301c10728$f9707e70$2e53a8c0@profe>; from jonahk@ms39.hinet.net on Sun, Jul 08, 2001 at 05:08:30AM +0800
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

* Jonah Kuo [010707 16:09] wrote:
> Hello,
>
> If this has been answered, I'm sorry!
>
> This is what I observed and I'm really not sure how would this happen,
> but I'm sure my mp kernel is faster several months ago. So asking you
> professionals would be my first try, thanks!
>
> The following is my tail'ed output of make world for sp and mp, respectively:
...
>
> These two 'make world' runs use the same copy of everything:
>
>  * smp box. (pentium200-mmx * 2) asus mb. It's only a test box, nobody
>    uses it except me.
>  * /etc/make.conf
>  * kernel config
>  * source, which I cvsup'ed at Jul 7 16:06
>  * I always do a 'chflags -R noschg and rm -rf /usr/obj/*' before
>    making world.

How do you expect to gain more performance when running "make world"
unless you tell make to run more than one job in parallel?

Try "make -jN world", where N is some number from 4 to 12 or so;
you should see the improvement.

As for the slowdown in non-parallel compiles: managing two processors
is more expensive than the single-processor model, so there's some
overhead to take care of.

-Alfred

From owner-freebsd-smp Sat Jul 7 15:50:27 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from ms39.hinet.net (ms39.hinet.net [168.95.4.39]) by hub.freebsd.org (Postfix) with ESMTP id A9CF237B409 for ; Sat, 7 Jul 2001 15:50:24 -0700 (PDT) (envelope-from jonahk@ms39.hinet.net)
Received: from profe (61-216-141-140.HINET-IP.hinet.net [61.216.141.140]) by ms39.hinet.net (8.8.8/8.8.8) with SMTP id GAA10150; Sun, 8 Jul 2001 06:50:20 +0800 (CST)
Message-ID: <000c01c10737$2b9e12a0$2e53a8c0@profe>
From: "Jonah Kuo"
To: "Alfred Perlstein"
Cc:
References: <006301c10728$f9707e70$2e53a8c0@profe> <20010707163645.B88962@sneakerz.org>
Subject: Re: mp is slower than sp kernel.
Date: Sun, 8 Jul 2001 06:50:07 +0800
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.00.2919.6700
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2919.6700
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

----- Original Message -----
From: "Alfred Perlstein"
To: "Jonah Kuo"
Cc:
Sent: Sunday, July 08, 2001 5:36 AM
Subject: Re: mp is slower than sp kernel.

> How do you expect to gain more performance when running "make world"
> unless you tell make to run more than one job in parallel?
>
> Try doing "make -jN world" where N is some number from 4 to 12
> or so, you should see the improvement.
>

Yes, of course I did. I use '-j16'; that is what the suffix '.j16' stands for.

BTW, I should remove 'kernel' from my subject; I should say "mp is slower than sp".

From owner-freebsd-smp Sat Jul 7 19:34:34 2001
Delivered-To: freebsd-smp@freebsd.org
Received: from sneakerz.org (sneakerz.org [216.33.66.254]) by hub.freebsd.org (Postfix) with ESMTP id 7FECF37B401; Sat, 7 Jul 2001 19:34:32 -0700 (PDT) (envelope-from bright@sneakerz.org)
Received: by sneakerz.org (Postfix, from userid 1092) id E73D45D01F; Sat, 7 Jul 2001 21:34:31 -0500 (CDT)
Date: Sat, 7 Jul 2001 21:34:31 -0500
From: Alfred Perlstein
To: smp@freebsd.org
Cc: jhb@freebsd.org, jake@freebsd.org
Subject: trapsignal+ktrace looks broken.
Message-ID: <20010707213431.J88962@sneakerz.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2i
Sender: owner-freebsd-smp@FreeBSD.ORG
Precedence: bulk
List-ID:
List-Archive: (Web Archive)
List-Help: (List Instructions)
List-Subscribe:
List-Unsubscribe:
X-Loop: FreeBSD.org

In kern/kern_sig.c:trapsignal(), there's this:

{
	register struct sigacts *ps = p->p_sigacts;

	PROC_LOCK(p);
	if ((p->p_flag & P_TRACED) == 0 && SIGISMEMBER(p->p_sigcatch, sig) &&
	    !SIGISMEMBER(p->p_sigmask, sig)) {
		p->p_stats->p_ru.ru_nsignals++;
#ifdef KTRACE
		if (KTRPOINT(p, KTR_PSIG))
			ktrpsig(p->p_tracep, sig, ps->ps_sigact[_SIG_IDX(sig)],
				&p->p_sigmask, code);
#endif
		PROC_UNLOCK(p);		/* XXX ??? */
		(*p->p_sysent->sv_sendsig)(ps->ps_sigact[_SIG_IDX(sig)], sig,

Anyhow, ktrpsig() does some I/O, which I'm quite sure may result in a
sleep while the proc lock is still held. I'm also quite certain that
would result in a panic with witness.

-- 
-Alfred Perlstein [alfred@freebsd.org]
Ok, who wrote this damn function called '??'?
And why do my programs keep crashing in it?
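
The concern raised about trapsignal() above (calling something that may sleep while a lock is held) can be illustrated in userland. What follows is only a sketch under stated assumptions, not the FreeBSD code and not a proposed kernel fix: a pthread mutex stands in for PROC_LOCK/PROC_UNLOCK, a stdio write stands in for the I/O done by ktrpsig(), and every name beginning with demo_ is hypothetical. The idea shown is to copy what the trace record needs while the lock is held, drop the lock, and only then do the blocking I/O.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical stand-in for the per-process state guarded by the proc lock. */
struct demo_proc {
	pthread_mutex_t lock;     /* plays the role of PROC_LOCK/PROC_UNLOCK */
	int             tracing;  /* plays the role of KTRPOINT(p, KTR_PSIG) */
	unsigned long   nsignals; /* plays the role of ru_nsignals */
	unsigned long   sigmask;  /* plays the role of p_sigmask */
};

/* Snapshot of the fields the trace record needs, taken under the lock. */
struct demo_trace_rec {
	int           sig;
	unsigned long mask;
};

/*
 * Blocking I/O (assumption: a stdio write models the trace write).
 * Must be called with no lock held.  Returns 0 on success, -1 on error.
 */
static int
demo_write_trace(FILE *out, const struct demo_trace_rec *rec)
{
	if (fprintf(out, "psig sig=%d mask=%#lx\n", rec->sig, rec->mask) < 0)
		return (-1);
	return (fflush(out) == 0 ? 0 : -1);
}

/*
 * Update state and snapshot the trace data under the lock, then release
 * the lock before doing anything that may block.
 * Returns 1 if a record was written, 0 if tracing is off, -1 on I/O error.
 */
static int
demo_trapsignal(struct demo_proc *p, int sig, FILE *out)
{
	struct demo_trace_rec rec;
	int want_trace;

	assert(p != NULL && out != NULL);

	pthread_mutex_lock(&p->lock);
	p->nsignals++;
	want_trace = p->tracing;
	rec.sig = sig;
	rec.mask = p->sigmask;
	pthread_mutex_unlock(&p->lock);

	/* The lock is no longer held here, so blocking is harmless. */
	if (!want_trace)
		return (0);
	return (demo_write_trace(out, &rec) == 0 ? 1 : -1);
}
```

A caller would initialize the structure with PTHREAD_MUTEX_INITIALIZER and pass any open FILE pointer, for example demo_trapsignal(&p, 11, stderr). Whether a snapshot like this is acceptable in the real kernel path depends on what ktrpsig() needs to stay consistent, which this sketch does not attempt to answer.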