From owner-freebsd-arch Tue Jan 1 01:09:29 2002
Date: Tue, 1 Jan 2002 10:08:23 +0100
From: Bernd Walter
To: Michal Mertl
Cc: Matthew Dillon, arch@FreeBSD.ORG
Subject: When to use atomic_ functions? (was: 64 bit counters)
Message-ID: <20020101100822.B96092@cicely8.cicely.de>
References: <200112292016.fBTKGWR01735@apollo.backplane.com>

On Sun, Dec 30, 2001 at 01:48:07AM +0100, Michal Mertl wrote:
> On Sat, 29 Dec 2001, Matthew Dillon wrote:
>
> > :You can use cmpxchg8b on SMP systems (it's available on all machines that
> > :support SMP I think) and use non-SMP versions otherwise where needed. You
> > :would just implement the atomic_foo_64 versions this way.
> > :You would need to use cmpxchg8b instead of addl/adcl for the acq and
> > :rel variants for SMP.
> >
> >     This seems quite reasonable to me.
> >
>
> Yes. I wrote the atomic functions (set, add, get) with cmpxchg8b. I also
> measured the performance and here are the results (100 mil additions on
> pII 366):
>
> default 32 bit implementation took 1.25821 secs
> atomic 32 bit implementation (from ) took 1.74043 secs
> default 64 bit implementation took 2.226189 secs
> atomic 64 bit implementation took 5.205156 secs

atomic(9) says in its description that these functions can be used as a
synchronisation primitive, which is why at least the Alpha implementation
includes memory barriers (MBs).
MBs are not needed for the variable itself, but they are making this
family of functions very expensive.
It's not very wise to handle counters with atomic_ functions unless
the need to have MBs in them is removed.

--
B.Walter              COSMO-Project         http://www.cosmo-project.de
ticso@cicely.de         Usergroup           info@cosmo-project.de

To Unsubscribe: send mail to majordomo@FreeBSD.org
with "unsubscribe freebsd-arch" in the body of the message

From owner-freebsd-arch Tue Jan 1 01:55:32 2002
To: Steve Price
Cc: Murray Stokely, John Baldwin, Alfred Perlstein, arch@FreeBSD.ORG, re@FreeBSD.ORG
Subject: Re: xfree4 by default?
In-Reply-To: Message from Steve Price of "Mon, 31 Dec 2001 18:39:28 CST."
 <20011231183928.F37696@bsd.havk.org>
Date: Tue, 01 Jan 2002 01:55:27 -0800
Message-ID: <78699.1009878927@winston.freebsd.org>
From: Jordan Hubbard

It wouldn't be that hard in sysinstall either, depending on how the X
bits are packaged. FWIW, I also think that XFree86 4.x's time has come.

- Jordan

> On Mon, Dec 31, 2001 at 04:22:22PM -0800, Murray Stokely wrote:
> >
> > I agree that we should make the switch in -STABLE immediately after
> > FreeBSD 4.5 is released. Every other x86 Unix I've been exposed to
> > has been using X4 for over a year. The arguments about older
> > supported hardware do not hold water any more since X4 supports so
> > many newer chipsets out of the box that X336 can't handle.
>
> All of the recent package builds have been with XFREE86_VERSION=4.
> I'm not sure how long for sure but for as long as I can remember
> which isn't saying much. Making it the default shouldn't be all
> that difficult in bsd.port.mk. Don't know how hard it would be
> for sysinstall and friends.
>
> -steve

From owner-freebsd-arch Tue Jan 1 03:47:38 2002
From: "Andrew Reilly"
Date: Tue, 1 Jan 2002 22:47:34 +1100
To: Alfred Perlstein
Cc: John Baldwin, Julian Elischer, arch@FreeBSD.ORG, Steve Kargl
Subject: Re: Kernel Thread scheduler
Message-ID: <20020101224733.A25053@gurney.reilly.home>
References: <20011122012838.V13393@elvis.mu.org> <20011122014109.W13393@elvis.mu.org>
In-Reply-To: <20011122014109.W13393@elvis.mu.org>; from bright@mu.org on Thu, Nov 22, 2001 at 01:41:09AM -0600

On Thu, Nov 22, 2001 at 01:41:09AM -0600, Alfred Perlstein wrote:
> Why do we even care? When was the last time wine was good for
> anything besides barely being able to run solitaire on FreeBSD
> anyhow?

I know that this is a long dead thread, but I'm on holidays now,
and have time to catch up with the FreeBSD lists...

Wine is actually very useful (to me) and has been for several
years. I run two applications under it that there aren't other
alternatives for (besides firing up a windows box):

It runs the Motorola DSP assembler, tools and simulator beautifully.
A simple shell wrapper makes asm56300 work better under Berkeley make
under gvim or xemacs than it probably ever has under windows.

It runs (mostly) the Lotus Notes client (there seem to be some
thread-related hangs, and don't hit an HTTP link, because it will go
looking for internet explorer).

They're important for me. Word documents I care much less about,
and StarOffice and AbiWord both deal with them, with varying degrees
of success.

--
Andrew

From owner-freebsd-arch Tue Jan 1 05:36:27 2002
Message-ID: <3C31BB4D.8F580A4D@mindspring.com>
Date: Tue, 01 Jan 2002 05:36:13 -0800
From: Terry Lambert
To: Andrew Reilly
Cc: Alfred Perlstein, John Baldwin, Julian Elischer, arch@FreeBSD.ORG, Steve Kargl
Subject: Re: Kernel Thread scheduler
References: <20011122012838.V13393@elvis.mu.org> <20011122014109.W13393@elvis.mu.org> <20020101224733.A25053@gurney.reilly.home>

Andrew Reilly wrote:
> On Thu, Nov 22, 2001 at 01:41:09AM -0600, Alfred Perlstein wrote:
> > Why do we even care? When was the last time wine was good for
> > anything besides barely being able to run solitaire on FreeBSD
> > anyhow?
>
> I know that this is a long dead thread, but I'm on holidays now,
> and have time to catch up with the FreeBSD lists...
>
> Wine is actually very useful (to me) and has been for several
> years. I run two applications under it that there aren't other
> alternatives for (besides firing up a windows box):

Two comments:

1) The TSS switch can be lazy-bound, and so a TSS is not needed
   per process (setting a TSS per process would have the
   unfortunate effect of limiting you to 1023 processes...
   1024 - 1 for the kernel). This is why Linux removed the
   TSS dependency on context switches, and why FreeBSD never
   had it (limit of one 4k page of 32 bit entries = 1024).
   FreeBSD WINE support has historically lazy-bound this.

2) WINE has a terrible historical track record overall; I
   personally blame this on their "percentage entry point
   coverage" calculation, which always fails on the Nth call,
   which is unimplemented, and then claims ((N-1)/N *100) as
   the percentage of interface coverage when it's a complete
   statistical lie.

The real question is where is the TSS going to be lazy-bound;
I think it's still OK to do it at the process level, and can't
see an application for threads (any VM86 calls could be thunked
to a single TSS in the kernel for use for VM86, and it's really
not necessary with any modern processors, which have an IRQ
fast path in hardware, anyway, even if it is a poorly documented
one. See The MindShare Protected Mode book).

-- Terry

From owner-freebsd-arch Tue Jan 1 08:46:07 2002
To: Wilko Bulte
Cc: Murray Stokely, John Baldwin, Alfred Perlstein, arch@FreeBSD.ORG, re@FreeBSD.ORG
Subject: Re: xfree4 by default?
References: <20011231174113.O16101@elvis.mu.org> <20011231162222.V2286@windriver.com> <20020101013240.B4434@freebie.xs4all.nl>
From: Rolf Neugebauer
Date: 01 Jan 2002 16:45:53 +0000
In-Reply-To: <20020101013240.B4434@freebie.xs4all.nl>

Wilko Bulte writes:

>
> Does anybody out there know how well X4 works on the Alpha?
>

It works very smoothly with an MGA 2 card in a 164LX (built from ports
half a year ago). Much better, in fact, than XF3. No more occasional
freezes as I noticed with XF3 (this was also a problem under Linux).

I don't know about other cards but some Alpha/Linux folks were/are
fairly active in getting/keeping XF4 working nicely on alpha.

Rolf

From owner-freebsd-arch Tue Jan 1 10:15:55 2002
Date: Tue, 1 Jan 2002 13:15:49 -0500 (EST)
From: Robert Watson
To: John Baldwin
Cc: Alfred Perlstein, arch@FreeBSD.org, re@FreeBSD.org
Subject: RE: xfree4 by default?

On Mon, 31 Dec 2001, John Baldwin wrote:

>
> On 31-Dec-01 Alfred Perlstein wrote:
> > What is the procedure one must follow to present xfree4 as the new
> > default? It's been around for a long time and supports a LOT more
> > chipsets a lot better. Can we go ahead and just pull some switch
> > or are there more sinister issues involved?
>
> The last time I and others brought this up to Jordan, the tentative plan
> was to switch for 5.0. At this point in the game it is probably a bit
> late for 4.5, however, if you want to head up an effort to get test port
> builds done after the release with 4 as the default and get people to
> test it out then perhaps a switch could be done for 4.6.

When I raised the switch last, I was told we needed more compatibility
glue to make sure libraries were installed appropriately for some
applications, and to handle ports dependencies correctly.
Ports probably needs to be taught to have applications depend either on
X11 generally (not version-specific), or on appropriate X11 libs by
version, not the entire X implementation. That way applications that
could run against both sets of .so's would use whatever was installed,
and those requiring specific libraries could get them as dependencies
without pulling in (say) the old xterm over the default one.

Robert N M Watson             FreeBSD Core Team, TrustedBSD Project
robert@fledge.watson.org      NAI Labs, Safeport Network Services

From owner-freebsd-arch Tue Jan 1 12:40:11 2002
Message-Id: <200201012048.g01KmYr01192@mass.dis.org>
To: Bernd Walter
Cc: Michal Mertl, Matthew Dillon, arch@FreeBSD.ORG
Subject: Re: When to use atomic_ functions? (was: 64 bit counters)
In-reply-to: Your message of "Tue, 01 Jan 2002 10:08:23 +0100." <20020101100822.B96092@cicely8.cicely.de>
Date: Tue, 01 Jan 2002 12:48:34 -0800
From: Mike Smith

> MBs are not needed for the variable itself, but they are making this
> family of functions very expensive.
> It's not very wise to handle counters with atomic_ functions unless
> the need to have MBs in them is removed.

It's imperative to use atomic operations for counters on SMP systems.

--
...
every activity meets with opposition, everyone who acts has his rivals
and unfortunately opponents also. But not because people want to be
opponents, rather because the tasks and relationships force people to
take different points of view. [Dr. Fritz Todt]
V I C T O R Y   N O T   V E N G E A N C E

From owner-freebsd-arch Tue Jan 1 13:01:35 2002
Date: Wed, 2 Jan 2002 08:01:23 +1100 (EST)
From: Bruce Evans
To: Mike Smith
Cc: Bernd Walter, Michal Mertl, Matthew Dillon
Subject: Re: When to use atomic_ functions? (was: 64 bit counters)
In-Reply-To: <200201012048.g01KmYr01192@mass.dis.org>
Message-ID: <20020102075650.L11121-100000@gamplex.bde.org>

On Tue, 1 Jan 2002, Mike Smith wrote:

> > MBs are not needed for the variable itself, but they are making this
> > family of functions very expensive.
> > It's not very wise to handle counters with atomic_ functions unless
> > the need to have MBs in them is removed.
>
> It's imperative to use atomic operations for counters on SMP systems.

Not true. Atomic operations for counters are not needed on SMP systems
in at least the following cases:
- if there is a lock that prevents other processes from accessing the
  counter
- if the counters are per-CPU. See previous mail by someone named msmith.

Bruce

From owner-freebsd-arch Tue Jan 1 13:07:53 2002
Date: Wed, 2 Jan 2002 08:07:18 +1100 (EST)
From: Bruce Evans
To: John Baldwin
Cc: Michal Mertl, Matthew Dillon, Alfred Perlstein
Subject: Re: 64 bit counters
Message-ID: <20020102080550.B11165-100000@gamplex.bde.org>

On Mon, 31 Dec 2001, John Baldwin wrote:

> On 31-Dec-01 Michal Mertl wrote:
> > I've almost finished the changes to implement interface counters with
> > atomic_t which is either 32 bit or 64 bit. I'll finish it anyway. It can
> > at least be converted to atomic_add_int (instead of my WIP name
> > atomic_add_max and friends) and it will help someone later to be able to
> > get rid of giant protection. I may be wrong even with this - well if
> > that's going to be the case and otherwise there will be no use for my
> > changes, they can always be forgotten. I could take it as practice lesson
> > on hacking kernel sources :-).
>
> I really need to fix the atomic(9) API to use sizeof() instead of explicit
> types like Bruce wants. :)

This will be easier when there is only one size :-).

Bruce

From owner-freebsd-arch Tue Jan 1 13:27:34 2002
Date: Tue, 1 Jan 2002 16:24:47 -0500
From: Mike Barcroft
To: Bruce Evans
Cc: Peter Pentchev, Mike Smith, arch@FreeBSD.ORG
Subject: Re: kldload(2) family (was Re: loadable aio)
Message-ID: <20020101162447.A64468@espresso.q9media.com>
References: <20011231043633.E45114@espresso.q9media.com> <20020101025947.L6989-100000@gamplex.bde.org>
In-Reply-To: <20020101025947.L6989-100000@gamplex.bde.org>; from bde@zeta.org.au on Tue, Jan 01, 2002 at 03:22:40AM +1100

Bruce Evans writes:
> There was some discussion of the brokenness in followup to this commit.
> Relative paths like "foo.ko" should work in the Unix way IMO (the "./"
> in "./foo.ko" was apparently a failed attempt to prevent searching
> like it would in the shell). The syscall should not do any searching,
> especially not in a misdesigned undocumented insecure search path with
> DOS-style separators.

Right. I think the searching problem inside the kernel could be
conquered by requiring that the destination of the kernel modules be
known at compile-time.

> > Here's how I would design this interface:
> > o _kldload(2) accepts a file path (similar to open(2))
> > o kldunload(2) accepts a filename (no path)
> > o kldload(3) accepts a module name (procfs), file name (procfs.ko), or
> >   file path (/boot/kernel/procfs.ko).
> > o Search paths for kldload(3) are controlled by the environment
> >   variable `KLDPATH' (similar to MANPATH and PATH).
> > o When kldload(3) locates a module file, it calls _kldload(2).
> > o kldload(8) uses kldload(3)
> > o kldunload(8) uses kldunload(2)
> >
> > The main advantage of this design is that it allows a Unix programmer
> > to utilize it. :)
>
> I would intentionally leave out kldload(3). There is a problem converting
> relative names to absolute ones for auto-loading things like filesystems
> in the kernel, but there is no problem in userland -- the modules are
> wherever you installed or built them.

I was concerned that some people might want compatibility with the
current behavior, but that doesn't seem to be the case (from this and
other private e-mails). The interface becomes much simpler without
kldload(3).

o kldload(2) accepts a file path (similar to open(2))
o kldunload(2) accepts a filename (no path, eg: "procfs.ko")
o kldload(8) uses kldload(2)
o kldunload(8) uses kldunload(2) (this could do extra string
  manipulation for compatibility with `kldunload procfs').
o Module dependencies are handled by explicitly hardcoding the path to
  the module directory at compile-time.

The only problem I see with hardcoding the path is when loading
modules in a chroot() environment. For example:

%%%
chroot("/boot");
chdir("/");
kldload("kernel/procfs.ko");
%%%

Obviously, the pseudofs.ko dependency would fail because
`/boot/kernel/pseudofs.ko' doesn't exist, but this is a common problem
in chroot() and jail() environments and is usually overcome by
recreating the directory hierarchy.
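
To make the chroot() case above concrete, here is a small userland
sketch in C. It is only an illustration under stated assumptions:
KLD_MODULE_DIR and dep_path_outside_chroot() are hypothetical names,
not part of any kld interface, and the sketch does nothing but compute
which real path would have to exist for a hardcoded dependency lookup
to succeed inside a chroot.

```c
#include <stdio.h>
#include <string.h>

/* Assumed compile-time module directory (hypothetical name). */
#define KLD_MODULE_DIR "/boot/kernel"

/*
 * Build the path, as seen from outside a chroot rooted at `root`, at
 * which a dependency named `module` would be looked up if the module
 * directory were hardcoded. Returns 0 on success, -1 if the result
 * does not fit in `out`.
 */
static int dep_path_outside_chroot(const char *root, const char *module,
                                   char *out, size_t outlen)
{
    /* Avoid a doubled slash when the root is "/" (no chroot at all). */
    const char *prefix = (strcmp(root, "/") == 0) ? "" : root;
    int n = snprintf(out, outlen, "%s%s/%s", prefix, KLD_MODULE_DIR, module);

    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With root "/boot" and module "pseudofs.ko" this yields
/boot/boot/kernel/pseudofs.ko, which is the directory hierarchy that
would have to be recreated for the dependency to resolve.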

Best regards,
Mike Barcroft

From owner-freebsd-arch Tue Jan 1 13:30:41 2002
Date: Tue, 1 Jan 2002 22:30:33 +0100
From: Bernd Walter
To: Mike Smith
Cc: Bernd Walter, Michal Mertl, Matthew Dillon, arch@FreeBSD.ORG
Subject: Re: When to use atomic_ functions?
 (was: 64 bit counters)
Message-ID: <20020101213033.GB9899@cicely9.cicely.de>
References: <20020101100822.B96092@cicely8.cicely.de> <200201012048.g01KmYr01192@mass.dis.org>
In-Reply-To: <200201012048.g01KmYr01192@mass.dis.org>

On Tue, Jan 01, 2002 at 12:48:34PM -0800, Mike Smith wrote:
> > MBs are not needed for the variable itself, but they are making this
> > family of functions very expensive.
> > It's not very wise to handle counters with atomic_ functions unless
> > the need to have MBs in them is removed.
>
> It's imperative to use atomic operations for counters on SMP systems.

But there is absolutely no need for MBs just to handle counters.
If they are used for counters they shouldn't be used for synchronisation.
The current situation is quite irritating and I absolutely don't know
whether the MBs are wrong or the use of the functions to handle counters is.

--
B.Walter              COSMO-Project         http://www.cosmo-project.de
ticso@cicely.de         Usergroup           info@cosmo-project.de

From owner-freebsd-arch Tue Jan 1 13:41:09 2002
Message-Id: <200201012149.g01Lnhr01849@mass.dis.org>
To: Mike Barcroft
Cc: Bruce Evans, Peter Pentchev, Mike Smith, arch@FreeBSD.ORG
Subject: Re: kldload(2) family (was Re: loadable aio)
In-reply-to: Your message of "Tue, 01 Jan 2002 16:24:47 EST." <20020101162447.A64468@espresso.q9media.com>
Date: Tue, 01 Jan 2002 13:49:42 -0800
From: Mike Smith

>
> Right. I think the searching problem inside the kernel could be
> conquered by requiring that the destination of the kernel modules be
> known at compile-time.

This is absurd. It's (only) known at runtime. The kernel directory can
be (and is) moved arbitrarily. It has to be done inside the kernel to
allow demand-loading by the kernel.

> I was concerned that some people might want compatibility with the
> current behavior, but that doesn't seem to be the case (from this and
> other private e-mails). The interface becomes much simpler without
> kldload(3).
>
> o kldload(2) accepts a file path (similar to open(2))
> o kldunload(2) accepts a filename (no path, eg: "procfs.ko")
> o kldload(8) uses kldload(2)
> o kldunload(8) uses kldunload(2) (this could do extra string
>   manipulation for compatibility with `kldunload procfs').
> o Module dependencies are handled by explicitly hardcoding the path to
>   the module directory at compile-time.

This demonstrates a basic misunderstanding of how the module system
works; you have not grasped the distinction between "module" and "file"
at all.

The interface, from a userland perspective, is driven by three basic
requirements:

1) Load a named module.
2) Load a named file.
3) Unload a named module.

1) needs to search a database (in this case, built by kldxref) of known
modules to locate the file containing the module. It has a fallback
behaviour of assuming that the module 'foo' is found in the file
'foo.ko' somewhere on the module searchpath.

2) takes a filename argument. As a convenience to the programmer, it
will append the '.ko' suffix, and search the system default path for
the file if an explicit pathname is not given.

3) is obvious, but misimplemented.

There are a lot of problems with the current implementation, but you
are not addressing them in this thread. Don't butcher what is actually
quite a clean and functional design just because you don't understand
it; try fixing some of the really major conceptual problems, not
dumbing it down to the level of the implementation.

--
... every activity meets with opposition, everyone who acts has his rivals
and unfortunately opponents also. But not because people want to be
opponents, rather because the tasks and relationships force people to
take different points of view. [Dr.
Fritz Todt]
V I C T O R Y   N O T   V E N G E A N C E

From owner-freebsd-arch Tue Jan 1 13:53:17 2002
Message-Id: <200201012201.g01M1nr02027@mass.dis.org>
To: Bernd Walter
Cc: arch@FreeBSD.ORG
Subject: Re: When to use atomic_ functions? (was: 64 bit counters)
In-reply-to: Your message of "Tue, 01 Jan 2002 22:30:33 +0100." <20020101213033.GB9899@cicely9.cicely.de>
Date: Tue, 01 Jan 2002 14:01:49 -0800
From: Mike Smith

> On Tue, Jan 01, 2002 at 12:48:34PM -0800, Mike Smith wrote:
> > > MBs are not needed for the variable itself, but they are making this
> > > family of functions very expensive.
> > > It's not very wise to handle counters with atomic_ functions unless
> > > the need to have MBs in them is removed.
> >
> > It's imperative to use atomic operations for counters on SMP systems.
>
> But there is absolutely no need for MBs just to handle counters.

So fix the atomic functions on the Alpha so that the mb's are only
present when the cache-related semantics (acquire/release) are specified.

--
... every activity meets with opposition, everyone who acts has his rivals
and unfortunately opponents also.
But not because people want to be
opponents, rather because the tasks and relationships force people to
take different points of view. [Dr. Fritz Todt]
V I C T O R Y   N O T   V E N G E A N C E

From owner-freebsd-arch Tue Jan 1 14:34:30 2002
Date: Tue, 1 Jan 2002 23:17:52 +0100
From: Bernd Walter
To: Mike Smith
Cc: arch@FreeBSD.ORG
Subject: Re: When to use atomic_ functions?
(was: 64 bit counters) Message-ID: <20020101221752.GA10614@cicely9.cicely.de> References: <20020101213033.GB9899@cicely9.cicely.de> <200201012201.g01M1nr02027@mass.dis.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200201012201.g01M1nr02027@mass.dis.org> User-Agent: Mutt/1.3.24i X-Operating-System: FreeBSD cicely9.cicely.de 5.0-CURRENT alpha Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Tue, Jan 01, 2002 at 02:01:49PM -0800, Mike Smith wrote: > > On Tue, Jan 01, 2002 at 12:48:34PM -0800, Mike Smith wrote: > > > > MBs are not needed for the variable itself, but they are making this > > > > family of functions very expensive. > > > > It's not very wise to handle counters with atomic_ functions unless > > > > the need to have MBs in them is not removed. > > > > > > It's imperative to use atomic operations for counters on SMP systems. > > > > But there is absolutely no need for MBs just to handle counters. > > So fix the atomic functions on the Alpha so that the mb's are only > present when the cache-related semantics (acquire/release) are specified. Ah - that makes sense now. Thanks. 
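The distinction drawn above, plain atomic updates for counters versus acquire/release variants that also order the surrounding memory accesses, can be sketched in portable C11. This is an illustration only: it uses <stdatomic.h> rather than the kernel's atomic(9) functions, and the names pkt_count, count_packet, read_count, publish and consume are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

static _Atomic uint32_t pkt_count;  /* hypothetical statistics counter */
static _Atomic int ready;           /* flag that publishes 'payload' */
static int payload;                 /* ordinary data guarded by 'ready' */

/* Counter update: must be atomic, but needs no ordering against other
 * memory, so no barrier is requested (memory_order_relaxed). */
void count_packet(void)
{
    atomic_fetch_add_explicit(&pkt_count, 1, memory_order_relaxed);
}

uint32_t read_count(void)
{
    return atomic_load_explicit(&pkt_count, memory_order_relaxed);
}

/* Synchronisation: the release store makes the earlier write to
 * 'payload' visible to whoever observes ready == 1 with an acquire
 * load. This is the case that needs a barrier on weakly ordered CPUs. */
void publish(int v)
{
    assert(v >= 0);                 /* -1 is reserved for "not ready" */
    payload = v;
    atomic_store_explicit(&ready, 1, memory_order_release);
}

/* Returns the published value, or -1 if nothing was published yet. */
int consume(void)
{
    if (atomic_load_explicit(&ready, memory_order_acquire))
        return payload;
    return -1;
}
```

Only publish() and consume() ask for ordering; the counter path stays cheap on an architecture where barriers are expensive.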
-- B.Walter COSMO-Project http://www.cosmo-project.de ticso@cicely.de Usergroup info@cosmo-project.de To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Jan 1 15:49:29 2002 Delivered-To: freebsd-arch@freebsd.org Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by hub.freebsd.org (Postfix) with ESMTP id 99E3E37B42A; Tue, 1 Jan 2002 15:49:27 -0800 (PST) Received: (from dillon@localhost) by apollo.backplane.com (8.11.6/8.9.1) id g01NnKA40071; Tue, 1 Jan 2002 15:49:20 -0800 (PST) (envelope-from dillon) Date: Tue, 1 Jan 2002 15:49:20 -0800 (PST) From: Matthew Dillon Message-Id: <200201012349.g01NnKA40071@apollo.backplane.com> To: Bruce Evans Cc: Mike Smith , Bernd Walter , Michal Mertl , Subject: Re: When to use atomic_ functions? (was: 64 bit counters) References: <20020102075650.L11121-100000@gamplex.bde.org> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG :On Tue, 1 Jan 2002, Mike Smith wrote: : :> > MBs are not needed for the variable itself, but they are making this :> > family of functions very expensive. :> > It's not very wise to handle counters with atomic_ functions unless :> > the need to have MBs in them is not removed. :> :> It's imperative to use atomic operations for counters on SMP systems. : :Not true. Atomic operations for counters are not needed on SMP systems :in at least the following cases: :- if there is a lock that prevents other processes from accessing the : counter :- if the counters are per-CPU. See previous mail by someone named msmith. : :Bruce Well, I'm not sure how I got on the Cc list but I agree with Bruce on this one. An SMP-synchronized counter increment is a ridiculous waste of time. They should be per-cpu and then we don't care *how* wide the counters are. 
Having programs like netstat, or our sysctl mechanism, aggregate the count values is easy. -Matt Matthew Dillon To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Jan 1 16:23: 9 2002 Delivered-To: freebsd-arch@freebsd.org Received: from espresso.q9media.com (espresso.q9media.com [216.254.138.122]) by hub.freebsd.org (Postfix) with ESMTP id E506D37B41E; Tue, 1 Jan 2002 16:23:01 -0800 (PST) Received: (from mike@localhost) by espresso.q9media.com (8.11.6/8.11.6) id g020KgT65555; Tue, 1 Jan 2002 19:20:42 -0500 (EST) (envelope-from mike) Date: Tue, 1 Jan 2002 19:20:42 -0500 From: Mike Barcroft To: Mike Smith Cc: Bruce Evans , Peter Pentchev , arch@freebsd.org Subject: Re: kldload(2) family (was Re: loadable aio) Message-ID: <20020101192042.C64468@espresso.q9media.com> References: <20020101162447.A64468@espresso.q9media.com> <200201012149.g01Lnhr01849@mass.dis.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200201012149.g01Lnhr01849@mass.dis.org>; from msmith@freebsd.org on Tue, Jan 01, 2002 at 01:49:42PM -0800 Organization: The FreeBSD Project Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Mike Smith writes: > > Right. I think the searching problem inside the kernel could be > > conquered by requiring that the destination of the kernel modules be > > known at compile-time. > > This is absurd. It's (only) known at runtime. The kernel directory can > be (and is) moved aribitrarily. It has to be done inside the kernel to > allow demand-loading by the kernel. Can you give an example? Remember, my proposal is only for module dependencies that need to search for a file. 
> > I was concerned that some people might want compatibility with the > > current behavior, but that doesn't seem to be the case (from this and > > other private e-mails). The interface becomes much simpler without > > kldload(3). > > > > o kldload(2) accepts a file path (similar to open(2)) > > o kldunload(2) accepts a filename (no path, eg: "procfs.ko") > > o kldload(8) uses kldload(2) > > o kldunload(8) uses kldunload(2) (this could do extra string > > manipulation for compatibility with `kldunload procfs'). > > o Module dependencies are handled by explicitly hardcoding the path to > > the module directory at compile-time. > > This demonstrates a basic misunderstanding of how the module system > works; you have not grasped the distinction between "module" and "file" > at all. > > The interface, from a userland perspective, is driven by three basic > requirements: > > 1) Load a named module. > 2) Load a named file. > 3) Unload a named module. > > 1) needs to search a database (in this case, built by kldxref) of known > modules to locate the file containing the module. It has a fallback > behaviour of assuming that the module 'foo' is found in the file 'foo.ko' > somewhere on the module searchpath. What is the syscall interface for this functionality? Presumably it's overloaded onto the kldload(2) interface? :) Here's what the kldload(2) manual says it does: : kldload - load KLD files into the kernel : : DESCRIPTION : The function kldload() loads a kld file into the kernel using the kernel : linker. Is this incorrect? > 2) takes a filename argument. As a convenience to the programmer, it > will append the '.ko' suffix, and search the system default path for the > file if an explicit pathname is not given. I don't consider this a convenience, nor do I consider open(2) searching the filesystem for a file I wish to open appropriate. > 3) is obvious, but misimplemented. 
> > There are a lot of problems with the current implementation, but you are > not addressing them in this thread. Don't butcher what is actually quite > a clean and functional design just because you don't understand it; try > fixing some of the really major conceptual problems, not dumbing it down > to the level of the implementation. Obviously I don't understand this. If you recall, earlier I called this interface black magic. I was hoping this thread would answer the question of why this interface is so evil, but you apparently saw this as an opportunity to insult my intelligence. Best regards, Mike Barcroft To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Jan 1 17:37: 5 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mass.dis.org (mass.dis.org [216.240.45.41]) by hub.freebsd.org (Postfix) with ESMTP id 9E96F37B41F for ; Tue, 1 Jan 2002 17:36:59 -0800 (PST) Received: from mass.dis.org (localhost [127.0.0.1]) by mass.dis.org (8.11.6/8.11.3) with ESMTP id g021jZr03903; Tue, 1 Jan 2002 17:45:35 -0800 (PST) (envelope-from msmith@mass.dis.org) Message-Id: <200201020145.g021jZr03903@mass.dis.org> X-Mailer: exmh version 2.1.1 10/15/1999 To: Matthew Dillon Cc: Bruce Evans , arch@FreeBSD.ORG Subject: Re: When to use atomic_ functions? (was: 64 bit counters) In-reply-to: Your message of "Tue, 01 Jan 2002 15:49:20 PST." <200201012349.g01NnKA40071@apollo.backplane.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Date: Tue, 01 Jan 2002 17:45:35 -0800 From: Mike Smith Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG > > :On Tue, 1 Jan 2002, Mike Smith wrote: > :> > :> It's imperative to use atomic operations for counters on SMP systems. > : > :Not true. 
Atomic operations for counters are not needed on SMP systems > :in at least the following cases: > :- if there is a lock that prevents other processes from accessing the > : counter > :- if the counters are per-CPU. See previous mail by someone named msmith. 8) Sorry, I should have been more explicit, rather than assuming that the audience would read the implicit exclusion of alternatives into the above. -- ... every activity meets with opposition, everyone who acts has his rivals and unfortunately opponents also. But not because people want to be opponents, rather because the tasks and relationships force people to take different points of view. [Dr. Fritz Todt] V I C T O R Y N O T V E N G E A N C E To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Tue Jan 1 23: 6: 4 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mail11.speakeasy.net (mail11.speakeasy.net [216.254.0.211]) by hub.freebsd.org (Postfix) with ESMTP id 3F5C537B41E for ; Tue, 1 Jan 2002 23:05:48 -0800 (PST) Received: (qmail 30962 invoked from network); 2 Jan 2002 07:05:46 -0000 Received: from unknown (HELO laptop.baldwin.cx) ([64.81.54.73]) (envelope-sender ) by mail11.speakeasy.net (qmail-ldap-1.03) with SMTP for ; 2 Jan 2002 07:05:46 -0000 Message-ID: X-Mailer: XFMail 1.4.0 on FreeBSD X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: <78699.1009878927@winston.freebsd.org> Date: Tue, 01 Jan 2002 23:05:35 -0800 (PST) From: John Baldwin To: Jordan Hubbard Subject: Re: xfree4 by default? 
Cc: re@FreeBSD.ORG, arch@FreeBSD.ORG, Alfred Perlstein , Murray Stokely , Steve Price Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On 01-Jan-02 Jordan Hubbard wrote: > It wouldn't be that hard in sysinstall either, depending on how > the X bits are packaged. FWIW, I also think that XFree86 4.x's time > has come. Actually, the current tools in src/release/x11 will work with X 4 as well as far as generating the necessary tarballs. They would just need some slight tweaks to change what ports are built. This would allow one to not have to change anything in sysinstall but the comments. However, switching over to using the various XFree86-4-* ports would be cleaner assuming they are kept as up to date as the XFree86-4 port is. > - Jordan > >> On Mon, Dec 31, 2001 at 04:22:22PM -0800, Murray Stokely wrote: >> > >> > I agree that we should make the switch in -STABLE immediately after >> > FreeBSD 4.5 is released. Every other x86 Unix I've been exposed to >> > has been using X4 for over a year. The arguments about older >> > supported hardware do not hold water any more since X4 supports so >> > many newer chipsets out of the box that X336 can't handle. >> >> All of the recent package builds have been with XFREE86_VERSION=4. >> I'm not sure how long for sure but for as long as I can remember >> which isn't saying much. Making it the default shouldn't be all >> that difficult in bsd.port.mk. Don't know how hard it would be >> for sysinstall and friends. >> >> -steve >> >> To Unsubscribe: send mail to majordomo@FreeBSD.org >> with "unsubscribe freebsd-arch" in the body of the message > -- John Baldwin <>< http://www.FreeBSD.org/~jhb/ "Power Users Use the Power to Serve!" 
- http://www.FreeBSD.org/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 6:21:50 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mongrel.pacific.net.au (mongrel.pacific.net.au [61.8.0.107]) by hub.freebsd.org (Postfix) with ESMTP id 6D65337B417; Wed, 2 Jan 2002 06:21:46 -0800 (PST) Received: from dungeon.home (ppp38.dyn248.pacific.net.au [203.143.248.38]) by mongrel.pacific.net.au (8.9.3/8.9.3/Debian 8.9.3-21) with ESMTP id BAA25419; Thu, 3 Jan 2002 01:15:40 +1100 X-Authentication-Warning: mongrel.pacific.net.au: Host ppp38.dyn248.pacific.net.au [203.143.248.38] claimed to be dungeon.home Received: from dungeon.home (localhost [127.0.0.1]) by dungeon.home (8.11.3/8.11.1) with ESMTP id g02EPks11288; Thu, 3 Jan 2002 00:25:46 +1000 (EST) (envelope-from mckay) Message-Id: <200201021425.g02EPks11288@dungeon.home> To: Jordan Hubbard Cc: Murray Stokely , John Baldwin , Alfred Perlstein , arch@FreeBSD.ORG, re@FreeBSD.ORG, mckay@thehub.com.au Subject: Re: xfree4 by default? References: <20011231183928.F37696@bsd.havk.org> <78699.1009878927@winston.freebsd.org> In-Reply-To: <78699.1009878927@winston.freebsd.org> from Jordan Hubbard at "Tue, 01 Jan 2002 09:55:27 +0000" Date: Thu, 03 Jan 2002 00:25:46 +1000 From: Stephen McKay Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Tuesday, 1st January 2002, Jordan Hubbard wrote: >It wouldn't be that hard in sysinstall either, depending on how >the X bits are packaged. FWIW, I also think that XFree86 4.x's time >has come. None of my current video cards work properly with 3.3.6, so I'm all for adding 4.1.0 immediately, rather than post release. Any chance? Stephen. 
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 6:50:16 2002 Delivered-To: freebsd-arch@freebsd.org Received: from tao.org.uk (genius.tao.org.uk [212.135.162.51]) by hub.freebsd.org (Postfix) with ESMTP id 9235937B41A; Wed, 2 Jan 2002 06:50:05 -0800 (PST) Received: by tao.org.uk (Postfix, from userid 100) id 24FC654D; Wed, 2 Jan 2002 14:49:57 +0000 (GMT) Date: Wed, 2 Jan 2002 14:49:57 +0000 From: Josef Karthauser To: Steve Price Cc: Murray Stokely , John Baldwin , Alfred Perlstein , arch@freebsd.org, re@freebsd.org Subject: Re: xfree4 by default? Message-ID: <20020102144957.J4619@tao.org.uk> References: <20011231174113.O16101@elvis.mu.org> <20011231162222.V2286@windriver.com> <20011231183928.F37696@bsd.havk.org> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-md5; protocol="application/pgp-signature"; boundary="J2uG6jHjFLimDtBY" Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20011231183928.F37696@bsd.havk.org>; from steve@freebsd.org on Mon, Dec 31, 2001 at 06:39:28PM -0600 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG --J2uG6jHjFLimDtBY Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, Dec 31, 2001 at 06:39:28PM -0600, Steve Price wrote: > On Mon, Dec 31, 2001 at 04:22:22PM -0800, Murray Stokely wrote: > > > > I agree that we should make the switch in -STABLE immediately after > > FreeBSD 4.5 is released. Every other x86 Unix I've been exposed to > > has been using X4 for over a year. The arguments about older > > supported hardware do not hold water any more since X4 supports so > > many newer chipsets out of the box that X336 can't handle. > > All of the recent package builds have been with XFREE86_VERSION=4.
> I'm not sure how long for sure but for as long as I can remember > which isn't saying much. Making it the default shouldn't be all > that difficult in bsd.port.mk. Don't know how hard it would be > for sysinstall and friends. What's the best way to use it? We appear to have a monolithic port XFree86-4, and then -clients, -documents, -libraries, -manuals, imake, etc. The dependancies don't appear to work too well between them, when using portupgrade. Joe --J2uG6jHjFLimDtBY Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.6 (FreeBSD) Comment: For info see http://www.gnupg.org iEYEARECAAYFAjwzHhQACgkQXVIcjOaxUBZFHgCfYCsHg3qvFMUd2hHehRIVouOR FS8An35TQOAlrBQGTHGjSpCr1m4xtyRU =g4NT -----END PGP SIGNATURE----- --J2uG6jHjFLimDtBY-- To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 6:54:11 2002 Delivered-To: freebsd-arch@freebsd.org Received: from prg.traveller.cz (prg.traveller.cz [193.85.2.77]) by hub.freebsd.org (Postfix) with ESMTP id 7176F37B421; Wed, 2 Jan 2002 06:54:07 -0800 (PST) Received: from prg.traveller.cz (localhost [127.0.0.1]) by prg.traveller.cz (8.12.1[KQ-CZ](1)/8.12.1/pukvis) with ESMTP id g02Erxlk053667; Wed, 2 Jan 2002 15:53:59 +0100 (CET) Received: from localhost (mime@localhost) by prg.traveller.cz (8.12.1[KQ-CZ](1)/pukvis) with ESMTP id g02ErwxE053664; Wed, 2 Jan 2002 15:53:58 +0100 (CET) Date: Wed, 2 Jan 2002 15:53:55 +0100 (CET) From: Michal Mertl To: Matthew Dillon Cc: Bruce Evans , Mike Smith , Bernd Walter , Subject: Re: When to use atomic_ functions? 
(was: 64 bit counters) In-Reply-To: <200201012349.g01NnKA40071@apollo.backplane.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Tue, 1 Jan 2002, Matthew Dillon wrote: > > :On Tue, 1 Jan 2002, Mike Smith wrote: > : > :> > MBs are not needed for the variable itself, but they are making this > :> > family of functions very expensive. > :> > It's not very wise to handle counters with atomic_ functions unless > :> > the need to have MBs in them is not removed. > :> > :> It's imperative to use atomic operations for counters on SMP systems. > : > :Not true. Atomic operations for counters are not needed on SMP systems > :in at least the following cases: > :- if there is a lock that prevents other processes from accessing the > : counter > :- if the counters are per-CPU. See previous mail by someone named msmith. > : > :Bruce > > Well, I'm not sure how I got on the Cc list but I agree with Bruce > on this one. An SMP-synchronized counter increment is a ridiculous > waste of time. They should be per-cpu and then we don't care *how* > wide the counters are. Having programs like netstat, or our sysctl > mechanism, aggregate the count values is easy. > I don't know how much time will be wasted - my measurements on pII show the atomic_ operations aren't that expensive. There are a lot of counters, and having all of them per processor would waste a bit of memory, but more importantly it would require some structural changes - which may end up making counter updates even more expensive than atomic_. Using atomic ops is easy. Even if you forget to use atomic_ somewhere, it would still work - only with a slight chance of the value becoming wrong. Anyway I'm just building world with my changes (only kernel and netstat modified).
I'm sure I missed several places to do some changes but at least it runs and counts (I hope I'm wrong - I used egrep a lot). Do I understand correctly that on i386 I don't need anything special for atomic_XXX_{rel|acq}? I implemented only one version (with cmpxchg8b - on SMP with lock prepended) and others are just #defines to use the same function. My patch will break other archs but I'll look at their atomic.h and see if I can make appropriate changes. -- Michal Mertl mime@traveller.cz To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 9: 8:33 2002 Delivered-To: freebsd-arch@freebsd.org Received: from srv1.cosmo-project.de (srv1.cosmo-project.de [213.83.6.106]) by hub.freebsd.org (Postfix) with ESMTP id AB5D737B41A; Wed, 2 Jan 2002 09:08:28 -0800 (PST) Received: (from uucp@localhost) by srv1.cosmo-project.de (8.11.6/8.11.6) with UUCP id g02H8AV96088; Wed, 2 Jan 2002 18:08:11 +0100 (CET) (envelope-from ticso@cicely9.cicely.de) Received: from mail.cicely.de (cicely20.cicely.de [10.1.1.22]) by cicely5.cicely.de (8.12.1/8.12.1) with ESMTP id g02H8Ttx043387; Wed, 2 Jan 2002 18:08:29 +0100 (CET)?g (envelope-from ticso@cicely9.cicely.de) Received: from cicely9.cicely.de (cicely9.cicely.de [10.1.7.11]) by mail.cicely.de (8.11.0/8.11.0) with ESMTP id g02H8SW12541; Wed, 2 Jan 2002 18:08:28 +0100 (CET) Received: (from ticso@localhost) by cicely9.cicely.de (8.11.6/8.11.6) id g02H8N613296; Wed, 2 Jan 2002 18:08:23 +0100 (CET) (envelope-from ticso) Date: Wed, 2 Jan 2002 18:08:22 +0100 From: Bernd Walter To: Michal Mertl Cc: Matthew Dillon , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG Subject: Re: When to use atomic_ functions? 
(was: 64 bit counters) Message-ID: <20020102170822.GB10762@cicely9.cicely.de> References: <200201012349.g01NnKA40071@apollo.backplane.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.3.24i X-Operating-System: FreeBSD cicely9.cicely.de 5.0-CURRENT alpha Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Wed, Jan 02, 2002 at 03:53:55PM +0100, Michal Mertl wrote: > On Tue, 1 Jan 2002, Matthew Dillon wrote: > > :Not true. Atomic operations for counters are not needed on SMP systems > > :in at least the following cases: > > :- if there is a lock that prevents other processes from accessing the > > : counter > > :- if the counters are per-CPU. See previous mail by someone named msmith. > > : > > :Bruce > > > > Well, I'm not sure how I got on the Cc list but I agree with Bruce > > on this one. An SMP-synchronized counter increment is a ridiculous > > waste of time. They should be per-cpu and then we don't care *how* > > wide the counters are. Having programs like netstat, or our sysctl > > mechanism, aggregate the count values is easy. > > > > I don't know how much time will be wasted - my measurements on pII show > the atomic_ operations aren't that expensive. After Mike Smith cleared up my misunderstanding, I agree. What I don't agree with is that aggregating per-CPU counters is easy. The data has to be pushed to the coherency point by each CPU to be readable by a single CPU, which doesn't happen for sure if you don't use specialised write methods, and you may end up with old values. And you also have to deal with overflow issues if the updating is interruptible. I'm not saying it's a bad idea, but you have to be selective and it still may be the best choice for network counters.
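A minimal userland sketch of the per-CPU scheme under discussion, with the caveats raised above noted in the comments. It assumes C11 <stdatomic.h>, a fixed NCPU, and a 64-byte cache line; struct pcpu_counter, counter_add and counter_fetch are hypothetical names, not an existing kernel interface.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NCPU 4  /* hypothetical number of CPUs */

/* One slot per CPU, aligned so that slots do not share a cache line
 * (64 bytes is an assumption about the hardware). */
struct pcpu_counter {
    _Alignas(64) _Atomic uint64_t value;
};

static struct pcpu_counter ipackets[NCPU];

/* Called only on CPU 'cpu'. Because each slot has a single writer, a
 * relaxed load followed by a relaxed store is enough and no locked
 * read-modify-write is issued. Caveat: this is not safe if an
 * interrupt handler on the same CPU can update the same slot between
 * the load and the store. */
void counter_add(unsigned cpu, uint64_t n)
{
    assert(cpu < NCPU);
    uint64_t old = atomic_load_explicit(&ipackets[cpu].value,
                                        memory_order_relaxed);
    atomic_store_explicit(&ipackets[cpu].value, old + n,
                          memory_order_relaxed);
}

/* Aggregation, as netstat or a sysctl handler might do it. The sum is
 * a snapshot: slots are read one after another without ordering, so
 * it can lag behind updates still in flight on other CPUs. */
uint64_t counter_fetch(void)
{
    uint64_t sum = 0;
    for (unsigned i = 0; i < NCPU; i++)
        sum += atomic_load_explicit(&ipackets[i].value,
                                    memory_order_relaxed);
    return sum;
}
```

The atomic type is used here only so that a reader never sees a torn 64-bit value; whether that is free depends on the architecture, which is the point made above about needing atomic capabilities for the required size.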
> There is a lot of counters and to have all of them for each processor > would waste a bit of memory but more importantly it would require some > structural changes - which may end up causing counters update being even > more expensive that atomic_. Using atomic ops is easy. Even if you > somewhere forget to use atomic_, if would still work - only with slight > chance the value become wrong. As long as your architecture has atomic_ capabilities for the needed size. > Anyway I'm just building world with my changes (only kernel and netstat > modified). I'm sure I missed several places to do some changes but at > least it runs and counts (I hope I'm wrong - I used egrep a lot). > > Do I understand correctly that on i386 I don't need anything special for > atomic_XXX_{rel|acq}? I implemented only one version (with cmpxchg8b - on > SMP with lock prepended) and others are just #defines to use the same > function. What I understood from Mike's mail is that only acq/rel pairs protect more than just the given value - others may not. On i386 this is not an issue because it's a coherent architecture. > My patch will break other archs but I'll look at their atomic.h and see if > I can make appropriate changes. I haven't seen one yet.
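The cmpxchg8b approach mentioned in the quoted text amounts to a compare-and-swap retry loop. The following is a sketch in portable C11, not the patch itself: add_64_cas and ibytes are hypothetical names, and whether the compiler actually emits a locked cmpxchg8b for it on a 586-class i386 target is an assumption about the toolchain.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

static _Atomic uint64_t ibytes;  /* hypothetical 64-bit byte counter */

/* Add 'delta' to *p with a compare-and-swap loop and return the new
 * value. No ordering is requested (memory_order_relaxed), matching a
 * plain counter update rather than an acq/rel variant. */
uint64_t add_64_cas(_Atomic uint64_t *p, uint64_t delta)
{
    assert(p != 0);
    uint64_t old = atomic_load_explicit(p, memory_order_relaxed);
    /* On failure 'old' is refreshed with the current contents of *p,
     * and old + delta is recomputed on the next iteration. */
    while (!atomic_compare_exchange_weak_explicit(p, &old, old + delta,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed))
        ;
    return old + delta;
}
```

Unlike an addl/adcl pair, the loop either publishes the whole 64-bit result or retries, so a concurrent reader never sees the low half updated without the carry into the high half.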
-- B.Walter COSMO-Project http://www.cosmo-project.de ticso@cicely.de Usergroup info@cosmo-project.de To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 9:27:23 2002 Delivered-To: freebsd-arch@freebsd.org Received: from prg.traveller.cz (prg.traveller.cz [193.85.2.77]) by hub.freebsd.org (Postfix) with ESMTP id 96E6E37B417; Wed, 2 Jan 2002 09:27:17 -0800 (PST) Received: from prg.traveller.cz (localhost [127.0.0.1]) by prg.traveller.cz (8.12.1[KQ-CZ](1)/8.12.1/pukvis) with ESMTP id g02HRBlk065963; Wed, 2 Jan 2002 18:27:11 +0100 (CET) Received: from localhost (mime@localhost) by prg.traveller.cz (8.12.1[KQ-CZ](1)/pukvis) with ESMTP id g02HRApW065960; Wed, 2 Jan 2002 18:27:10 +0100 (CET) Date: Wed, 2 Jan 2002 18:27:10 +0100 (CET) From: Michal Mertl To: Bernd Walter Cc: Matthew Dillon , Bruce Evans , Mike Smith , Bernd Walter , Subject: Re: When to use atomic_ functions? (was: 64 bit counters) In-Reply-To: <20020102170822.GB10762@cicely9.cicely.de> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Wed, 2 Jan 2002, Bernd Walter wrote: > On Wed, Jan 02, 2002 at 03:53:55PM +0100, Michal Mertl wrote: > > On Tue, 1 Jan 2002, Matthew Dillon wrote: > > > :Not true. Atomic operations for counters are not needed on SMP systems > > > :in at least the following cases: > > > :- if there is a lock that prevents other processes from accessing the > > > : counter > > > :- if the counters are per-CPU. See previous mail by someone named msmith. > > > : > > > :Bruce > > > > > > Well, I'm not sure how I got on the Cc list but I agree with Bruce > > > on this one. An SMP-synchronized counter increment is a ridiculous > > > waste of time. 
They should be per-cpu and then we don't care *how* > > > wide the counters are. Having programs like netstat, or our sysctl > > > mechanism, aggregate the count values is easy. > > > > > > > I don't know how much time will be wasted - my measurements on pII show > > the atomic_ operations aren't that expensive. > > After Mike Smith cleared my understanding problem I agree. > > What I don't agree is that aggregating per CPU counters is easy. > The data has to be pushed to the coherency point by each CPU to be > readable by a single CPU, which doesn't happen for shure if you don't > use specialiesed write methods and you may end up with old values. > And you also have to deal with overflow issues if the updating is > interruptable. > I don't say it's a bad idea, but you have to be selective and it still > may be the best choose for network counters. > > > There is a lot of counters and to have all of them for each processor > > would waste a bit of memory but more importantly it would require some > > structural changes - which may end up causing counters update being even > > more expensive that atomic_. Using atomic ops is easy. Even if you > > somewhere forget to use atomic_, if would still work - only with slight > > chance the value become wrong. > > As long as your architecture has atomic_ capabilities for the needed size. > The patch at http://home.eunet.cz/mime/atomic_t.patch.0201.gz (i386 only at the moment) tries to do one thing - define atomic_t type (can be counter_t or so) which is either 32 or 64 bit (64 bit only on >=586). The counters are of type atomic_t and are accessed with atomic_op_max (better name welcome - maybe atomic_op_long_plus (meaning it's at least 32 bits) functions which are rewritten to atomic_op_int or atomic_op_quad. Other architectures should only need to add to /sys/arch/include/atomic.h the defines for atomic_t and atomic_op_max. There are 3 points in patch (identifiable by 'mime XXX ??') which may be problems in original code. 
Line 295 of /sys/netipx/ipx_ip.c is, I think, a bug. I tried to read the values and they seem ok. > > Anyway I'm just building world with my changes (only kernel and netstat > > modified). I'm sure I missed several places to do some changes but at > > least it runs and counts (I hope I'm wrong - I used egrep a lot). > > > > Do I understand correctly that on i386 I don't need anything special for > > atomic_XXX_{rel|acq}? I implemented only one version (with cmpxchg8b - on > > SMP with lock prepended) and others are just #defines to use the same > > function. > > What I understood from Mikes mail is that only acq/rel pairs protect > more that just the given value - others may not. > On i386 this is not an issue because it's a coherent archicture. > What about other archs? I may be i386 centric. We probably need the operations to be SMP-safe, and if other archs don't handle coherency automatically, Matt and others may be right that the updates could then be expensive. > > My patch will break other archs but I'll look at their atomic.h and see if > > I can make appropriate changes. > > I havn't seen one yet. > Ok ok. I wanted to make really sure it's good but above you have the URL.
-- Michal Mertl mime@traveller.cz To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 10: 7:29 2002 Delivered-To: freebsd-arch@freebsd.org Received: from avocet.prod.itd.earthlink.net (avocet.mail.pas.earthlink.net [207.217.120.50]) by hub.freebsd.org (Postfix) with ESMTP id 9F71137B41B; Wed, 2 Jan 2002 10:07:25 -0800 (PST) Received: from pool0051.cvx40-bradley.dialup.earthlink.net ([216.244.42.51] helo=mindspring.com) by avocet.prod.itd.earthlink.net with esmtp (Exim 3.33 #1) id 16LpnN-0002OT-00; Wed, 02 Jan 2002 10:07:21 -0800 Message-ID: <3C334C5A.A5E56055@mindspring.com> Date: Wed, 02 Jan 2002 10:07:22 -0800 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Stephen McKay Cc: Jordan Hubbard , Murray Stokely , John Baldwin , Alfred Perlstein , arch@FreeBSD.ORG, re@FreeBSD.ORG Subject: Re: xfree4 by default? References: <20011231183928.F37696@bsd.havk.org> <78699.1009878927@winston.freebsd.org> <200201021425.g02EPks11288@dungeon.home> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Stephen McKay wrote: > >It wouldn't be that hard in sysinstall either, depending on how > >the X bits are packaged. FWIW, I also think that XFree86 4.x's time > >has come. > > None of my current video cards work properly with 3.3.6, so I'm all for > adding 4.1.0 immediately, rather than post release. Any chance? Yeah. Why isn't "X" a _package_ instead of a weird-ass tarball? 
-- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 10:55: 0 2002 Delivered-To: freebsd-arch@freebsd.org Received: from elvis.mu.org (elvis.mu.org [216.33.66.196]) by hub.freebsd.org (Postfix) with ESMTP id 12E7437B41C; Wed, 2 Jan 2002 10:54:47 -0800 (PST) Received: by elvis.mu.org (Postfix, from userid 1192) id 887B181D03; Wed, 2 Jan 2002 12:54:41 -0600 (CST) Date: Wed, 2 Jan 2002 12:54:41 -0600 From: Alfred Perlstein To: Terry Lambert Cc: Stephen McKay , Jordan Hubbard , Murray Stokely , John Baldwin , arch@FreeBSD.ORG, re@FreeBSD.ORG Subject: Re: xfree4 by default? Message-ID: <20020102125441.Y16101@elvis.mu.org> References: <20011231183928.F37696@bsd.havk.org> <78699.1009878927@winston.freebsd.org> <200201021425.g02EPks11288@dungeon.home> <3C334C5A.A5E56055@mindspring.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <3C334C5A.A5E56055@mindspring.com>; from tlambert2@mindspring.com on Wed, Jan 02, 2002 at 10:07:22AM -0800 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG * Terry Lambert [020102 12:07] wrote: > Stephen McKay wrote: > > >It wouldn't be that hard in sysinstall either, depending on how > > >the X bits are packaged. FWIW, I also think that XFree86 4.x's time > > >has come. > > > > None of my current video cards work properly with 3.3.6, so I'm all for > > adding 4.1.0 immediately, rather than post release. Any chance? > > Yeah. > > Why isn't "X" a _package_ instead of a weird-ass tarball? I don't know! Maybe it's how the Xfree people distribute it? -- -Alfred Perlstein [alfred@freebsd.org] 'Instead of asking why a piece of software is using "1970s technology," start asking why software is ignoring 30 years of accumulated wisdom.' 
Tax deductible donations for FreeBSD: http://www.freebsdfoundation.org/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 11: 3:12 2002 Delivered-To: freebsd-arch@freebsd.org Received: from tao.org.uk (genius.tao.org.uk [212.135.162.51]) by hub.freebsd.org (Postfix) with ESMTP id B331237B41A; Wed, 2 Jan 2002 11:03:00 -0800 (PST) Received: by tao.org.uk (Postfix, from userid 100) id 7DC86558; Wed, 2 Jan 2002 19:02:53 +0000 (GMT) Date: Wed, 2 Jan 2002 19:02:53 +0000 From: Josef Karthauser To: Bosko Milekic Cc: freebsd-arch@freebsd.org, freebsd-smp@freebsd.org Subject: Re: SMPng: Interrupt Latency Issues Message-ID: <20020102190253.B47550@tao.org.uk> References: <20011226021800.A14608@technokratis.com> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-md5; protocol="application/pgp-signature"; boundary="tsOsTdHNUZQcU9Ye" Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <20011226021800.A14608@technokratis.com>; from bmilekic@technokratis.com on Wed, Dec 26, 2001 at 02:18:00AM -0500 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG --tsOsTdHNUZQcU9Ye Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Wed, Dec 26, 2001 at 02:18:00AM -0500, Bosko Milekic wrote: > > Hi, > > It has become obvious, recently in particular, that some important > improvements are required in the way we take interrupts in -CURRENT > SMPng. As previously mentioned, we are experiencing lousy interrupt > latency in -CURRENT. This comes as no surprise. I'm interested in helping with this, although my experience in this area is zero ;(. I was waiting with bated breath for answers to this email and am a bit disappointed that no-one's replied.
I hope that doesn't mean that it's not important to anyone. (I was hoping to learn something :) My problem is that my -current box regularly dies, usually when resuming from ACPI, which I now avoid because of it. I'm sure that it's interrupt related. A noticeable characteristic is that a console bell (beep) once started never ends, suggesting that a timer has stopped working; also, things like top hang and never refresh. Joe > The purpose of this Email is to get the ball rolling on a discussion > pertaining not only to the approach that we're going to take to properly > remedy the latency issues, but also on the overall ways that we will be > handling interrupts in SMPng. I know that there are already several > ideas floating around, and briefly talking to Jake and some others over > IRC, I see that there are a couple that are more common than others. Now > I know that this is a sensitive topic but in discussing it, I would > appreciate it if all points are strictly technical and deal with the > implementation, that we not over-generalize (in other words, that we > stay `on topic' [*]), and that after the discussion is done, that we set up > some milestones and get the work done _soon_ (I'm prepared to take up > some of the work, yes - however, I do not feel that *I* am the right > person to oversee _all_ of the technical aspects of this heavily > burdened task alone). > > [*] This means please don't start bashing anything and everything in our > system and stating useless things like "well, yeah, but our VM system > does not do X, Y, and Z" if it is of no _direct_ relevance to > the way we take interrupts. > > With that out of the way, here's our situation: > > 1. We presently take an interrupt, and in the general case, proceed to > schedule an interrupt thread to run. While placing the thread on the > run queue and because we need to check for the "SWAIT" process flag > we must acquire the sched_lock. This is where the gist of the problem > lies.
We bottleneck here because it means that only one CPU can be > scheduling the interrupt thread at a time, and the rest have to spin > waiting for sched_lock. To make matters worse, interrupts are > disabled on the local CPU and we cannot take any other interrupts > either. This is where we stand. > > 2. BSD/OS does things differently. An interrupt comes in and in the > general case, the interrupted thread's VM address space is borrowed > and the handler is immediately executed after interrupts are > re-enabled. Then the handler(s) run and only if it happens to hit a > mutex lock for which it must wait, a clean-up is done to provide a > thread that can block on the mutex for the interrupt, and to allow > the interrupted thread to continue doing its thing. The tradeoff is > that the interrupted thread _cannot_ run anywhere else, not even on > another CPU, while the interrupt that pre-empted it is not finished > executing or has not hit a mutex and could not acquire it quickly. > This may result in some kernel thread priority rules not being 100% > obeyed but I guess that this is part of the tradeoff. > > 3. I believe that some others have alternative suggestions. I encourage > them to present them here clearly, assuming that they are realistic > and implementable approaches, so that we are sure to make the right > decision before we set up milestones. > > In particular, I know that, after briefly speaking to Jake, there is > the idea of per-CPU "interrupt-only run queues" floating around. The > gist of this method would be to keep per-CPU run queues to which each > CPU can schedule their own interrupt threads without having to acquire > any locks (i.e., only have interrupts disabled). The un-balancing of the > queues as well as special priority cases - should they arise - could be > handled by issuing IPIs.
> > I also know that some others (notably Rik van Riel - I hope I spelled > that correctly :-) ) have mentioned realistic ideas that are quite > logical in light of what we presently have and the points above. To me, > following a very brief overview, some of them shared some of the > qualities of the BSD/OS way of doing it so I'd like to invite him (and > others) to lay those methods out for us now, so that we have a greater > picture of various alternatives. > > That's all for now. I hope that we can agree on something worthwhile > soon so that we can establish clear milestones and get this thing done. > It's been long overdue. > > Best regards and Seasons Greetings to you all, > -- > Bosko Milekic > bmilekic@FreeBSD.org > > > To Unsubscribe: send mail to majordomo@FreeBSD.org > with "unsubscribe freebsd-arch" in the body of the message --tsOsTdHNUZQcU9Ye Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.6 (FreeBSD) Comment: For info see http://www.gnupg.org iEYEARECAAYFAjwzWVwACgkQXVIcjOaxUBbqfwCfUiOchBvdRxYpj2+r6jD8bJ5D w4oAoKSs7R/XPJxsxYml+bUV7F1Kh/fJ =MOw6 -----END PGP SIGNATURE----- --tsOsTdHNUZQcU9Ye-- To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 12:33:32 2002 Delivered-To: freebsd-arch@freebsd.org Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by hub.freebsd.org (Postfix) with ESMTP id 3FB3D37B41B; Wed, 2 Jan 2002 12:33:30 -0800 (PST) Received: (from dillon@localhost) by apollo.backplane.com (8.11.6/8.9.1) id g02KXNV59224; Wed, 2 Jan 2002 12:33:23 -0800 (PST) (envelope-from dillon) Date: Wed, 2 Jan 2002 12:33:23 -0800 (PST) From: Matthew Dillon Message-Id: <200201022033.g02KXNV59224@apollo.backplane.com> To: Michal Mertl Cc: Bruce Evans , Mike Smith , Bernd Walter , Subject: Re: When to use atomic_ functions?
(was: 64 bit counters) References: Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG : :I don't know how much time will be wasted - my measurements on pII show :the atomic_ operations aren't that expensive. An atomic operation is not that expensive as long as only one cpu is touching the cache line. Try running two user processes writing the same cache line on an SMP system and you will see performance drop by a factor of 5-10. Now, of course, we could lock each interface to a particular cpu. There are advantages to doing that. -Matt To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 12:38: 2 2002 Delivered-To: freebsd-arch@freebsd.org Received: from hawk.prod.itd.earthlink.net (hawk.mail.pas.earthlink.net [207.217.120.22]) by hub.freebsd.org (Postfix) with ESMTP id 1301C37B50F; Wed, 2 Jan 2002 12:37:44 -0800 (PST) Received: from pool0051.cvx40-bradley.dialup.earthlink.net ([216.244.42.51] helo=mindspring.com) by hawk.prod.itd.earthlink.net with esmtp (Exim 3.33 #1) id 16Ls8n-00032l-00; Wed, 02 Jan 2002 12:37:38 -0800 Message-ID: <3C336F92.E09BDF43@mindspring.com> Date: Wed, 02 Jan 2002 12:37:38 -0800 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Alfred Perlstein Cc: Stephen McKay , Jordan Hubbard , Murray Stokely , John Baldwin , arch@FreeBSD.ORG, re@FreeBSD.ORG Subject: Re: xfree4 by default? 
References: <20011231183928.F37696@bsd.havk.org> <78699.1009878927@winston.freebsd.org> <200201021425.g02EPks11288@dungeon.home> <3C334C5A.A5E56055@mindspring.com> <20020102125441.Y16101@elvis.mu.org> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Alfred Perlstein wrote: > > Why isn't "X" a _package_ instead of a weird-ass tarball? > > I don't know! Maybe it's how the Xfree people distribute it? I was under the impression that it was compiled for distribution against the FreeBSD version being distributed, by the FreeBSD CDROM builders, not by the XFree people. ...not that I'm not grateful that it's not a "split tarball", like bin, compat1x, compat21, compat22, compat3x, crypto, dict, doc, games, info, manpages, proflibs, sbase, sbin, scontrib, setc, sgames, sgnu, slib, slibexec, srelease, ssbin, sshare, ssys, stools (amazingly funny name 8-)), subin, susbin. 8-) It just seems to me that the choice of X server is a bit more locked in than it should be, and that arguing over "which one" assumes that you can only have one on the distribution media, which doesn't strike me as the correct argument to be having.
--Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 12:43:57 2002 Delivered-To: freebsd-arch@freebsd.org Received: from hawk.prod.itd.earthlink.net (hawk.mail.pas.earthlink.net [207.217.120.22]) by hub.freebsd.org (Postfix) with ESMTP id 3F23E37B417; Wed, 2 Jan 2002 12:43:50 -0800 (PST) Received: from pool0051.cvx40-bradley.dialup.earthlink.net ([216.244.42.51] helo=mindspring.com) by hawk.prod.itd.earthlink.net with esmtp (Exim 3.33 #1) id 16LsEP-0003K8-00; Wed, 02 Jan 2002 12:43:25 -0800 Message-ID: <3C3370EE.7621E667@mindspring.com> Date: Wed, 02 Jan 2002 12:43:26 -0800 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Matthew Dillon Cc: Michal Mertl , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG Subject: Re: When to use atomic_ functions? (was: 64 bit counters) References: <200201022033.g02KXNV59224@apollo.backplane.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Matthew Dillon wrote: > :I don't know how much time will be wasted - my measurements on pII show > :the atomic_ operations aren't that expensive. > > An atomic operation is not that expensive as long as only one cpu > is touching the cache line. Try running two user processes writing > the same cache line on an SMP system and you will see performance > drop by a factor of 5-10. Matt is correct (if in doubt, Matt is usually correct 8-)). The only way to address this decently is to reduce contention domains in the design. Initially, this won't happen; I expect that it will have to come subsystem by subsystem at a later date, as people hit performance bottlenecks. 
> Now, of course, we could lock each interface to a particular cpu. There > are advantages to doing that. This is something that should at least be permitted by any final design, if not actively encouraged, or implemented up front. This is what NT did to beat Linux on the MindCraft and other benchmarks. There are also some serious penalties associated with the way FreeBSD does it now (interrupts are wired to the sole CPU in the kernel) and the way Intel wants it done (virtual wire mode means every CPU gets an IPI), and both of these limit scalability significantly. -- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 12:54:58 2002 Delivered-To: freebsd-arch@freebsd.org Received: by hub.freebsd.org (Postfix, from userid 885) id 2983C37B41B; Wed, 2 Jan 2002 12:54:55 -0800 (PST) Date: Wed, 2 Jan 2002 12:54:55 -0800 From: Eric Melville To: Terry Lambert Cc: Stephen McKay , Jordan Hubbard , Murray Stokely , John Baldwin , Alfred Perlstein , arch@FreeBSD.ORG, re@FreeBSD.ORG Subject: Re: xfree4 by default? Message-ID: <20020102125455.A44310@FreeBSD.org> References: <20011231183928.F37696@bsd.havk.org> <78699.1009878927@winston.freebsd.org> <200201021425.g02EPks11288@dungeon.home> <3C334C5A.A5E56055@mindspring.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <3C334C5A.A5E56055@mindspring.com>; from tlambert2@mindspring.com on Wed, Jan 02, 2002 at 10:07:22AM -0800 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG > Why isn't "X" a _package_ instead of a weird-ass tarball? If everything goes according to plan, all of FreeBSD will be in packages for 5.0-RELEASE. 
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 13: 4:38 2002 Delivered-To: freebsd-arch@freebsd.org Received: from herbelot.dyndns.org (d211.dhcp212-26.cybercable.fr [212.198.26.211]) by hub.freebsd.org (Postfix) with ESMTP id A39F137B419; Wed, 2 Jan 2002 13:04:35 -0800 (PST) Received: from herbelot.com (multi.herbelot.nom [192.168.1.2]) by herbelot.dyndns.org (8.9.3/8.9.3) with ESMTP id WAA51276; Wed, 2 Jan 2002 22:04:29 +0100 (CET) (envelope-from thierry@herbelot.com) Message-ID: <3C3375DD.36947896@herbelot.com> Date: Wed, 02 Jan 2002 22:04:29 +0100 From: Thierry Herbelot X-Mailer: Mozilla 4.79 [en] (X11; U; Linux 2.4.2 i386) X-Accept-Language: en MIME-Version: 1.0 To: Eric Melville Cc: arch@FreeBSD.ORG Subject: Re: xfree4 by default? References: <20011231183928.F37696@bsd.havk.org> <78699.1009878927@winston.freebsd.org> <200201021425.g02EPks11288@dungeon.home> <3C334C5A.A5E56055@mindspring.com> <20020102125455.A44310@FreeBSD.org> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG [huge CC- trimmed] Eric Melville wrote: > > If everything goes according to plan, all of FreeBSD will be in packages > for 5.0-RELEASE. > Hello, Was this new packaging discussed on some mailing list ? 
(I may have missed, but I neither saw it on -arch or -current) Cheers -- Thierry Herbelot To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 13:34:18 2002 Delivered-To: freebsd-arch@freebsd.org Received: from winston.freebsd.org (adsl-64-173-15-98.dsl.sntc01.pacbell.net [64.173.15.98]) by hub.freebsd.org (Postfix) with ESMTP id CC37E37B419; Wed, 2 Jan 2002 13:34:16 -0800 (PST) Received: from winston.freebsd.org (jkh@localhost [127.0.0.1]) by winston.freebsd.org (8.11.6/8.11.6) with ESMTP id g02LY4E88163; Wed, 2 Jan 2002 13:34:04 -0800 (PST) (envelope-from jkh@winston.freebsd.org) To: Stephen McKay Cc: Murray Stokely , John Baldwin , Alfred Perlstein , arch@FreeBSD.ORG, re@FreeBSD.ORG Subject: Re: xfree4 by default? In-Reply-To: Message from Stephen McKay of "Thu, 03 Jan 2002 00:25:46 +1000." <200201021425.g02EPks11288@dungeon.home> Date: Wed, 02 Jan 2002 13:34:04 -0800 Message-ID: <88159.1010007244@winston.freebsd.org> From: Jordan Hubbard Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Various things have to be coordinated in parallel for this to work seamlessly. How ready is the ports/package team ready to do a complete cut-over for the affected branch? - Jordan > On Tuesday, 1st January 2002, Jordan Hubbard wrote: > > >It wouldn't be that hard in sysinstall either, depending on how > >the X bits are packaged. FWIW, I also think that XFree86 4.x's time > >has come. > > None of my current video cards work properly with 3.3.6, so I'm all for > adding 4.1.0 immediately, rather than post release. Any chance? > > Stephen. 
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 13:48:22 2002 Delivered-To: freebsd-arch@freebsd.org Received: by hub.freebsd.org (Postfix, from userid 885) id C87E337B405; Wed, 2 Jan 2002 13:48:20 -0800 (PST) Date: Wed, 2 Jan 2002 13:48:20 -0800 From: Eric Melville To: Thierry Herbelot Cc: arch@FreeBSD.ORG Subject: Re: xfree4 by default? Message-ID: <20020102134820.B44310@FreeBSD.org> References: <20011231183928.F37696@bsd.havk.org> <78699.1009878927@winston.freebsd.org> <200201021425.g02EPks11288@dungeon.home> <3C334C5A.A5E56055@mindspring.com> <20020102125455.A44310@FreeBSD.org> <3C3375DD.36947896@herbelot.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <3C3375DD.36947896@herbelot.com>; from thierry@herbelot.com on Wed, Jan 02, 2002 at 10:04:29PM +0100 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG > Was this new packaging discussed on some mailing list ? > (I may have missed, but I neither saw it on -arch or -current) I seem to talk about it on the binup mailing list, mostly because shipping everything in packages is the first logical step to really slick binary updates. Soon I'll have a summary and such for the wider audience. 
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 14:28:53 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mail11.speakeasy.net (mail11.speakeasy.net [216.254.0.211]) by hub.freebsd.org (Postfix) with ESMTP id 81A7137B431 for ; Wed, 2 Jan 2002 14:27:45 -0800 (PST) Received: (qmail 15843 invoked from network); 2 Jan 2002 22:27:44 -0000 Received: from unknown (HELO laptop.baldwin.cx) ([64.81.54.73]) (envelope-sender ) by mail11.speakeasy.net (qmail-ldap-1.03) with SMTP for ; 2 Jan 2002 22:27:44 -0000 Message-ID: X-Mailer: XFMail 1.4.0 on FreeBSD X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: <88159.1010007244@winston.freebsd.org> Date: Wed, 02 Jan 2002 14:27:31 -0800 (PST) From: John Baldwin To: Jordan Hubbard Subject: Re: xfree4 by default? Cc: re@FreeBSD.ORG, arch@FreeBSD.ORG, Alfred Perlstein , Murray Stokely , Stephen McKay Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On 02-Jan-02 Jordan Hubbard wrote: > Various things have to be coordinated in parallel for this to work > seamlessly. How ready is the ports/package team ready to do a complete > cut-over for the affected branch? I think it's a bit late to do this for 4.5. We are already in code freeze, and I'd rather wait until after 4.5 to take on a change of this magnitude. > - Jordan > >> On Tuesday, 1st January 2002, Jordan Hubbard wrote: >> >> >It wouldn't be that hard in sysinstall either, depending on how >> >the X bits are packaged. FWIW, I also think that XFree86 4.x's time >> >has come. >> >> None of my current video cards work properly with 3.3.6, so I'm all for >> adding 4.1.0 immediately, rather than post release. Any chance? >> >> Stephen. 
> -- John Baldwin <>< http://www.FreeBSD.org/~jhb/ "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 14:49:29 2002 Delivered-To: freebsd-arch@freebsd.org Received: from srv1.cosmo-project.de (srv1.cosmo-project.de [213.83.6.106]) by hub.freebsd.org (Postfix) with ESMTP id 9098737B419; Wed, 2 Jan 2002 14:49:25 -0800 (PST) Received: (from uucp@localhost) by srv1.cosmo-project.de (8.11.6/8.11.6) with UUCP id g02MnEK03680; Wed, 2 Jan 2002 23:49:14 +0100 (CET) (envelope-from ticso@cicely9.cicely.de) Received: from mail.cicely.de (cicely20.cicely.de [10.1.1.22]) by cicely5.cicely.de (8.12.1/8.12.1) with ESMTP id g02MnZtx045513; Wed, 2 Jan 2002 23:49:35 +0100 (CET)?g (envelope-from ticso@cicely9.cicely.de) Received: from cicely9.cicely.de (cicely9.cicely.de [10.1.7.11]) by mail.cicely.de (8.11.0/8.11.0) with ESMTP id g02MnZW12823; Wed, 2 Jan 2002 23:49:35 +0100 (CET) Received: (from ticso@localhost) by cicely9.cicely.de (8.11.6/8.11.6) id g02MnVI52953; Wed, 2 Jan 2002 23:49:31 +0100 (CET) (envelope-from ticso) Date: Wed, 2 Jan 2002 23:49:31 +0100 From: Bernd Walter To: Matthew Dillon Cc: Michal Mertl , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG Subject: Re: When to use atomic_ functions? 
(was: 64 bit counters) Message-ID: <20020102224931.GC10762@cicely9.cicely.de> References: <200201022033.g02KXNV59224@apollo.backplane.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200201022033.g02KXNV59224@apollo.backplane.com> User-Agent: Mutt/1.3.24i X-Operating-System: FreeBSD cicely9.cicely.de 5.0-CURRENT alpha Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Wed, Jan 02, 2002 at 12:33:23PM -0800, Matthew Dillon wrote: > : > :I don't know how much time will be wasted - my measurements on pII show > :the atomic_ operations aren't that expensive. > > An atomic operation is not that expensive as long as only one cpu > is touching the cache line. Try running two user processes writing > the same cache line on an SMP system and you will see performance > drop by a factor of 5-10. Just to understand: Your intent is not to use per-CPU variables instead of atomic_ functions, but to use atomic_ functions together with per-CPU variables so that CPUs do not contend for the same cache lines. Put that way, it sounds logically correct and makes sense to me. Is there a standard mechanism to allocate per-CPU memory? It wouldn't make sense if all the variables still ended up in the same cache line.
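[Editorial sketch] The concern above, that per-CPU variables gain nothing if they still share a cache line, can be illustrated in plain user-space C11. This is only a sketch under stated assumptions: NCPU, CACHE_LINE, struct pcpu_counter, counter_add() and counter_read() are hypothetical names, the 64-byte line size is assumed rather than queried, and nothing here is a FreeBSD kernel interface. It also ignores preemption and migration between CPUs, which a kernel version would have to handle.

```c
#include <assert.h>
#include <stdint.h>

#define NCPU       4    /* hypothetical CPU count for the sketch */
#define CACHE_LINE 64   /* assumed cache line size; varies by CPU */

/* One slot per CPU, aligned and padded so no two slots share a line. */
struct pcpu_counter {
    _Alignas(CACHE_LINE) uint64_t value;
};

static struct pcpu_counter pkts[NCPU];

/* Writer: touches only the slot belonging to the CPU it runs on. */
static void counter_add(unsigned cpu, uint64_t n)
{
    assert(cpu < NCPU);
    pkts[cpu].value += n;
}

/* Reader: sums every slot; slower than one load, but expected to be rare. */
static uint64_t counter_read(void)
{
    uint64_t sum = 0;
    for (unsigned i = 0; i < NCPU; i++)
        sum += pkts[i].value;
    return sum;
}
```

Each writer dirties only its own line, so updates on different CPUs should not bounce a shared line between caches; the cost moves to the reader, which has to visit every slot.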
-- B.Walter COSMO-Project http://www.cosmo-project.de ticso@cicely.de Usergroup info@cosmo-project.de To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 14:57:50 2002 Delivered-To: freebsd-arch@freebsd.org Received: from netau1.alcanet.com.au (ntp.alcanet.com.au [203.62.196.27]) by hub.freebsd.org (Postfix) with ESMTP id 2A77537B41B; Wed, 2 Jan 2002 14:57:32 -0800 (PST) Received: from mfg1.cim.alcatel.com.au (mfg1.cim.alcatel.com.au [139.188.23.1]) by netau1.alcanet.com.au (8.9.3 (PHNE_22672)/8.9.3) with ESMTP id JAA23485; Thu, 3 Jan 2002 09:57:05 +1100 (EDT) Received: from gsmx07.alcatel.com.au by cim.alcatel.com.au (PMDF V5.2-32 #37641) with ESMTP id <01KCMPBDRB9CVFM2DO@cim.alcatel.com.au>; Thu, 3 Jan 2002 09:56:12 +1100 Received: (from jeremyp@localhost) by gsmx07.alcatel.com.au (8.11.6/8.11.6) id g02Mv2c00921; Thu, 03 Jan 2002 09:57:02 +1100 Content-return: prohibited Date: Thu, 03 Jan 2002 09:57:02 +1100 From: Peter Jeremy Subject: Re: When to use atomic_ functions? (was: 64 bit counters) In-reply-to: ; from mime@traveller.cz on Wed, Jan 02, 2002 at 03:53:55PM +0100 To: Michal Mertl Cc: Matthew Dillon , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG Mail-Followup-To: Michal Mertl , Matthew Dillon , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG Message-id: <20020103095701.B561@gsmx07.alcatel.com.au> MIME-version: 1.0 Content-type: text/plain; charset=us-ascii Content-disposition: inline User-Agent: Mutt/1.2.5i References: <200201012349.g01NnKA40071@apollo.backplane.com> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On 2002-Jan-02 15:53:55 +0100, Michal Mertl wrote: >I don't know how much time will be wasted - my measurements on pII show >the atomic_ operations aren't that expensive. 
As Matt has pointed out, this is only true if you have a single processor. Atomic operations always translate into bus cycles - and the bus is roughly an order of magnitude slower than the CPU core for current CPUs. The worst situation is where a common counter is updated by a random CPU - the counter will virtually always be in another CPU's cache, requiring multiple bus cycles to transfer the data. Also, many RISC processors (eg Alpha) don't have locked read-modify-write primitives. On the Alpha, you need an instruction sequence:

loop:
    load_locked memory->register
    update register
    store_conditional register->memory
    if not success goto loop

with a few memory barriers added to ensure that the load/store are visible to other CPUs. The store_conditional will fail if your CPU was interrupted or if another CPU updated an implementation-defined region including the specified memory address. (64-bit atomic operations on the IA32 use the same approach - using CMPXCHG8B as the store_conditional instruction). This approach is quite expensive when you have multiple CPUs contending for the same resource. >There are a lot of counters and to have all of them for each processor >would waste a bit of memory I would be surprised if it was more than a page or two per CPU - which is trivial in the overall scheme of things. > but more importantly it would require some structural changes - >which may end up causing counters update being even more expensive >than atomic_. This depends on how it is implemented. Obviously int counter[NCPUS]; will be just as expensive as performing atomic operations, but no-one in their right mind would do that. One approach is to aggregate all the per-CPU counters into a single region of KVM and arrange for that KVM to be mapped to different physical memory for each CPU. (Solaris does or did this). This means that the code to update the counter doesn't need to know whether a counter is per-CPU or not.
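[Editorial sketch] The load_locked/store_conditional retry loop described earlier in this message can be approximated in portable C11 with a compare-and-swap loop. This is a hedged illustration, not the FreeBSD atomic(9) implementation: counter_add64() is a hypothetical name, and relaxed memory ordering is an assumption chosen because a plain statistics counter is not being used as a synchronisation primitive.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Add n to *ctr with a retry loop and return the new value. */
static uint64_t counter_add64(_Atomic uint64_t *ctr, uint64_t n)
{
    assert(ctr != NULL);

    /* "load_locked": take a snapshot of the current value. */
    uint64_t old = atomic_load_explicit(ctr, memory_order_relaxed);

    /*
     * "store_conditional": succeeds only if *ctr still equals old.
     * On failure the call refreshes old with the current value, so
     * the loop retries with up-to-date data.
     */
    while (!atomic_compare_exchange_weak_explicit(ctr, &old, old + n,
               memory_order_relaxed, memory_order_relaxed))
        ;

    return old + n;
}
```

Under contention every failed exchange means another pass through the loop, which is one way to see why the message calls this approach expensive when several CPUs update the same location.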
The code to read the counters _does_ need to know that the counters are per-CPU and have to sum all the individual counters - which is more expensive than a straight read, but is normally far less frequent. Peter To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 15:23:57 2002 Delivered-To: freebsd-arch@freebsd.org Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by hub.freebsd.org (Postfix) with ESMTP id 12CA137B419; Wed, 2 Jan 2002 15:23:54 -0800 (PST) Received: (from dillon@localhost) by apollo.backplane.com (8.11.6/8.9.1) id g02NNjt60197; Wed, 2 Jan 2002 15:23:45 -0800 (PST) (envelope-from dillon) Date: Wed, 2 Jan 2002 15:23:45 -0800 (PST) From: Matthew Dillon Message-Id: <200201022323.g02NNjt60197@apollo.backplane.com> To: Peter Jeremy Cc: Michal Mertl , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG Subject: Re: When to use atomic_ functions? (was: 64 bit counters) References: <200201012349.g01NnKA40071@apollo.backplane.com> <20020103095701.B561@gsmx07.alcatel.com.au> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG :This depends on how it is implemented. Obviously : int counter[NCPUS]; :will be just as expensive as performing atomic operations, but no-one :in their right mind would do that. One approach is to aggregate all :the per-CPU counters into a single region of KVM and arrange for that :KVM to be mapped to different physical memory for each CPU. (Solaris :does or did this). This means that the code to update the counter :doesn't need to know whether a counter is per-CPU or not. : :The code to read the counters _does_ need to know that the counters :are per-CPU and have to sum all the individual counters - which is :more expensive than a straight read, but is normally far less frequent. 
: :Peter Something like galloc()/gfree().

    offset = galloc(bytes);   /* allocate space in all cpu's per-cpu struct */
    gfree(offset, bytes);     /* return previously reserved space */

And then macros to read and write it.

    global_int(offset)        /* returns address of global int @ offset */
    global_quad(offset)       /* returns address of global quad @ offset */

e.g.

    ++*global_quad(ifc->counter_off);

Which GCC ought to be able to optimize fairly easily. This isn't a recommendation, just one way we could do it. -Matt Matthew Dillon To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 15:29:40 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mail5.speakeasy.net (mail5.speakeasy.net [216.254.0.205]) by hub.freebsd.org (Postfix) with ESMTP id D09BD37B405 for ; Wed, 2 Jan 2002 15:29:35 -0800 (PST) Received: (qmail 3644 invoked from network); 2 Jan 2002 23:29:34 -0000 Received: from unknown (HELO laptop.baldwin.cx) ([64.81.54.73]) (envelope-sender ) by mail5.speakeasy.net (qmail-ldap-1.03) with SMTP for ; 2 Jan 2002 23:29:34 -0000 Message-ID: X-Mailer: XFMail 1.4.0 on FreeBSD X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: <200201022323.g02NNjt60197@apollo.backplane.com> Date: Wed, 02 Jan 2002 15:29:20 -0800 (PST) From: John Baldwin To: Matthew Dillon Subject: Re: When to use atomic_ functions? (was: 64 bit counters) Cc: arch@FreeBSD.ORG, Bernd Walter , Mike Smith , Bruce Evans , Michal Mertl , Peter Jeremy Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On 02-Jan-02 Matthew Dillon wrote: > >:This depends on how it is implemented. Obviously >: int counter[NCPUS]; >:will be just as expensive as performing atomic operations, but no-one >:in their right mind would do that.
One approach is to aggregate all >:the per-CPU counters into a single region of KVM and arrange for that >:KVM to be mapped to different physical memory for each CPU. (Solaris >:does or did this). This means that the code to update the counter >:doesn't need to know whether a counter is per-CPU or not. >: >:The code to read the counters _does_ need to know that the counters >:are per-CPU and have to sum all the individual counters - which is >:more expensive than a straight read, but is normally far less frequent. >: >:Peter > > Something like galloc()/gfree().
>
>     offset = galloc(bytes);   /* allocate space in all cpu's per-cpu struct */
>     gfree(offset, bytes);     /* return previously reserved space */
>
> And then macros to read and write it.
>
>     global_int(offset)        /* returns address of global int @ offset */
>     global_quad(offset)       /* returns address of global quad @ offset */
>
> e.g.
>
>     ++*global_quad(ifc->counter_off);
>
> Which GCC ought to be able to optimize fairly easily. > > This isn't a recommendation, just one way we could do it. Look at PCPU_GET/PCPU_SET. Note that since an interrupt can preempt you and push you off onto another CPU, you have to use a critical section while updating per-CPU variables. If desired, some kind of free area could be stuck in struct pcpu (or more likely, struct pcpu would hold a pointer to the area) that could be galloc/gfree'd or some such. -- John Baldwin <>< http://www.FreeBSD.org/~jhb/ "Power Users Use the Power to Serve!"
- http://www.FreeBSD.org/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 15:40:36 2002 Delivered-To: freebsd-arch@freebsd.org Received: from monorchid.lemis.com (monorchid.lemis.com [192.109.197.75]) by hub.freebsd.org (Postfix) with ESMTP id 843BF37B41E; Wed, 2 Jan 2002 15:40:21 -0800 (PST) Received: by monorchid.lemis.com (Postfix, from userid 1004) id 8FBF078313; Thu, 3 Jan 2002 10:10:18 +1030 (CST) Date: Thu, 3 Jan 2002 10:10:18 +1030 From: Greg Lehey To: Josef Karthauser Cc: Steve Price , Murray Stokely , John Baldwin , Alfred Perlstein , arch@freebsd.org, re@freebsd.org Subject: Re: xfree4 by default? Message-ID: <20020103101018.D12254@monorchid.lemis.com> References: <20011231174113.O16101@elvis.mu.org> <20011231162222.V2286@windriver.com> <20011231183928.F37696@bsd.havk.org> <20020102144957.J4619@tao.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20020102144957.J4619@tao.org.uk> User-Agent: Mutt/1.3.23i Organization: The FreeBSD Project Phone: +61-8-8388-8286 Fax: +61-8-8388-8725 Mobile: +61-418-838-708 WWW-Home-Page: http://www.FreeBSD.org/ X-PGP-Fingerprint: 6B 7B C3 8C 61 CD 54 AF 13 24 52 F8 6D A4 95 EF Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Wednesday, 2 January 2002 at 14:49:57 +0000, Josef Karthauser wrote: > On Mon, Dec 31, 2001 at 06:39:28PM -0600, Steve Price wrote: >> On Mon, Dec 31, 2001 at 04:22:22PM -0800, Murray Stokely wrote: >>> >>> I agree that we should make the switch in -STABLE immediately after >>> FreeBSD 4.5 is released. Every other x86 Unix I've been exposed to >>> has been using X4 for over a year. 
The arguments about older >>> supported hardware do not hold water any more since X4 supports so >>> many newer chipsets out of the box that X336 can't handle. >> >> All of the recent package builds have been with XFREE86_VERSION=4. >> I'm not sure how long for sure but for as long as I can remember >> which isn't saying much. Making it the default shouldn't be all >> that difficult in bsd.port.mk. Don't know how hard it would be >> for sysinstall and friends. > > What's the best way to use it? We appear to have a monolithic port > XFree86-4, and then -clients, -documents, -libraries, -manuals, imake, > etc. The dependencies don't appear to work too well between them, when > using portupgrade. I'm all for the change, but I'd like to see some direction pretty soon: I'm about to hand in the final manuscripts for "The Complete FreeBSD", 4th edition, and I'd like to be able to describe it there. Does XFree86 4 have a useful configuration utility yet? Greg -- See complete headers for address and phone numbers To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 15:41:21 2002 Delivered-To: freebsd-arch@freebsd.org Received: from winston.freebsd.org (adsl-64-173-15-98.dsl.sntc01.pacbell.net [64.173.15.98]) by hub.freebsd.org (Postfix) with ESMTP id 6BB7237B417; Wed, 2 Jan 2002 15:41:10 -0800 (PST) Received: from winston.freebsd.org (jkh@localhost [127.0.0.1]) by winston.freebsd.org (8.11.6/8.11.6) with ESMTP id g02Nf5E88564; Wed, 2 Jan 2002 15:41:06 -0800 (PST) (envelope-from jkh@winston.freebsd.org) To: John Baldwin Cc: re@FreeBSD.ORG, arch@FreeBSD.ORG, Alfred Perlstein , Murray Stokely , Stephen McKay Subject: Re: xfree4 by default? In-Reply-To: Message from John Baldwin of "Wed, 02 Jan 2002 14:27:31 PST."
Date: Wed, 02 Jan 2002 15:41:05 -0800 Message-ID: <88560.1010014865@winston.freebsd.org> From: Jordan Hubbard Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Sorry, we weren't talking about 4.5, we were talking about satisfying the "let's cut -stable over immediately after release" request. - Jordan > > On 02-Jan-02 Jordan Hubbard wrote: > > Various things have to be coordinated in parallel for this to work > > seamlessly. How ready is the ports/package team ready to do a complete > > cut-over for the affected branch? > > I think it's a bit late to do this for 4.5. We are already in code freeze, and > I'd rather wait until after 4.5 to take on a change of this magnitude. > > > - Jordan > > > >> On Tuesday, 1st January 2002, Jordan Hubbard wrote: > >> > >> >It wouldn't be that hard in sysinstall either, depending on how > >> >the X bits are packaged. FWIW, I also think that XFree86 4.x's time > >> >has come. > >> > >> None of my current video cards work properly with 3.3.6, so I'm all for > >> adding 4.1.0 immediately, rather than post release. Any chance? > >> > >> Stephen. > > > > -- > > John Baldwin <>< http://www.FreeBSD.org/~jhb/ > "Power Users Use the Power to Serve!"
- http://www.FreeBSD.org/ > > To Unsubscribe: send mail to majordomo@FreeBSD.org > with "unsubscribe freebsd-arch" in the body of the message To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 16: 2:18 2002 Delivered-To: freebsd-arch@freebsd.org Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by hub.freebsd.org (Postfix) with ESMTP id A681537B41A; Wed, 2 Jan 2002 16:02:15 -0800 (PST) Received: (from dillon@localhost) by apollo.backplane.com (8.11.6/8.9.1) id g0302Eo60575; Wed, 2 Jan 2002 16:02:14 -0800 (PST) (envelope-from dillon) Date: Wed, 2 Jan 2002 16:02:14 -0800 (PST) From: Matthew Dillon Message-Id: <200201030002.g0302Eo60575@apollo.backplane.com> To: John Baldwin Cc: arch@FreeBSD.ORG, Bernd Walter , Mike Smith , Bruce Evans , Michal Mertl , Peter Jeremy Subject: Re: When to use atomic_ functions? (was: 64 bit counters) References: Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG :Look at PCPU_GET/PCPU_SET. Note that since an interrupt can preempt you and :push you off onto another CPU, you have to use a critical section while :updating per-CPU variables. If desired, some kind of free area could be stuck :in struct pcpu (or more likely, struct pcpu would hold a pointer to the area) :that could be galloc/gfree'd or some such. : :-- : :John Baldwin <>< http://www.FreeBSD.org/~jhb/ :"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ Maybe we are going about this all wrong. If a particular interface counter can only be modified from the device interrupt, or only be modified while holding the appropriate mutex, do we need any locking at all? 
-Matt Matthew Dillon To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 16: 7:42 2002 Delivered-To: freebsd-arch@freebsd.org Received: from elvis.mu.org (elvis.mu.org [216.33.66.196]) by hub.freebsd.org (Postfix) with ESMTP id 0A24E37B41C; Wed, 2 Jan 2002 16:07:40 -0800 (PST) Received: by elvis.mu.org (Postfix, from userid 1192) id 7958C81D03; Wed, 2 Jan 2002 18:07:34 -0600 (CST) Date: Wed, 2 Jan 2002 18:07:34 -0600 From: Alfred Perlstein To: Matthew Dillon Cc: John Baldwin , arch@FreeBSD.ORG, Bernd Walter , Mike Smith , Bruce Evans , Michal Mertl , Peter Jeremy Subject: Re: When to use atomic_ functions? (was: 64 bit counters) Message-ID: <20020102180734.A82406@elvis.mu.org> References: <200201030002.g0302Eo60575@apollo.backplane.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <200201030002.g0302Eo60575@apollo.backplane.com>; from dillon@apollo.backplane.com on Wed, Jan 02, 2002 at 04:02:14PM -0800 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG * Matthew Dillon [020102 18:02] wrote: > :Look at PCPU_GET/PCPU_SET. Note that since an interrupt can preempt you and > :push you off onto another CPU, you have to use a critical section while > :updating per-CPU variables. If desired, some kind of free area could be stuck > :in struct pcpu (or more likely, struct pcpu would hold a pointer to the area) > :that could be galloc/gfree'd or some such. > : > :-- > : > :John Baldwin <>< http://www.FreeBSD.org/~jhb/ > :"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ > > Maybe we are going about this all wrong. 
If a particular interface > counter can only be modified from the device interrupt, or only be > modified while holding the appropriate mutex, do we need any locking > at all? Yes against the collector unless the collector is run periodically on each cpu to collect the stats. -- -Alfred Perlstein [alfred@freebsd.org] 'Instead of asking why a piece of software is using "1970s technology," start asking why software is ignoring 30 years of accumulated wisdom.' Tax deductable donations for FreeBSD: http://www.freebsdfoundation.org/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 16:13: 0 2002 Delivered-To: freebsd-arch@freebsd.org Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by hub.freebsd.org (Postfix) with ESMTP id 8524937B41E; Wed, 2 Jan 2002 16:12:56 -0800 (PST) Received: (from dillon@localhost) by apollo.backplane.com (8.11.6/8.9.1) id g030Cgp60752; Wed, 2 Jan 2002 16:12:42 -0800 (PST) (envelope-from dillon) Date: Wed, 2 Jan 2002 16:12:42 -0800 (PST) From: Matthew Dillon Message-Id: <200201030012.g030Cgp60752@apollo.backplane.com> To: Alfred Perlstein Cc: John Baldwin , arch@FreeBSD.ORG, Bernd Walter , Mike Smith , Bruce Evans , Michal Mertl , Peter Jeremy Subject: Re: When to use atomic_ functions? (was: 64 bit counters) References: <200201030002.g0302Eo60575@apollo.backplane.com> <20020102180734.A82406@elvis.mu.org> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG :> Maybe we are going about this all wrong. If a particular interface :> counter can only be modified from the device interrupt, or only be :> modified while holding the appropriate mutex, do we need any locking :> at all? : :Yes against the collector unless the collector is run periodically :on each cpu to collect the stats. 
: :-- :-Alfred Perlstein [alfred@freebsd.org] If we standardize the mutex used by interface device-driver interrupts (if it isn't already done), the collector could obtain the mutex when reading the counter, yes? -Matt To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 16:13:47 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mail5.speakeasy.net (mail5.speakeasy.net [216.254.0.205]) by hub.freebsd.org (Postfix) with ESMTP id 4352437B420 for ; Wed, 2 Jan 2002 16:13:14 -0800 (PST) Received: (qmail 3785 invoked from network); 3 Jan 2002 00:13:13 -0000 Received: from unknown (HELO laptop.baldwin.cx) ([64.81.54.73]) (envelope-sender ) by mail5.speakeasy.net (qmail-ldap-1.03) with SMTP for ; 3 Jan 2002 00:13:13 -0000 Message-ID: X-Mailer: XFMail 1.4.0 on FreeBSD X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: <200201030002.g0302Eo60575@apollo.backplane.com> Date: Wed, 02 Jan 2002 16:13:00 -0800 (PST) From: John Baldwin To: Matthew Dillon Subject: Re: When to use atomic_ functions? (was: 64 bit counters) Cc: Peter Jeremy , Michal Mertl , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On 03-Jan-02 Matthew Dillon wrote: >:Look at PCPU_GET/PCPU_SET. Note that since an interrupt can preempt you and >:push you off onto another CPU, you have to use a critical section while >:updating per-CPU variables. If desired, some kind of free area could be >:stuck >:in struct pcpu (or more likely, struct pcpu would hold a pointer to the area) >:that could be galloc/gfree'd or some such. >: >:-- >: >:John Baldwin <>< http://www.FreeBSD.org/~jhb/ >:"Power Users Use the Power to Serve!" 
- http://www.FreeBSD.org/ > > Maybe we are going about this all wrong. If a particular interface > counter can only be modified from the device interrupt, or only be > modified while holding the appropriate mutex, do we need any locking > at all? Note that critical sections don't impose locking, right now they just disable interrupts on the local CPU. Eventually they will also prevent preemptions for any setrunqueue's done inside a critical section and defer the switches until the critical section is exited. If you pin processes/threads to CPU's when they get interrupted so they resume on the same CPU and only migrate at setrunqueue(), then you still might need to disable interrupts if your update of a per-CPU variable isn't atomic since when you return to the thread, it might do a modify-write of a stale variable. Think of an interrupt handler being interrupted by another interrupt. Thus, I think it would still be wise to disable interrupts for per-CPU stuff. At least, for ones that can be modified by interrupt handlers. Also, per-thread counters don't need locking. > -Matt > Matthew Dillon > -- John Baldwin <>< http://www.FreeBSD.org/~jhb/ "Power Users Use the Power to Serve!" 
- http://www.FreeBSD.org/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 16:19:36 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mail.hiwaay.net (fly.HiWAAY.net [208.147.154.56]) by hub.freebsd.org (Postfix) with ESMTP id 18C5237B419; Wed, 2 Jan 2002 16:19:33 -0800 (PST) Received: from bsd.havk.org (user-24-214-88-13.knology.net [24.214.88.13]) by mail.hiwaay.net (8.12.1/8.12.1) with ESMTP id g030JSBZ024283; Wed, 2 Jan 2002 18:19:29 -0600 (CST) Received: by bsd.havk.org (Postfix, from userid 1001) id B8B401A786; Wed, 2 Jan 2002 18:19:26 -0600 (CST) Date: Wed, 2 Jan 2002 18:19:26 -0600 From: Steve Price To: John Baldwin Cc: Jordan Hubbard , re@FreeBSD.ORG, arch@FreeBSD.ORG, Alfred Perlstein , Murray Stokely , Stephen McKay Subject: Re: xfree4 by default? Message-ID: <20020102181926.M37696@bsd.havk.org> References: <88159.1010007244@winston.freebsd.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: ; from jhb@FreeBSD.org on Wed, Jan 02, 2002 at 02:27:31PM -0800 X-Operating-System: FreeBSD 4.5-PRERELEASE i386 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Wed, Jan 02, 2002 at 02:27:31PM -0800, John Baldwin wrote: > > On 02-Jan-02 Jordan Hubbard wrote: > > Various things have to be coordinated in parallel for this to work > > seamlessly. How ready is the ports/package team ready to do a complete > > cut-over for the affected branch? > > I think it's a bit late to do this for 4.5. Actually as I said earlier the package builds have been with XF4 for some time now. I had to put it back to using XF3 for the release. 
-steve To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 16:24:48 2002 Delivered-To: freebsd-arch@freebsd.org Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by hub.freebsd.org (Postfix) with ESMTP id 2D93237B41C; Wed, 2 Jan 2002 16:24:45 -0800 (PST) Received: (from dillon@localhost) by apollo.backplane.com (8.11.6/8.9.1) id g030Oip60860; Wed, 2 Jan 2002 16:24:44 -0800 (PST) (envelope-from dillon) Date: Wed, 2 Jan 2002 16:24:44 -0800 (PST) From: Matthew Dillon Message-Id: <200201030024.g030Oip60860@apollo.backplane.com> To: John Baldwin Cc: Peter Jeremy , Michal Mertl , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG Subject: Re: When to use atomic_ functions? (was: 64 bit counters) References: Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG :Note that critical sections don't impose locking, right now they just disable :interrupts on the local CPU. Eventually they will also prevent preemptions for :any setrunqueue's done inside a critical section and defer the switches until :the critical section is exited. If you pin processes/threads to CPU's when :they get interrupted so they resume on the same CPU and only migrate at :setrunqueue(), then you still might need to disable interrupts if your update :of a per-CPU variable isn't atomic since when you return to the thread, it :might do a modify-write of a stale variable. Think of an interrupt handler :being interrupted by another interrupt. Thus, I think it would still be wise :to disable interrupts for per-CPU stuff. At least, for ones that can be :modified by interrupt handlers. Also, per-thread counters don't need locking. 
But if it is protected by a mutex, and an interrupt occurs while you hold the mutex, the interrupt thread will not be able to run (or at least will wind up blocking while getting the mutex) until you release your mutex, at which point your modifications have been synchronized out (releasing the mutex ensures this).

The critical section stuff would be more palatable if it weren't so expensive. Couldn't we just have a per-cpu critical section count and defer the interrupt? (e.g. like the deferred mechanism we used for spl()s). Then we would have an incredibly cheap mechanism for accessing per-cpu caches (like per-cpu mbuf freelists, for example) which could further be adapted for use by zalloc[i]() and malloc().

I am really beginning to hate not being able to depend on anything without having to make expensive calls first. sti and cli are very expensive instructions.

(We are really talking about two different ways to increment a counter here... one way using a mutex for protection, the other way using a critical section and per-cpu space).
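[Editorial sketch, not part of the original message.] The "per-cpu critical section count and defer the interrupt" idea can be illustrated with a small single-threaded simulation in portable C. Everything in it is hypothetical: `struct cpu_state`, `crit_enter()`, `crit_exit()`, `intr_arrive()` and `rx_intr()` are names invented for illustration, not FreeBSD's critical_enter()/critical_exit(). A real kernel would also have to handle an interrupt arriving between the nesting-count decrement and the pending check, which this simulation cannot model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NIRQ 8                      /* assumed number of interrupt lines */

/* Hypothetical per-CPU state; a kernel would keep this in its per-CPU area. */
struct cpu_state {
    int      crit_nest;                          /* critical-section nesting depth */
    unsigned pending;                            /* bitmask of deferred interrupts */
    void   (*handler[NIRQ])(struct cpu_state *); /* one handler per line */
    uint64_t rx_packets;                         /* example per-CPU counter */
};

/* Entering a critical section only bumps a counter: no cli/sti needed. */
static void crit_enter(struct cpu_state *cpu)
{
    cpu->crit_nest++;
}

/* Leaving the outermost section runs whatever was deferred meanwhile. */
static void crit_exit(struct cpu_state *cpu)
{
    assert(cpu->crit_nest > 0);
    if (--cpu->crit_nest > 0)
        return;
    while (cpu->pending != 0) {
        for (unsigned irq = 0; irq < NIRQ; irq++) {
            if (cpu->pending & (1u << irq)) {
                cpu->pending &= ~(1u << irq);
                cpu->handler[irq](cpu);
            }
        }
    }
}

/* Simulated interrupt arrival: defer inside a critical section, else run now. */
static void intr_arrive(struct cpu_state *cpu, unsigned irq)
{
    assert(irq < NIRQ && cpu->handler[irq] != NULL);
    if (cpu->crit_nest > 0)
        cpu->pending |= 1u << irq;
    else
        cpu->handler[irq](cpu);
}

/* Example handler: a non-atomic update of a per-CPU counter. */
static void rx_intr(struct cpu_state *cpu)
{
    cpu->rx_packets++;
}
```

The point of the design is that the common path (enter and exit with nothing pending) touches only per-CPU memory; the cost of actually handling the interrupt is paid once, on the outermost exit.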
-Matt To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 16:29:40 2002 Delivered-To: freebsd-arch@freebsd.org Received: from srv1.cosmo-project.de (srv1.cosmo-project.de [213.83.6.106]) by hub.freebsd.org (Postfix) with ESMTP id F1AD837B41E; Wed, 2 Jan 2002 16:29:36 -0800 (PST) Received: (from uucp@localhost) by srv1.cosmo-project.de (8.11.6/8.11.6) with UUCP id g030TQO05744; Thu, 3 Jan 2002 01:29:27 +0100 (CET) (envelope-from ticso@cicely9.cicely.de) Received: from mail.cicely.de (cicely20.cicely.de [10.1.1.22]) by cicely5.cicely.de (8.12.1/8.12.1) with ESMTP id g030PNtx046147; Thu, 3 Jan 2002 01:25:24 +0100 (CET)?g (envelope-from ticso@cicely9.cicely.de) Received: from cicely9.cicely.de (cicely9.cicely.de [10.1.7.11]) by mail.cicely.de (8.11.0/8.11.0) with ESMTP id g030PNW12902; Thu, 3 Jan 2002 01:25:23 +0100 (CET) Received: (from ticso@localhost) by cicely9.cicely.de (8.11.6/8.11.6) id g030PM162274; Thu, 3 Jan 2002 01:25:22 +0100 (CET) (envelope-from ticso) Date: Thu, 3 Jan 2002 01:25:22 +0100 From: Bernd Walter To: Michal Mertl , Matthew Dillon , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG Subject: Re: When to use atomic_ functions? 
(was: 64 bit counters) Message-ID: <20020103002521.GB53199@cicely9.cicely.de> References: <200201012349.g01NnKA40071@apollo.backplane.com> <20020103095701.B561@gsmx07.alcatel.com.au> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20020103095701.B561@gsmx07.alcatel.com.au> User-Agent: Mutt/1.3.24i X-Operating-System: FreeBSD cicely9.cicely.de 5.0-CURRENT alpha Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Thu, Jan 03, 2002 at 09:57:02AM +1100, Peter Jeremy wrote: > On 2002-Jan-02 15:53:55 +0100, Michal Mertl wrote: > >I don't know how much time will be wasted - my measurements on pII show > >the atomic_ operations aren't that expensive. > > As Matt has pointed out, this is only true if you have a single > processor. Atomic operations always translate into bus cycles - and > the bus is roughly an order of magnitude slower than the CPU core for > current CPUs. The worst situation is where a common counter is > updated by a random CPU - the counter will virtually always be in > another CPU's cache, requiring multiple bus cycles to transfer the > data. > > Also, many RISC processors (eg Alpha) don't have locked read-modify- > write primitives. On the Alpha, you need an instruction sequence: > loop: load_locked memory->register > update register > store_conditional register->memory > if not success goto loop > with a few memory barriers added to ensure that the load/store are > visible to other CPUs. The store_conditional will fail if your CPU > was interrupted or if another CPU updated an implementation-defined > region including the specified memory address. (64-bit atomic > operations on the IA32 use the same approach - using CMPXCHG8B as the > store_conditional instruction). My Alpha Architecture Handbook says that the barrier is unneeded. I have no clue why they are there. 
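[Editorial sketch, not part of the original message.] The load_locked/store_conditional retry loop quoted above has the same shape as a compare-and-swap loop. As a rough illustration, assuming a C11 compiler with `<stdatomic.h>` (which did not exist when this thread was written), a 64-bit counter add without any extra memory barriers might look like this. `counter_add64()` is a hypothetical name, not a FreeBSD atomic(9) function.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Add delta to a 64-bit counter with a compare-and-swap retry loop.
 * memory_order_relaxed asks only for atomicity of the counter itself,
 * not for ordering against other memory accesses.
 */
static uint64_t counter_add64(_Atomic uint64_t *ctr, uint64_t delta)
{
    assert(ctr != NULL);
    uint64_t old = atomic_load_explicit(ctr, memory_order_relaxed);

    /* On failure the current value is written back into 'old', so retry. */
    while (!atomic_compare_exchange_weak_explicit(ctr, &old, old + delta,
            memory_order_relaxed, memory_order_relaxed))
        ;
    return old + delta;
}
```

On a 32-bit x86 target a compiler would typically implement the exchange with a locked CMPXCHG8B, and on LL/SC machines with a load-locked/store-conditional pair, but that choice belongs to the compiler and is not guaranteed by this code.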
-- B.Walter COSMO-Project http://www.cosmo-project.de ticso@cicely.de Usergroup info@cosmo-project.de To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 16:34:36 2002 Delivered-To: freebsd-arch@freebsd.org Received: from srv1.cosmo-project.de (srv1.cosmo-project.de [213.83.6.106]) by hub.freebsd.org (Postfix) with ESMTP id A6A7D37B41B; Wed, 2 Jan 2002 16:34:32 -0800 (PST) Received: (from uucp@localhost) by srv1.cosmo-project.de (8.11.6/8.11.6) with UUCP id g030YHc06069; Thu, 3 Jan 2002 01:34:17 +0100 (CET) (envelope-from ticso@cicely9.cicely.de) Received: from mail.cicely.de (cicely20.cicely.de [10.1.1.22]) by cicely5.cicely.de (8.12.1/8.12.1) with ESMTP id g030WGtx046285; Thu, 3 Jan 2002 01:32:16 +0100 (CET)?g (envelope-from ticso@cicely9.cicely.de) Received: from cicely9.cicely.de (cicely9.cicely.de [10.1.7.11]) by mail.cicely.de (8.11.0/8.11.0) with ESMTP id g030WFW12917; Thu, 3 Jan 2002 01:32:15 +0100 (CET) Received: (from ticso@localhost) by cicely9.cicely.de (8.11.6/8.11.6) id g030WFG62289; Thu, 3 Jan 2002 01:32:15 +0100 (CET) (envelope-from ticso) Date: Thu, 3 Jan 2002 01:32:15 +0100 From: Bernd Walter To: Matthew Dillon Cc: John Baldwin , arch@FreeBSD.ORG, Bernd Walter , Mike Smith , Bruce Evans , Michal Mertl , Peter Jeremy Subject: Re: When to use atomic_ functions? 
(was: 64 bit counters) Message-ID: <20020103003214.GC53199@cicely9.cicely.de> References: <200201030002.g0302Eo60575@apollo.backplane.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200201030002.g0302Eo60575@apollo.backplane.com> User-Agent: Mutt/1.3.24i X-Operating-System: FreeBSD cicely9.cicely.de 5.0-CURRENT alpha Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Wed, Jan 02, 2002 at 04:02:14PM -0800, Matthew Dillon wrote: > :Look at PCPU_GET/PCPU_SET. Note that since an interrupt can preempt you and > :push you off onto another CPU, you have to use a critical section while > :updating per-CPU variables. If desired, some kind of free area could be stuck > :in struct pcpu (or more likely, struct pcpu would hold a pointer to the area) > :that could be galloc/gfree'd or some such. > : > :-- > : > :John Baldwin <>< http://www.FreeBSD.org/~jhb/ > :"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ > > Maybe we are going about this all wrong. If a particular interface > counter can only be modified from the device interrupt, or only be > modified while holding the appropriate mutex, do we need any locking > at all? You need to hold the mutex while both writing and reading. If you hold the mutex only while writing, another CPU might still use old cached values. The same goes for the device interrupt, as it doesn't ensure inter-CPU consistency.
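[Editorial sketch, not part of the original message.] The rule "hold the mutex while writing and reading" can be shown in userland C, with a POSIX mutex standing in for a kernel mutex. `struct if_counter` and the `if_counter_*()` functions are hypothetical names for illustration only. On a 32-bit machine the reader-side lock also prevents reading a half-updated 64-bit value.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* A counter whose value is only ever touched with 'lock' held. */
struct if_counter {
    pthread_mutex_t lock;
    uint64_t        value;
};

static void if_counter_init(struct if_counter *c)
{
    int rc = pthread_mutex_init(&c->lock, NULL);
    assert(rc == 0);
    (void)rc;                       /* silence unused warning under NDEBUG */
    c->value = 0;
}

/* Writer: the increment itself is a plain, non-atomic add. */
static void if_counter_add(struct if_counter *c, uint64_t n)
{
    pthread_mutex_lock(&c->lock);
    c->value += n;
    pthread_mutex_unlock(&c->lock);
}

/* Reader (the "collector"): takes the same lock, so it sees a complete,
 * up-to-date value rather than a stale or torn one. */
static uint64_t if_counter_read(struct if_counter *c)
{
    pthread_mutex_lock(&c->lock);
    uint64_t v = c->value;
    pthread_mutex_unlock(&c->lock);
    return v;
}
```

The cost is that every reader contends with the writers, which is tolerable when reads are rare, as with a statistics collector.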
-- B.Walter COSMO-Project http://www.cosmo-project.de ticso@cicely.de Usergroup info@cosmo-project.de To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 16:39:47 2002 Delivered-To: freebsd-arch@freebsd.org Received: from netau1.alcanet.com.au (ntp.alcanet.com.au [203.62.196.27]) by hub.freebsd.org (Postfix) with ESMTP id 0CCA637B41A; Wed, 2 Jan 2002 16:39:43 -0800 (PST) Received: from mfg1.cim.alcatel.com.au (mfg1.cim.alcatel.com.au [139.188.23.1]) by netau1.alcanet.com.au (8.9.3 (PHNE_22672)/8.9.3) with ESMTP id LAA04333; Thu, 3 Jan 2002 11:39:23 +1100 (EDT) Received: from gsmx07.alcatel.com.au by cim.alcatel.com.au (PMDF V5.2-32 #37641) with ESMTP id <01KCMSW7KKS0VFJIUZ@cim.alcatel.com.au>; Thu, 3 Jan 2002 11:38:30 +1100 Received: (from jeremyp@localhost) by gsmx07.alcatel.com.au (8.11.6/8.11.6) id g030dKo01470; Thu, 03 Jan 2002 11:39:20 +1100 Content-return: prohibited Date: Thu, 03 Jan 2002 11:39:20 +1100 From: Peter Jeremy Subject: Re: When to use atomic_ functions? 
(was: 64 bit counters) In-reply-to: <20020103002521.GB53199@cicely9.cicely.de>; from ticso@cicely9.cicely.de on Thu, Jan 03, 2002 at 01:25:22AM +0100 To: Bernd Walter Cc: Michal Mertl , Matthew Dillon , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG Mail-Followup-To: Bernd Walter , Michal Mertl , Matthew Dillon , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG Message-id: <20020103113919.E561@gsmx07.alcatel.com.au> MIME-version: 1.0 Content-type: text/plain; charset=us-ascii Content-disposition: inline User-Agent: Mutt/1.2.5i References: <200201012349.g01NnKA40071@apollo.backplane.com> <20020103095701.B561@gsmx07.alcatel.com.au> <20020103002521.GB53199@cicely9.cicely.de> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On 2002-Jan-03 01:25:22 +0100, Bernd Walter wrote: >On Thu, Jan 03, 2002 at 09:57:02AM +1100, Peter Jeremy wrote: >> Also, many RISC processors (eg Alpha) don't have locked read-modify- >> write primitives. On the Alpha, you need an instruction sequence: >> loop: load_locked memory->register >> update register >> store_conditional register->memory >> if not success goto loop >> with a few memory barriers added to ensure that the load/store are >> visible to other CPUs. The store_conditional will fail if your CPU >> was interrupted or if another CPU updated an implementation-defined >> region including the specified memory address. (64-bit atomic >> operations on the IA32 use the same approach - using CMPXCHG8B as the >> store_conditional instruction). > >My Alpha Architecture Handbook says that the barrier is unneeded. >I have no clue why they are there. You're right. Version 2, section 5.5 shows that they aren't needed when a single datum is being atomically updated (as needed here). They're only needed where the atomic operation is seizing a lock so that a larger structure can be atomically updated. 
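[Editorial sketch, not part of the original message.] The distinction drawn here, no barrier for a single atomically updated datum but barriers when the atomic operation seizes a lock guarding a larger structure, maps onto C11 memory orders roughly as follows. This assumes a C11 compiler; `ipackets`, `count_packet()`, `struct if_stats` and `stats_update()` are hypothetical names, and the spin loop is a toy, not a production lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Case 1: a lone counter. Atomicity is enough, so no ordering is requested. */
static _Atomic uint64_t ipackets;

static void count_packet(void)
{
    atomic_fetch_add_explicit(&ipackets, 1, memory_order_relaxed);
}

/* Case 2: two fields that must change together, guarded by a spin flag. */
struct if_stats {
    atomic_flag lock;
    uint64_t    packets;
    uint64_t    bytes;
};

static void stats_update(struct if_stats *s, uint64_t len)
{
    assert(s != NULL);
    /* Acquire: the protected accesses below cannot move above the lock. */
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ;                           /* spin until the flag was clear */
    s->packets += 1;
    s->bytes   += len;
    /* Release: the updates above are visible before the lock looks free. */
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}
```

Only the second case needs the barrier-like acquire and release semantics; using them for the first case as well would be correct but would pay for ordering that a plain statistics counter does not need.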
Peter To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 16:43:11 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mail12.speakeasy.net (mail12.speakeasy.net [216.254.0.212]) by hub.freebsd.org (Postfix) with ESMTP id BFE6637B41A for ; Wed, 2 Jan 2002 16:43:07 -0800 (PST) Received: (qmail 4926 invoked from network); 3 Jan 2002 00:43:06 -0000 Received: from unknown (HELO laptop.baldwin.cx) ([64.81.54.73]) (envelope-sender ) by mail12.speakeasy.net (qmail-ldap-1.03) with SMTP for ; 3 Jan 2002 00:43:06 -0000 Message-ID: X-Mailer: XFMail 1.4.0 on FreeBSD X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: <200201030024.g030Oip60860@apollo.backplane.com> Date: Wed, 02 Jan 2002 16:42:53 -0800 (PST) From: John Baldwin To: Matthew Dillon Subject: Re: When to use atomic_ functions? (was: 64 bit counters) Cc: arch@FreeBSD.ORG, Bernd Walter , Mike Smith , Bruce Evans , Michal Mertl , Peter Jeremy Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On 03-Jan-02 Matthew Dillon wrote: >:Note that critical sections don't impose locking, right now they just disable >:interrupts on the local CPU. Eventually they will also prevent preemptions >:for >:any setrunqueue's done inside a critical section and defer the switches until >:the critical section is exited. If you pin processes/threads to CPU's when >:they get interrupted so they resume on the same CPU and only migrate at >:setrunqueue(), then you still might need to disable interrupts if your update >:of a per-CPU variable isn't atomic since when you return to the thread, it >:might do a modify-write of a stale variable. Think of an interrupt handler >:being interrupted by another interrupt. 
Thus, I think it would still be wise >:to disable interrupts for per-CPU stuff. At least, for ones that can be >:modified by interrupt handlers. Also, per-thread counters don't need >:locking. > > But if it is protected by a mutex, and an interrupt occurs while you > hold the mutex, the interrupt thread will not be able to run (or > at least will wind up blocking while getting the mutex) until you release > your mutex, at which point your modifications have been synchronized out > (releasing the mutex ensures this). Yes, you can use a mutex and that will work fine. If the data you are protecting is per-CPU, then to prevent migration in the current model you still need to use a critical section. > The critical section stuff would be more palatable if it weren't so > expensive. Couldn't we just have a per-cpu critical section count > and defer the interrupt? (e.g. like the deferred mechanism we used for > spl()s). Then we would have an incredibly cheap mechanism for accessing > per-cpu caches (like per-cpu mbuf freelists, for example) which could > further be adapted for use by zalloc[i]() and malloc(). Err, it does maintain a count right now and only does the actual change of interrupt state when entering and exiting critical sections at the top level. sti and cli aren't very expensive, though. Critical sections are more expensive on other archs, though. For example, it's a PAL call on alpha. Although, spl's were PAL calls on alpha as well. On ia64 it's a single instruction but one that requires a stop. Actually, we could let most critical sections allow interrupts. The only critical sections that really need to disable interrupts are those in spinlocks shared with bottom-half code (namely, the locks in the sio family of drivers and the scheduler lock used to schedule interrupt threads). In theory, most other critical sections need only tweak their per-thread nesting count (which doesn't need a lock).
This has been sitting in the back of my mind for a while but I want to figure out how to do this cleanly. One idea is to allow spinlocks to use slightly different versions of the critical enter/exit calls that do disable interrupts if needed and enable them again when needed. Right now the current critical_enter/exit code assumes that it needs to enable interrupts (or disable them for that matter) when crossing nesting 0. What I could do is add a per-thread variable to mark the first nesting level at which we were actually told to disable interrupts. We could then have the nesting count default to -1 (as a special token meaning interrupts haven't been disabled yet) and when entering a critical section for a spin lock, if the per-thread first-disabled nesting level (needs a better name) is -1, we disable interrupts and save the returned state from cpu_critical_enter() and then restore it on the right critical_exit(). This would require that one not interlock critical_exit with spin locks, i.e. do critical_enter() / mtx_lock_spin() / critical_exit(); that would break in this algorithm but doesn't break right now. I'm ok with this limitation though. This won't change the critical API for consumers other than spin locks and just needs a different critical_enter() for spin locks. Hmm, OTOH, I could fix that last case by using a separate nesting count (rather than first nesting level) for the interrupt-disabled critical_enter/exits. That's a bit too complicated though, I think. Anyways, this can fit into the current API under the covers as an optimization later on. -- John Baldwin <>< http://www.FreeBSD.org/~jhb/ "Power Users Use the Power to Serve!"
- http://www.FreeBSD.org/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 17: 9:28 2002 Delivered-To: freebsd-arch@freebsd.org Received: from srv1.cosmo-project.de (srv1.cosmo-project.de [213.83.6.106]) by hub.freebsd.org (Postfix) with ESMTP id E48A537B41A; Wed, 2 Jan 2002 17:09:22 -0800 (PST) Received: (from uucp@localhost) by srv1.cosmo-project.de (8.11.6/8.11.6) with UUCP id g0319Jd06655; Thu, 3 Jan 2002 02:09:20 +0100 (CET) (envelope-from ticso@cicely9.cicely.de) Received: from mail.cicely.de (cicely20.cicely.de [10.1.1.22]) by cicely5.cicely.de (8.12.1/8.12.1) with ESMTP id g0316Ctx046505; Thu, 3 Jan 2002 02:06:12 +0100 (CET)?g (envelope-from ticso@cicely9.cicely.de) Received: from cicely9.cicely.de (cicely9.cicely.de [10.1.7.11]) by mail.cicely.de (8.11.0/8.11.0) with ESMTP id g0316CW12934; Thu, 3 Jan 2002 02:06:12 +0100 (CET) Received: (from ticso@localhost) by cicely9.cicely.de (8.11.6/8.11.6) id g0316B562375; Thu, 3 Jan 2002 02:06:11 +0100 (CET) (envelope-from ticso) Date: Thu, 3 Jan 2002 02:06:11 +0100 From: Bernd Walter To: Matthew Dillon Cc: John Baldwin , Peter Jeremy , Michal Mertl , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG Subject: Re: When to use atomic_ functions? 
(was: 64 bit counters) Message-ID: <20020103010611.GD53199@cicely9.cicely.de> References: <200201030024.g030Oip60860@apollo.backplane.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200201030024.g030Oip60860@apollo.backplane.com> User-Agent: Mutt/1.3.24i X-Operating-System: FreeBSD cicely9.cicely.de 5.0-CURRENT alpha Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Wed, Jan 02, 2002 at 04:24:44PM -0800, Matthew Dillon wrote: > :Note that critical sections don't impose locking, right now they just disable > :interrupts on the local CPU. Eventually they will also prevent preemptions for > :any setrunqueue's done inside a critical section and defer the switches until > :the critical section is exited. If you pin processes/threads to CPU's when > :they get interrupted so they resume on the same CPU and only migrate at > :setrunqueue(), then you still might need to disable interrupts if your update > :of a per-CPU variable isn't atomic since when you return to the thread, it > :might do a modify-write of a stale variable. Think of an interrupt handler > :being interrupted by another interrupt. Thus, I think it would still be wise > :to disable interrupts for per-CPU stuff. At least, for ones that can be > :modified by interrupt handlers. Also, per-thread counters don't need locking. > > But if it is protected by a mutex, and an interrupt occurs while you > hold the mutex, the interrupt thread will not be able to run (or > at least will wind up blocking while getting the mutex) until you release > your mutex, at which point your modifications have been synchronized out > (releasing the mutex ensures this). > > The critical section stuff would be more palettable if it weren't so > expensive. Couldn't we just have a per-cpu critical section count > and defer the interrupt? (e.g. 
like the deferred mechanism we used for > spl()s). Then we would have an incredibly cheap mechanism for accessing > per-cpu caches (like per-cpu mbuf freelists, for example) which could > further be adapted for use by zalloc[i]() and malloc(). > > I am really beginning to hate not being able to depend on anything > without having to make expensive calls first. sti and cli are very > expensive instructions. > > (We are really talking about two different ways to increment a counter > here... one way using a mutex for protection, the other way using a > critical section and per-cpu space). A mutex includes atomic_acq/rel calls which not only are atomic_ but also require memory barriers, because they protect unnamed space. Using a mutex for accessing 2 or maybe up to 3 values doesn't bring you anything compared to using atomic_ access directly. critical sections doesn't do better. Using atomic_ functions and per CPU counters still sounds best to me unless there are many of them updated in row. -- B.Walter COSMO-Project http://www.cosmo-project.de ticso@cicely.de Usergroup info@cosmo-project.de To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 17:12:54 2002 Delivered-To: freebsd-arch@freebsd.org Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by hub.freebsd.org (Postfix) with ESMTP id 5F5D337B416; Wed, 2 Jan 2002 17:12:49 -0800 (PST) Received: (from dillon@localhost) by apollo.backplane.com (8.11.6/8.9.1) id g031CnN61108; Wed, 2 Jan 2002 17:12:49 -0800 (PST) (envelope-from dillon) Date: Wed, 2 Jan 2002 17:12:49 -0800 (PST) From: Matthew Dillon Message-Id: <200201030112.g031CnN61108@apollo.backplane.com> To: John Baldwin Cc: arch@FreeBSD.ORG, Bernd Walter , Mike Smith , Bruce Evans , Michal Mertl , Peter Jeremy Subject: Re: When to use atomic_ functions? 
(was: 64 bit counters) References: Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG :> expensive. Couldn't we just have a per-cpu critical section count :> and defer the interrupt? (e.g. like the deferred mechanism we used for :> spl()s). Then we would have an incredibly cheap mechanism for accessing :> per-cpu caches (like per-cpu mbuf freelists, for example) which could :> further be adapted for use by zalloc[i]() and malloc(). : :Err, it does maintain a count right now and only does the actual change of :interrupt state when entering and exiting critical sections at the top-level. :sti and cli aren't very expensive though. Critical sections are more expensive :on other archs though. For example, it's a PAL call on alpha. Although, spl's :were PAL calls on alpha as well. On ia64 it's a single instruction but one that :requires a stop. Most of the critical section code in the system is going to be at the top level. sti and cli *ARE* expensive, that's why the original spl code went to such great lengths to avoid calling them. I believe one or both instructions synchronizes the cpu pipeline, interrupting instruction flow. It is nasty stuff. :Actually, we could allow most critical sections to allow interrupts. The only :critical section that really needs to disable interrupts are those that need :are in spinlocks shared with bottom half code (namely, the locks in the sio I think a general implementation is better. If you introduce all sorts of weird exceptions you only make the code unnecessarily complex and developers programming to the API are far more likely to make serious mistakes and introduce serious bugs into the system. :In theory, most other critical sections need only tweak their per-thread :nesting count (which doesn't need a lock). 
:This has been sitting in the back
:of my mind for a while but I want to figure out how to do this cleanly.

The critical section code *only* needs to bump/drop a per-thread
nesting count, with a simple deferred _cpl check when the critical
section is dropped. If an interrupt occurs on the cpu while the current
thread is in a critical section, it should be deferred just like we
did with _cpl in our spl*() code.

    critical_enter() (inline)
    {
        ++curthread->critical_count;
    }

    critical_exit() (inline or inline/procedure hybrid)
    {
        if (--curthread->critical_count == 0) {
            cpl_t cpl;

            /*
             * curthread->deferred_cpl is only modified by an interrupt
             * if we are in a critical section, so we can safely modify
             * it when we are not in a critical section.
             *
             * Alternatively one could do something more complex to avoid
             * preemptive races, like do a real sti/cli here.
             *
             * As with our SPL code, this case rarely occurs so we can
             * afford to be a bit more heavy-weight when it does occur.
             */
            if ((cpl = curthread->deferred_cpl) != 0) {
                /*
                 * (This part could be in a hybrid procedure, while the
                 * above section could be in an inline)
                 */
                real_sti();
                curthread->deferred_cpl = 0;
                dispatch_deferred_cpl(cpl);
                real_cli();
            }
        }
    }

:One idea is to allow spinlocks to use slightly different versions of the
:critical enter/exit calls that do disable interrupts if needed and enable them
:again when needed. Right now the current critical_enter/exit code assumes that

I think it is overkill too. Keep it simple and it can be used without
having to have a manual reference by your desk to figure out what it
actually does.

:This won't change the critical API for consumers other than spin locks and just
:needs a different critical_enter() for spin locks. Hmm, OTOH, I could fix that
:last case by using a separate nesting count (rather than first nesting level)
:for the interrupt disabled critical_enter/exits. That's a bit too complicated
:though I think.
Anyways, this can fit into the current API under the covers as :an optimization later on. : :-- : :John Baldwin <>< http://www.FreeBSD.org/~jhb/ Yes, definitely too complicated. But I disagree with the 'as an optimization later on'. The perceived cost of these functions colors everyone's implementation of code around them, and some pretty nasty code (due to people trying to avoid perceived-to-be expensive calls) can be the result. Something like this needs to be implemented right off the bat so people can depend on it. I'm willing to do this, if you don't have your fingers in this particular section of code. -Matt Matthew Dillon To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 17:19:47 2002 Delivered-To: freebsd-arch@freebsd.org Received: from srv1.cosmo-project.de (srv1.cosmo-project.de [213.83.6.106]) by hub.freebsd.org (Postfix) with ESMTP id 0B72C37B417; Wed, 2 Jan 2002 17:19:44 -0800 (PST) Received: (from uucp@localhost) by srv1.cosmo-project.de (8.11.6/8.11.6) with UUCP id g031JOJ06834; Thu, 3 Jan 2002 02:19:24 +0100 (CET) (envelope-from ticso@cicely9.cicely.de) Received: from mail.cicely.de (cicely20.cicely.de [10.1.1.22]) by cicely5.cicely.de (8.12.1/8.12.1) with ESMTP id g031Gktx046537; Thu, 3 Jan 2002 02:16:46 +0100 (CET)?g (envelope-from ticso@cicely9.cicely.de) Received: from cicely9.cicely.de (cicely9.cicely.de [10.1.7.11]) by mail.cicely.de (8.11.0/8.11.0) with ESMTP id g031GkW12942; Thu, 3 Jan 2002 02:16:46 +0100 (CET) Received: (from ticso@localhost) by cicely9.cicely.de (8.11.6/8.11.6) id g031Gj862401; Thu, 3 Jan 2002 02:16:45 +0100 (CET) (envelope-from ticso) Date: Thu, 3 Jan 2002 02:16:45 +0100 From: Bernd Walter To: Peter Jeremy Cc: Michal Mertl , Matthew Dillon , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG Subject: Re: When to use atomic_ functions? 
(was: 64 bit counters) Message-ID: <20020103011645.GE53199@cicely9.cicely.de> References: <200201012349.g01NnKA40071@apollo.backplane.com> <20020103095701.B561@gsmx07.alcatel.com.au> <20020103002521.GB53199@cicely9.cicely.de> <20020103113919.E561@gsmx07.alcatel.com.au> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20020103113919.E561@gsmx07.alcatel.com.au> User-Agent: Mutt/1.3.24i X-Operating-System: FreeBSD cicely9.cicely.de 5.0-CURRENT alpha Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Thu, Jan 03, 2002 at 11:39:20AM +1100, Peter Jeremy wrote: > On 2002-Jan-03 01:25:22 +0100, Bernd Walter wrote: > >My Alpha Architecture Handbook says that the barrier is unneeded. > >I have no clue why they are there. > > You're right. Version 2, section 5.5 shows that they aren't needed > when a single datum is being atomically updated (as needed here). > They're only needed where the atomic operation is seizing a lock so > that a larger structure can be atomically updated. I will do this change localy a send a patch to -alpha. Maybe we can also remove the barriers for rel/acq in the non SMP case, but I could also be wrong if drivers depend on them. 
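To make the single-datum case concrete, here is a minimal sketch in portable C11 rather than FreeBSD's atomic(9) interface. The counter name `if_ipackets_sim` and the helper names are assumptions made up for the illustration. A relaxed read-modify-write keeps the update of the counter itself atomic but requests no ordering against other memory, which is the property the Alpha handbook passage quoted above relies on.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical statistics counter; the name is illustrative only. */
static _Atomic uint64_t if_ipackets_sim;

/*
 * Atomic add with no memory barrier semantics: other CPUs never see a
 * torn or lost update of this one variable, but nothing is promised
 * about the ordering of surrounding loads and stores.
 */
static void
counter_add(_Atomic uint64_t *c, uint64_t n)
{
    atomic_fetch_add_explicit(c, n, memory_order_relaxed);
}

/* Relaxed read of the counter value. */
static uint64_t
counter_read(_Atomic uint64_t *c)
{
    return atomic_load_explicit(c, memory_order_relaxed);
}
```

Whether a given platform implements the 64-bit case with a single instruction, a load-locked/store-conditional loop, or a library fallback is up to the compiler and target.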
-- B.Walter COSMO-Project http://www.cosmo-project.de ticso@cicely.de Usergroup info@cosmo-project.de To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 17:30:22 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mail6.speakeasy.net (mail6.speakeasy.net [216.254.0.206]) by hub.freebsd.org (Postfix) with ESMTP id A2AE737B41A for ; Wed, 2 Jan 2002 17:30:17 -0800 (PST) Received: (qmail 20867 invoked from network); 3 Jan 2002 01:30:15 -0000 Received: from unknown (HELO laptop.baldwin.cx) ([64.81.54.73]) (envelope-sender ) by mail6.speakeasy.net (qmail-ldap-1.03) with SMTP for ; 3 Jan 2002 01:30:15 -0000 Message-ID: X-Mailer: XFMail 1.4.0 on FreeBSD X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: <20020103003214.GC53199@cicely9.cicely.de> Date: Wed, 02 Jan 2002 17:30:02 -0800 (PST) From: John Baldwin To: Bernd Walter Subject: Re: When to use atomic_ functions? (was: 64 bit counters) Cc: Peter Jeremy , Michal Mertl , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG, Matthew Dillon Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On 03-Jan-02 Bernd Walter wrote: > On Wed, Jan 02, 2002 at 04:02:14PM -0800, Matthew Dillon wrote: >> :Look at PCPU_GET/PCPU_SET. Note that since an interrupt can preempt you >> :and >> :push you off onto another CPU, you have to use a critical section while >> :updating per-CPU variables. If desired, some kind of free area could be >> :stuck >> :in struct pcpu (or more likely, struct pcpu would hold a pointer to the >> :area) >> :that could be galloc/gfree'd or some such. >> : >> :-- >> : >> :John Baldwin <>< http://www.FreeBSD.org/~jhb/ >> :"Power Users Use the Power to Serve!" 
- http://www.FreeBSD.org/ >> >> Maybe we are going about this all wrong. If a particular interface >> counter can only be modified from the device interrupt, or only be >> modified while holding the appropriate mutex, do we need any locking >> at all? > > You need to hold the mutex while writing and reading. > If you hold the mutex only while writing another CPU might still use > old cached values. Yes. > The same goes for device interrrupt as it doesn't enshure inter CPU > consistency. Yes. -- John Baldwin <>< http://www.FreeBSD.org/~jhb/ "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 17:30:34 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mail11.speakeasy.net (mail11.speakeasy.net [216.254.0.211]) by hub.freebsd.org (Postfix) with ESMTP id F1B0137B41C for ; Wed, 2 Jan 2002 17:30:18 -0800 (PST) Received: (qmail 4651 invoked from network); 3 Jan 2002 01:30:16 -0000 Received: from unknown (HELO laptop.baldwin.cx) ([64.81.54.73]) (envelope-sender ) by mail11.speakeasy.net (qmail-ldap-1.03) with SMTP for ; 3 Jan 2002 01:30:16 -0000 Message-ID: X-Mailer: XFMail 1.4.0 on FreeBSD X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: <200201030112.g031CnN61108@apollo.backplane.com> Date: Wed, 02 Jan 2002 17:30:03 -0800 (PST) From: John Baldwin To: Matthew Dillon Subject: Re: When to use atomic_ functions? (was: 64 bit counters) Cc: Peter Jeremy , Michal Mertl , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On 03-Jan-02 Matthew Dillon wrote: >:Actually, we could allow most critical sections to allow interrupts. 
>:The only
>:critical sections that really need to disable interrupts are those
>:in spinlocks shared with bottom-half code (namely, the locks in the sio
>
> I think a general implementation is better. If you introduce all
> sorts of weird exceptions you only make the code unnecessarily complex
> and developers programming to the API are far more likely to make
> serious mistakes and introduce serious bugs into the system.

Err, no. Only spin locks actually need to disable interrupts.

> The critical section code *only* needs to bump/drop a per-thread
> nesting count, with a simple deferred _cpl check when the critical
> section is dropped. If an interrupt occurs on the cpu while the current
> thread is in a critical section, it should be deferred just like we
> did with _cpl in our spl*() code.

Have you even looked at the current implementation? :)

> critical_enter() (inline)
> {
>     ++curthread->critical_count;
> }
>
> critical_exit() (inline or inline/procedure hybrid)
> {
>     if (--curthread->critical_count == 0) {
>         cpl_t cpl;
>
>         /*
>          * curthread->deferred_cpl is only modified by an interrupt
>          * if we are in a critical section, so we can safely modify
>          * it when we are not in a critical section.
>          *
>          * Alternatively one could do something more complex to avoid
>          * preemptive races, like do a real sti/cli here.
>          *
>          * As with our SPL code, this case rarely occurs so we can
>          * afford to be a bit more heavy-weight when it does occur.
>          */
>         if ((cpl = curthread->deferred_cpl) != 0) {
>             /*
>              * (This part could be in a hybrid procedure, while the
>              * above section could be in an inline)
>              */
>             real_sti();
>             curthread->deferred_cpl = 0;
>             dispatch_deferred_cpl(cpl);
>             real_cli();
>         }
>     }
> }

This requires the bottom-half code to not use any locks at all, which might
be possible. However, it means that if I have an idle CPU, it can't pick up
the interrupt handler for the interrupt that came in until the CPU that gets
its interrupt leaves the handler.
If you want to prevent that, you need to allow access to the scheduler when
the interrupt comes in, which means that when top-half code holds the
sched_lock it needs to disable interrupts.

>:critical enter/exit calls that do disable interrupts if needed and enable
>:them
>:again when needed. Right now the current critical_enter/exit code assumes
>:that
>
> I think it is overkill too. Keep it simple and it can be used without
> having to have a manual reference by your desk to figure out what it
> actually does.

Err, it's not near that complicated. This only changes the calls to
critical_enter/exit in one location. I was saying that the separate nesting
count was a bit too complicated.

>:This won't change the critical API for consumers other than spin locks and
>:just
>:needs a different critical_enter() for spin locks. Hmm, OTOH, I could fix
>:that
>:last case by using a separate nesting count (rather than first nesting level)
>:for the interrupt disabled critical_enter/exits. That's a bit too
>:complicated
>:though I think. Anyways, this can fit into the current API under the covers
>:as
>:an optimization later on.
>:
>:--
>:
>:John Baldwin <>< http://www.FreeBSD.org/~jhb/
>
> Yes, definitely too complicated. But I disagree with the 'as an
> optimization later on'. The perceived cost of these functions colors
> everyone's implementation of code around them, and some pretty nasty
> code (due to people trying to avoid perceived-to-be expensive calls)
> can be the result. Something like this needs to be implemented right
> off the bat so people can depend on it.

No. People need to write algorithms and not assume implementation
specifics about certain functions. There is a manpage (though it needs a
bit of updating) for the critical section stuff and it still accurately
documents what the API is guaranteed to provide: protection from preemption.

> I'm willing to do this, if you don't have your fingers in this particular
> section of code.
Actually, I have some changes in this code already including halfway implementing this. > -Matt > Matthew Dillon > -- John Baldwin <>< http://www.FreeBSD.org/~jhb/ "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 17:58:39 2002 Delivered-To: freebsd-arch@freebsd.org Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by hub.freebsd.org (Postfix) with ESMTP id 7BF8737B417; Wed, 2 Jan 2002 17:58:33 -0800 (PST) Received: (from dillon@localhost) by apollo.backplane.com (8.11.6/8.9.1) id g031wWD61364; Wed, 2 Jan 2002 17:58:32 -0800 (PST) (envelope-from dillon) Date: Wed, 2 Jan 2002 17:58:32 -0800 (PST) From: Matthew Dillon Message-Id: <200201030158.g031wWD61364@apollo.backplane.com> To: John Baldwin Cc: Peter Jeremy , Michal Mertl , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG Subject: Re: When to use atomic_ functions? (was: 64 bit counters) References: Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG :> section is dropped. If an interrupt occurs on the cpu while the current :> thread is in a critical section, it should be deferred just like we :> did with _cpl in our spl*() code. : :Have you even looked at the current implementation? :) Yes, I have John. Perhaps you didn't read the part of my email that indicated most of the critical sections we have in the system are top-level sections. Optimizing the nesting is all well and fine but it isn't going to improve peformance. Let me put this in perspective, to show you how twisted the code has become. * Because preemption can cause a thread to resume on a different cpu, the per-cpu area pointer is not consistent and per-cpu variables must be accessed atomically with the pointer. 
* This means that obtaining 'curthread', for example, becomes relatively
  non-cacheable in terms of code generation. Every time you get it you
  have to re-obtain it. So a great deal of our code does this in order
  to work around the lack of optimization:

      struct thread *td = curthread;

      ... use td multiple times ...

* critical_enter() is a procedure, not a macro. So for something that
  should be as simple as '++curthread->td_critnest', which could be
  two instructions, we instead have a 36 byte procedure which does
  pushes, moves, pops, comparisons, and a nasty cli instruction.

Now, John, compare that to the spl*() code in -stable. Something like
splvm() is a 32 byte procedure doing a couple of mov's from globals
(three on the same cache line... with a little effort all could be on the
same cache line), a couple of or's, and that is it. No cli instruction.
No conditionals. It almost doesn't have to be a procedure but the
mask is just complex enough that it's a good idea.

Not so for critical_enter(). If properly done that could be an inline
that increments a counter in curthread. Two instructions and about 10x
faster. Critical exit is somewhat more complex but still very easy.

:This requires the bottom-half code to not use any locks at all which might be
:possible. However it means if I have an idle CPU, it can't pick up the
:interrupt handler for the interrupt that came in until the CPU that gets its
:interrupt leaves the handler. If you want to prevent that, you need to allow

Yes, but so what? The cpu holding the critical section isn't going to
be in that section for more than a few clock cycles most likely. Just as
with our spl*() mechanism, the chance of an interrupt occurring while in
the critical section is extremely low. That's why it doesn't matter
if it is slightly heavier-weight when the case does occur. We gain far
more clock cycles in performance than we lose in the occasional
interrupt that hits us in a critical section.
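The deferral scheme argued for above can be sketched as a small single-threaded simulation in C. This is not kernel code: `struct thread_sim`, `interrupt_arrives()` and the 32-entry pending mask are assumptions made up for the illustration, and a real implementation would also have to close the window between reading and clearing the pending mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-thread state; names are illustrative, not FreeBSD's. */
struct thread_sim {
    int      critical_count;  /* critical section nesting depth */
    uint32_t deferred_mask;   /* interrupts seen while inside a section */
    int      handled[32];     /* how many times each handler has run */
};

/* Stand-in for dispatching a real interrupt handler. */
static void
run_handler(struct thread_sim *td, int irq)
{
    td->handled[irq]++;
}

/* Simulated interrupt entry: run now, or just mark pending if deferred. */
static void
interrupt_arrives(struct thread_sim *td, int irq)
{
    if (td->critical_count > 0)
        td->deferred_mask |= (uint32_t)1 << irq;
    else
        run_handler(td, irq);
}

/* Entering a section only bumps the nesting count. */
static void
critical_enter(struct thread_sim *td)
{
    ++td->critical_count;
}

/* Leaving the outermost section replays whatever was deferred. */
static void
critical_exit(struct thread_sim *td)
{
    assert(td->critical_count > 0);
    if (--td->critical_count == 0 && td->deferred_mask != 0) {
        uint32_t mask = td->deferred_mask;

        td->deferred_mask = 0;
        for (int irq = 0; irq < 32; irq++)
            if (mask & ((uint32_t)1 << irq))
                run_handler(td, irq);
    }
}
```

The common path costs an increment and a decrement-and-test. The heavier replay loop runs only when an interrupt actually landed inside a section, which is the trade-off the message describes.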
:access to the scheduler when the interrupt comes in, which means that when top
:half code holds the sched_lock it needs to disable interrupts.
:
:>:critical enter/exit calls that do disable interrupts if needed and enable
:>:them
:>:again when needed. Right now the current critical_enter/exit code assumes
:>:that
:>
:> I think it is overkill too. Keep it simple and it can be used without
:> having to have a manual reference by your desk to figure out what it
:> actually does.
:
:Err, it's not near that complicated. This only changes the calls to
:critical_enter/exit in one location. I was saying that the separate nesting
:count was a bit too complicated.

I have to disagree. When you start adding up all the special casing
in all the supposedly uncomplicated procedures you wind up with something
that is unnecessarily complex and definitely overcomplicated.

:> Yes, definitely too complicated. But I disagree with the 'as an
:> optimization later on'. The perceived cost of these functions colors
:> everyone's implementation of code around them, and some pretty nasty
:> code (due to people trying to avoid perceived-to-be expensive calls)
:> can be the result. Something like this needs to be implemented right
:> off the bat so people can depend on it.
:
:No. People need to write algorithms and not assume implementation
:specifics about certain functions. There is a manpage (though it needs a bit
:of updating) for the critical section stuff and it still accurately documents
:what the API is guaranteed to provide: protection from preemption.

I don't understand what you are saying here. The algorithm I described
is platform independent (just as our spl mechanism was). What more can
one ask? It's great... an API that everyone can depend on performing
well, exactly as advertised, across a multitude of platforms.

:> I'm willing to do this, if you don't have your fingers in this particular
:> section of code.
: :Actually, I have some changes in this code already including halfway :implementing this. You have half-way implemented just about everything I've ever tried to do work on in -current. Is there anything you haven't touched? You have half the core locked up and untouchable with all the work you have in progress. -Matt To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 17:58:58 2002 Delivered-To: freebsd-arch@freebsd.org Received: from netau1.alcanet.com.au (ntp.alcanet.com.au [203.62.196.27]) by hub.freebsd.org (Postfix) with ESMTP id 80D7F37B41C for ; Wed, 2 Jan 2002 17:58:54 -0800 (PST) Received: from mfg1.cim.alcatel.com.au (mfg1.cim.alcatel.com.au [139.188.23.1]) by netau1.alcanet.com.au (8.9.3 (PHNE_22672)/8.9.3) with ESMTP id MAA12361; Thu, 3 Jan 2002 12:58:42 +1100 (EDT) Received: from gsmx07.alcatel.com.au by cim.alcatel.com.au (PMDF V5.2-32 #37641) with ESMTP id <01KCMVNKJH00VFM67E@cim.alcatel.com.au>; Thu, 3 Jan 2002 12:57:49 +1100 Received: (from jeremyp@localhost) by gsmx07.alcatel.com.au (8.11.6/8.11.6) id g031wdM04443; Thu, 03 Jan 2002 12:58:40 +1100 Content-return: prohibited Date: Thu, 03 Jan 2002 12:58:39 +1100 From: Peter Jeremy Subject: Re: When to use atomic_ functions? 
(was: 64 bit counters) In-reply-to: <20020103011645.GE53199@cicely9.cicely.de>; from ticso@cicely9.cicely.de on Thu, Jan 03, 2002 at 02:16:45AM +0100 To: Bernd Walter Cc: arch@FreeBSD.ORG Mail-Followup-To: Bernd Walter , arch@FreeBSD.ORG Message-id: <20020103125839.H561@gsmx07.alcatel.com.au> MIME-version: 1.0 Content-type: text/plain; charset=us-ascii Content-disposition: inline User-Agent: Mutt/1.2.5i References: <200201012349.g01NnKA40071@apollo.backplane.com> <20020103095701.B561@gsmx07.alcatel.com.au> <20020103002521.GB53199@cicely9.cicely.de> <20020103113919.E561@gsmx07.alcatel.com.au> <20020103011645.GE53199@cicely9.cicely.de> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG [Pruning CC list] On 2002-Jan-03 02:16:45 +0100, Bernd Walter wrote: >On Thu, Jan 03, 2002 at 11:39:20AM +1100, Peter Jeremy wrote: >> On 2002-Jan-03 01:25:22 +0100, Bernd Walter wrote: >> >My Alpha Architecture Handbook says that the barrier is unneeded. >> >I have no clue why they are there. >> >> You're right. Version 2, section 5.5 shows that they aren't needed >> when a single datum is being atomically updated (as needed here). >> They're only needed where the atomic operation is seizing a lock so >> that a larger structure can be atomically updated. > >I will do this change localy a send a patch to -alpha. >Maybe we can also remove the barriers for rel/acq in the non SMP case, >but I could also be wrong if drivers depend on them. My understanding is that barriers are needed when it's necessary to control the content of main memory or read/write ordering, as seen by something other than the current CPU. In a UP system, this means that barriers are still needed for I/O (both PIO and DMA). Based on atomic(9), I think the barriers are still necessary for acq/rel - which are essentially acquiring/releasing locks on larger data structures. 
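The acquire/release case can be sketched the same way. The example below uses C11 atomics, not FreeBSD's atomic(9) calls, and `struct stats_sim` with its helpers is an assumption made up for the illustration. The barriers matter here because the atomic operation guards a larger structure: the plain stores to `packets` and `bytes` must not become visible outside the window in which the lock is held.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical two-field structure guarded by a tiny spin lock. */
struct stats_sim {
    atomic_flag lock;
    uint64_t    packets;
    uint64_t    bytes;
};

/* Acquire: accesses after this call cannot be reordered before it. */
static void
stats_lock(struct stats_sim *s)
{
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ;  /* spin until the flag was previously clear */
}

/* Release: accesses before this call are visible before the lock drops. */
static void
stats_unlock(struct stats_sim *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Update both fields as one unit under the lock. */
static void
stats_record(struct stats_sim *s, uint64_t len)
{
    stats_lock(s);
    s->packets += 1;
    s->bytes += len;
    stats_unlock(s);
}
```

A lone counter needs neither ordering, so the lock and its barriers are pure overhead for it. Two fields that must stay consistent with each other are where the acquire/release pair earns its cost.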
Peter To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 18:33:34 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mail11.speakeasy.net (mail11.speakeasy.net [216.254.0.211]) by hub.freebsd.org (Postfix) with ESMTP id 0A39F37B41A for ; Wed, 2 Jan 2002 18:33:28 -0800 (PST) Received: (qmail 869 invoked from network); 3 Jan 2002 02:33:26 -0000 Received: from unknown (HELO laptop.baldwin.cx) ([64.81.54.73]) (envelope-sender ) by mail11.speakeasy.net (qmail-ldap-1.03) with SMTP for ; 3 Jan 2002 02:33:26 -0000 Message-ID: X-Mailer: XFMail 1.4.0 on FreeBSD X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: <200201030158.g031wWD61364@apollo.backplane.com> Date: Wed, 02 Jan 2002 18:33:13 -0800 (PST) From: John Baldwin To: Matthew Dillon Subject: Re: When to use atomic_ functions? (was: 64 bit counters) Cc: arch@FreeBSD.ORG, Bernd Walter , Mike Smith , Bruce Evans , Michal Mertl , Peter Jeremy Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On 03-Jan-02 Matthew Dillon wrote: >:> section is dropped. If an interrupt occurs on the cpu while the >:> current >:> thread is in a critical section, it should be deferred just like we >:> did with _cpl in our spl*() code. >: >:Have you even looked at the current implementation? :) > > Yes, I have John. Perhaps you didn't read the part of my email > that indicated most of the critical sections we have in the system > are top-level sections. Optimizing the nesting is all well and fine > but it isn't going to improve peformance. > > Let me put this in perspective, to show you how twisted the code > has become. 
>
> * Because preemption can cause a thread to resume on a different cpu,
>   the per-cpu area pointer is not consistent and per-cpu variables
>   must be accessed atomically with the pointer.
>
> * This means that obtaining 'curthread', for example, becomes relatively
>   non-cacheable in terms of code generation. Every time you get it you
>   have to re-obtain it. So a great deal of our code does this in order
>   to work around the lack of optimization:

No, this is incorrect. curthread and curpcb are a bit special here. They
are valid even across migration.

>     struct thread *td = curthread;
>
>     ... use td multiple times ...

This is done because PCPU_GET() is slow. Perhaps that is what you are
trying to say, that our PCPU_GET() macros are slow? On x86 they are one
instruction for most cases, and on other archs they are 2 or 3 at most.

> * critical_enter() is a procedure, not a macro. So for something that
>   should be as simple as '++curthread->td_critnest' which could be
>   two instructions we instead have a 36 byte procedure which does
>   pushes, moves, pops, comparisons, and a nasty cli instruction.

It can be made a macro trivially if we want to do it later. However,
during earlier work, when one may want to put extra stuff in it such as
assertions, it is ok for it to be a function for the time being.
Micro-optimizing one function is well and good and all, but in -current
there are really bigger fish to fry.

>:This requires the bottom-half code to not use any locks at all which might be
>:possible. However it means if I have an idle CPU, it can't pick up the
>:interrupt handler for the interrupt that came in until the CPU that gets its
>:interrupt leaves the handler. If you want to prevent that, you need to allow
>
> Yes, but so what? The cpu holding the critical section isn't going to
> be in that section for more than a few clock cycles most likely.

Fair enough. I suppose we could use a per-thread bitmask of pending
interrupts.
Hmm, except that that assumes a dense region of vectors which alpha doesn't have. Note that on the alpha we did spl's via the x86 equivalent of cli/sti for this reason. On the alpha vector numbers are rather sparse and won't fit into a bitmask. You can't use an embedded list in struct ithd in case multiple CPUs get the same interrupt in a critical section. Also, now we have the problem that we can't use the spin lock to protect the ICU on SMP systems, meaning that we will corrupt the ICU masks now when disabling interrupts (which we _have_ to do if interrupts are level-triggered, as PCI interrupts are on alpha). I'm afraid the issues here are a bit more complex than they first seem. However, the idea of deferring interrupts is a nice one if we can do it. It is actually very similar to the way preemption will/would defer switches to real-time threads until the outermost critical_exit(). >:>:critical enter/exit calls that do disable interrupts if needed and enable >:>:them >:>:again when needed. Right now the current critical_enter/exit code assumes >:>:that >:> >:> I think it is overkill too. Keep it simple and it can be used without >:> having to have a manual reference by your desk to figure out what it >:> actually does. >: >:Err, it's not nearly that complicated. This only changes the calls to >:critical_enter/exit in one location. I was saying that the separate nesting >:count was a bit too complicated. > > I have to disagree. When you start adding up all the special casing > in all the supposedly uncomplicated procedures you wind up with something > that is unnecessarily complex and definitely overcomplicated. Err, I was going to use separate procedures precisely to avoid having to complicate critical_enter/exit. critical_enter/exit_spinlock or some such which would only be used in kern_mutex.c and sys/mutex.h in one place in each file. >:> Yes, definitely too complicated. But I disagree with the 'as an >:> optimization later on'.
The perceived cost of these functions colors >:> everyone's implementation of code around them, and some pretty nasty >:> code (due to people trying to avoid perceived-to-be expensive calls) >:> can be the result. Something like this needs to be implemented right >:> off the bat so people can depend on it. >: >:No. People need to write algorithms and not assume implementation >:specifics about certain functions. There is a manpage (though it needs a bit >:of updating) for the critical section stuff and it still accurately documents >:what the API is guaranteed to provide: protection from preemption. > > I don't understand what you are saying here. The algorithm I described > is platform independent (just as our spl mechanism was). What more can > one ask? It's great.. an API that everyone can depend on performing > well, exactly as advertised, across a multitude of platforms. No, I think your algorithm is very i386 specific to be honest. The spl mechanisms differed widely on i386 and alpha. It may be that this code really needs to be MD as the spl code was instead of MI to optimize it, but we don't need it right now. Your argument is that people will not use critical_enter because of its implementation details, and I'm saying that people should use the API because of the service it presents and should not be worried about how it is implemented. >:> I'm willing to do this, if you don't have your fingers in this >:> particular >:> section of code. >: >:Actually, I have some changes in this code already including halfway >:implementing this. > > You have half-way implemented just about everything I've ever tried to > do work on in -current. Is there anything you haven't touched? > You have half the core locked up and untouchable with all the work you > have in progress. No, I do not have half the code locked up by any means. I do have a change I need to test related to the critical_enter/exit stuff that I want to commit in a day or so though.
As far as changing critical_enter/exit to defer, I'm not going to work on that right now except to think about it. Specifically, not only must critical_enter/exit change, but the various MD code that calls ithread_schedule() needs to defer those calls. Also, there is the problem of the icu lock, and the problem of figuring a way to store the per-thread state of triggered interrupts for each arch. If there is a MI way, that would be best. Well, the embedded list of ithreads might work actually. But that doesn't fix icu_lock. > -Matt -- John Baldwin <>< http://www.FreeBSD.org/~jhb/ "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ From owner-freebsd-arch Wed Jan 2 18:46:54 2002 Delivered-To: freebsd-arch@freebsd.org Received: from srv1.cosmo-project.de (srv1.cosmo-project.de [213.83.6.106]) by hub.freebsd.org (Postfix) with ESMTP id 0F59F37B416 for ; Wed, 2 Jan 2002 18:46:49 -0800 (PST) Received: (from uucp@localhost) by srv1.cosmo-project.de (8.11.6/8.11.6) with UUCP id g032klS11126 for arch@FreeBSD.ORG; Thu, 3 Jan 2002 03:46:47 +0100 (CET) (envelope-from ticso@cicely9.cicely.de) Received: from mail.cicely.de (cicely20.cicely.de [10.1.1.22]) by cicely5.cicely.de (8.12.1/8.12.1) with ESMTP id g032hHtx047219 for ; Thu, 3 Jan 2002 03:43:17 +0100 (CET) (envelope-from ticso@cicely9.cicely.de) Received: from cicely9.cicely.de (cicely9.cicely.de [10.1.7.11]) by mail.cicely.de (8.11.0/8.11.0) with ESMTP id g032hGW13211 for ; Thu, 3 Jan 2002 03:43:16 +0100 (CET) Received: (from ticso@localhost) by cicely9.cicely.de (8.11.6/8.11.6) id g032hD962782 for arch@FreeBSD.ORG; Thu, 3 Jan 2002 03:43:13 +0100 (CET) (envelope-from ticso) Date: Thu, 3 Jan 2002 03:43:13 +0100 From: Bernd Walter To: arch@FreeBSD.ORG Subject: Re: When to use atomic_ functions?
(was: 64 bit counters) Message-ID: <20020103024313.GG53199@cicely9.cicely.de> References: <200201012349.g01NnKA40071@apollo.backplane.com> <20020103095701.B561@gsmx07.alcatel.com.au> <20020103002521.GB53199@cicely9.cicely.de> <20020103113919.E561@gsmx07.alcatel.com.au> <20020103011645.GE53199@cicely9.cicely.de> <20020103125839.H561@gsmx07.alcatel.com.au> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20020103125839.H561@gsmx07.alcatel.com.au> User-Agent: Mutt/1.3.24i X-Operating-System: FreeBSD cicely9.cicely.de 5.0-CURRENT alpha Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Thu, Jan 03, 2002 at 12:58:39PM +1100, Peter Jeremy wrote: > [Pruning CC list] > > On 2002-Jan-03 02:16:45 +0100, Bernd Walter wrote: > >On Thu, Jan 03, 2002 at 11:39:20AM +1100, Peter Jeremy wrote: > >> On 2002-Jan-03 01:25:22 +0100, Bernd Walter wrote: > >> >My Alpha Architecture Handbook says that the barrier is unneeded. > >> >I have no clue why they are there. > >> > >> You're right. Version 2, section 5.5 shows that they aren't needed > >> when a single datum is being atomically updated (as needed here). > >> They're only needed where the atomic operation is seizing a lock so > >> that a larger structure can be atomically updated. > > > >I will do this change locally and send a patch to -alpha. > >Maybe we can also remove the barriers for rel/acq in the non SMP case, > >but I could also be wrong if drivers depend on them. > > My understanding is that barriers are needed when it's necessary to > control the content of main memory or read/write ordering, as seen by > something other than the current CPU. In a UP system, this means that > barriers are still needed for I/O (both PIO and DMA).
Based on > atomic(9), I think the barriers are still necessary for acq/rel - which > are essentially acquiring/releasing locks on larger data structures. On alpha the barriers don't control memory content - only order. In fact if you protect a data structure with ldx_l/stx_c/mb the data you have written may still be in the CPU cache after releasing the lock. stx_c places a lock on the cache line if it succeeds but may not write any memory. You can read this in chapter 4.6 in the Alpha 21264 Microprocessor Hardware Reference Manual. The mb instructions are needed to get the protected data over the coherency point in case the lock is handed to another CPU. Hardware doesn't use atomic_acq so it needs other mechanisms. Another point is that non read-modify-write based atomic_rel functions don't need to use locked instructions. E.g. atomic_store_rel_ptr used by MI mutex functions. -- B.Walter COSMO-Project http://www.cosmo-project.de ticso@cicely.de Usergroup info@cosmo-project.de From owner-freebsd-arch Wed Jan 2 19:34:24 2002 Delivered-To: freebsd-arch@freebsd.org Received: from niwun.pair.com (niwun.pair.com [209.68.2.70]) by hub.freebsd.org (Postfix) with SMTP id 89A8337B416 for ; Wed, 2 Jan 2002 19:34:21 -0800 (PST) Received: (qmail 34844 invoked by uid 3193); 3 Jan 2002 03:34:20 -0000 Received: from localhost (sendmail-bs@127.0.0.1) by localhost with SMTP; 3 Jan 2002 03:34:20 -0000 Date: Wed, 2 Jan 2002 22:34:20 -0500 (EST) From: Mike Silbersack X-Sender: To: Subject: DELAY accuracy Re: cvs commit: src/sys/dev/usb uhci.c In-Reply-To: <200201030210.g032AVb19379@freefall.freebsd.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Wed, 2 Jan 2002, Josef Karthauser
wrote: > joe 2002/01/02 18:10:31 PST > > Modified files: > sys/dev/usb uhci.c > Log: > Sync with NetBSD: > > * White space changes. > * Updates to comments. > * Replace some delay() calls with usb_delay_ms(). This commit log started me wondering about DELAY usage in the kernel. Looking around, we do have a decent amount of such calls. Most of the calls seem to be for 100 us or less, with some as short as 1 us. These times seem short, but looking at sys/i386/isa/clock.c, I see that we time based off the i8254 timer chip, rather than the processor's TSC. As such, are we actually reaching microsecond accuracy, or is the delay actually taking longer than expected in many of these cases? Thanks, Mike "Silby" Silbersack From owner-freebsd-arch Wed Jan 2 19:51: 3 2002 Delivered-To: freebsd-arch@freebsd.org Received: from avocet.prod.itd.earthlink.net (avocet.mail.pas.earthlink.net [207.217.120.50]) by hub.freebsd.org (Postfix) with ESMTP id 5003037B41E; Wed, 2 Jan 2002 19:51:00 -0800 (PST) Received: from pool0067.cvx22-bradley.dialup.earthlink.net ([209.179.198.67] helo=mindspring.com) by avocet.prod.itd.earthlink.net with esmtp (Exim 3.33 #1) id 16Lyu8-00040M-00; Wed, 02 Jan 2002 19:50:56 -0800 Message-ID: <3C33D3E3.38E758CA@mindspring.com> Date: Wed, 02 Jan 2002 19:45:39 -0800 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Matthew Dillon Cc: Alfred Perlstein , John Baldwin , arch@FreeBSD.ORG, Bernd Walter , Mike Smith , Bruce Evans , Michal Mertl , Peter Jeremy Subject: Re: When to use atomic_ functions?
(was: 64 bit counters) References: <200201030002.g0302Eo60575@apollo.backplane.com> <20020102180734.A82406@elvis.mu.org> <200201030012.g030Cgp60752@apollo.backplane.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Matthew Dillon wrote: > If we standardize the mutex used by interface device-driver interrupts > (if it isn't already done), the collector could obtain the mutex when > reading the counter, yes? IMO, statistics are snapshots in any case, so it doesn't really matter if they are incredibly accurate on read. In any case, multiple-reader, single-writer access doesn't need locks unless it has to be done atomically. An easy way to do this atomically, even for fairly large items, is to toggle between an active and inactive structure of data, where the pointer to the structure is assigned atomically. I used this property for my zero system call gettimeofday(), for example.
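The active/inactive toggle described above can be sketched in portable C11. This is only an illustration under stated assumptions, not code from the FreeBSD tree: the names (struct if_stats, stats_buf, stats_published, stats_update, stats_read) are hypothetical, and C11 <stdatomic.h> stands in for the kernel's atomic(9) primitives. It assumes a single writer, and it assumes a reader finishes copying before the writer completes two further updates; a real implementation would probably add a generation count to detect that case.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical statistics block; larger than one machine word on purpose. */
struct if_stats {
    uint64_t packets;
    uint64_t bytes;
};

/* Two copies: one is published to readers, the other is the writer's scratch. */
static struct if_stats stats_buf[2];
static _Atomic(struct if_stats *) stats_published = &stats_buf[0];

/* Single writer: build the new snapshot in the inactive copy, then publish it. */
static void
stats_update(uint64_t packets, uint64_t bytes)
{
    struct if_stats *cur =
        atomic_load_explicit(&stats_published, memory_order_relaxed);
    struct if_stats *next =
        (cur == &stats_buf[0]) ? &stats_buf[1] : &stats_buf[0];

    next->packets = cur->packets + packets;
    next->bytes = cur->bytes + bytes;

    /* Release ordering: the fields above become visible before the pointer. */
    atomic_store_explicit(&stats_published, next, memory_order_release);
}

/* Any number of readers: one atomic pointer load, then a plain copy. */
static struct if_stats
stats_read(void)
{
    struct if_stats *p =
        atomic_load_explicit(&stats_published, memory_order_acquire);
    return *p;
}
```

A reader never takes a lock; it either sees the old snapshot or the new one, which matches the point that statistics are snapshots anyway.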
-- Terry From owner-freebsd-arch Wed Jan 2 19:51: 4 2002 Delivered-To: freebsd-arch@freebsd.org Received: from avocet.prod.itd.earthlink.net (avocet.mail.pas.earthlink.net [207.217.120.50]) by hub.freebsd.org (Postfix) with ESMTP id 4BEAA37B422; Wed, 2 Jan 2002 19:51:01 -0800 (PST) Received: from pool0067.cvx22-bradley.dialup.earthlink.net ([209.179.198.67] helo=mindspring.com) by avocet.prod.itd.earthlink.net with esmtp (Exim 3.33 #1) id 16LyuB-00043C-00; Wed, 02 Jan 2002 19:50:59 -0800 Message-ID: <3C33D4FA.3F7B09CD@mindspring.com> Date: Wed, 02 Jan 2002 19:50:18 -0800 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Matthew Dillon Cc: John Baldwin , Peter Jeremy , Michal Mertl , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG Subject: Re: When to use atomic_ functions? (was: 64 bit counters) References: <200201030024.g030Oip60860@apollo.backplane.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Matthew Dillon wrote: > But if it is protected by a mutex, and an interrupt occurs while you > hold the mutex, the interrupt thread will not be able to run (or > at least will wind up blocking while getting the mutex) until you release > your mutex, at which point your modifications have been synchronized out > (releasing the mutex ensures this). It's a freaking statistic. Just keep "private" and "published" versions, increment the private version if you can't get the mutex, and increment the public and add any private to the public when you can.
If you want to be even more clueful, then always increment the private, and add the private to the public only if you can get the mutex when reading the statistics. This will make the code at interrupt run in deterministic time, which is a useful property. > The critical section stuff would be more palatable if it weren't so > expensive. Couldn't we just have a per-cpu critical section count > and defer the interrupt? (e.g. like the deferred mechanism we used for > spl()s). Then we would have an incredibly cheap mechanism for accessing > per-cpu caches (like per-cpu mbuf freelists, for example) which could > further be adapted for use by zalloc[i]() and malloc(). It's a statistic: defer the statistic update, not the interrupt. -- Terry From owner-freebsd-arch Wed Jan 2 19:52:36 2002 Delivered-To: freebsd-arch@freebsd.org Received: from avocet.prod.itd.earthlink.net (avocet.mail.pas.earthlink.net [207.217.120.50]) by hub.freebsd.org (Postfix) with ESMTP id 02D6537B416; Wed, 2 Jan 2002 19:52:35 -0800 (PST) Received: from pool0067.cvx22-bradley.dialup.earthlink.net ([209.179.198.67] helo=mindspring.com) by avocet.prod.itd.earthlink.net with esmtp (Exim 3.33 #1) id 16Lyvf-0005gR-00; Wed, 02 Jan 2002 19:52:31 -0800 Message-ID: <3C33D580.50B5BCAA@mindspring.com> Date: Wed, 02 Jan 2002 19:52:32 -0800 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Bernd Walter Cc: Matthew Dillon , John Baldwin , arch@FreeBSD.ORG, Bernd Walter , Mike Smith , Bruce Evans , Michal Mertl , Peter Jeremy Subject: Re: When to use atomic_ functions?
(was: 64 bit counters) References: <200201030002.g0302Eo60575@apollo.backplane.com> <20020103003214.GC53199@cicely9.cicely.de> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Bernd Walter wrote: > You need to hold the mutex while writing and reading. > If you hold the mutex only while writing another CPU might still use > old cached values. Unless there are two counts that MUST remain synchronized for correct operation, you don't *care* if someone gets the stale value. Ask yourself: what's the worst case failure scenario that would result? -- Terry From owner-freebsd-arch Wed Jan 2 20:10:54 2002 Delivered-To: freebsd-arch@freebsd.org Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by hub.freebsd.org (Postfix) with ESMTP id 4CBBD37B419; Wed, 2 Jan 2002 20:10:47 -0800 (PST) Received: (from dillon@localhost) by apollo.backplane.com (8.11.6/8.9.1) id g034AkP61865; Wed, 2 Jan 2002 20:10:46 -0800 (PST) (envelope-from dillon) Date: Wed, 2 Jan 2002 20:10:46 -0800 (PST) From: Matthew Dillon Message-Id: <200201030410.g034AkP61865@apollo.backplane.com> To: John Baldwin Cc: arch@FreeBSD.ORG, Bernd Walter , Mike Smith , Bruce Evans , Michal Mertl , Peter Jeremy Subject: Re: When to use atomic_ functions? (was: 64 bit counters) References: Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG :> non-cacheable in terms of code generation. Every time you get it you :> have to re-obtain it. So a great deal of our code does this in order :> to work around the lack of optimization: : :No, this is incorrect.
curthread and curpcb are a bit special here. They are :valid even across migration. Yes they are... but the code that retrieves them can't be optimized by GCC to deal with multiple references. That's one of the problems. :> ... use td multiple times ... : :This is done because PCPU_GET() is slow. Perhaps that is what you are trying to :say in that our PCPU_GET() macros are slow? On x86 they are one instruction :for most cases, and on other archs they are 2 or 3 at most. No, what I am saying is that they are designed in a manner that makes it pretty much impossible for GCC or a human to optimize. The fact that you cannot get a stable pointer to the per-cpu area (without being in a critical section) and then access the variables via standard structural indirection has led to all manner of bloat in the API. It turns what ought to be a simple matter of a cpu-relative access into a mess of macros and makes it difficult to take advantage of the per-cpu data area for anything general purpose. If we had conceptually 'instant' critical section handling it would not be as big a deal, but we have a bad combination here... critical sections are expensive (almost as expensive as getting a mutex), and it is not possible to optimize general per-cpu structural references. :> * critical_enter() is a procedure, not a macro. So for something that :> should be as simple as '++curthread->td_critnest' which could be :> two instructions we instead have a 36 byte procedure which does :> pushes, moves, pops, comparisons, and a nasty cli instruction. : :It can be made a macro trivially if we want to do it later. However, during :earlier work when one may want to put extra stuff in it such as assertions it :is ok for it to be a function for the time being. Micro-optimizing one function :is well and good and all, but in -current there are really bigger fish to fry. The function/macroness is not the biggest problem here. It's the cli/sti junk and the general overhead in the code that is the problem.
What should be an extremely light weight function that nobody should have a second thought about using when they need it isn't even close to being light weight. Our critical section code could be two (unsynchronized) instructions on nearly all architectures with only a moderate (and very well known since it would be similar to the SPL mechanism) amount of assembly. We have the same problem with these functions that we had with the spl*() code before we optimized it. People were uncomfortable with the routines and the API and often did all sorts of weird things to try to avoid them due to their perceived overhead. We had (and still have) unnecessarily long spl'd loops in -stable because of that. It is also the same problem we have with mutexes. Look at what mutex (current) and simplelock (stable) overhead did to the filesystem update code (SYNC/FSYNC). Systems with large amounts of memory and a moderate amount of dirty data wound up stalling long enough to be noticeable every 30 seconds. Just having a documented API isn't good enough. There is a lot more to it than that and it affects the way people code. The APIs in -current are nasty enough that I have a strong desire to throw the whole damn thing away because I'm afraid of the code people will wind up writing to get around their inefficiencies. :>:This requires the bottom-half code to not use any locks at all which might be :>:possible. However it means if I have an idle CPU, it can't pick up the :>:interrupt handler for the interrupt that came in until the CPU that gets its :>:interrupt leaves the handler. If you want to prevent that, you need to allow :> :> Yes, but so what? The cpu holding the critical section isn't going to :> be in that section for more than a few clock cycles most likely. : :Fair enough. I suppose we could use a per-thread bitmask of pending interrupts. :Hmm, except that that assumes a dense region of vectors which alpha doesn't :have.
Note that on the alpha we did spl's via the x86 equivalent of cli/sti :for this reason. On the alpha vector numbers are rather sparse and won't :fit into a bitmask. You can't use an embedded list in struct ithd in case :multiple CPUs get the same interrupt in a critical section. Also, now we have :the problem that we can't use the spin lock to protect the ICU on SMP systems, :meaning that we will corrupt the ICU masks now when disabling interrupts (which :we _have_ to do if interrupts are level-triggered, as PCI :interrupts are on alpha). interrupt code == interrupts disabled on entry usually, which means you can create a queue in the per-cpu data area of pending interrupts which are to be executed when the critical section is left. And, I will also add, that this also greatly improves the flexibility we have in dealing with interrupts. It would even be possible to have an idle cpu check an active cpu's pending interrupt queue and steal one. :I'm afraid the issues here are a bit more complex than they first seem. :However, the idea of deferring interrupts is a nice one if we can do it. It is :actually very similar to the way preemption will/would defer switches to :real-time threads until the outermost critical_exit(). Of course we can do it. It's how we do SPLs in -stable after all. It is a very well known and well proven mechanism. :> in all the supposedly uncomplicated procedures you wind up with something :> that is unnecessarily complex and definitely overcomplicated. : :Err, I was going to use separate procedures precisely to avoid having to :complicate critical_enter/exit. critical_enter/exit_spinlock or some such :which would only be used in kern_mutex.c and sys/mutex.h in one place in each :file. Yuch. It is a simple matter to standardize the meaning of td->td_critnest but the full implementation of 'critical_enter' and 'critical_exit' should not be split up into MD and MI parts. It just makes the code less readable. They should simply be MD.
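The deferral scheme argued for in this message (a nesting count plus a per-cpu queue of pending interrupts that is drained when the outermost critical section is left) can be sketched as a small user-space simulation. Everything here is hypothetical and labeled as such: struct pcpu_sim, sim_critical_enter(), sim_critical_exit() and sim_interrupt() are illustrative names, not the FreeBSD critical_enter(9) API, and the sketch ignores SMP, the ICU lock, and level-triggered sources, which are exactly the hard parts raised in the thread.

```c
#include <stddef.h>

#define SIM_MAX_PENDING 8
#define SIM_MAX_HANDLED 32

typedef void (*sim_handler_t)(int vector);

/* Hypothetical stand-in for a per-cpu data area. */
struct pcpu_sim {
    int           critnest;                  /* critical section nesting depth */
    int           npending;                  /* number of deferred vectors */
    int           pending[SIM_MAX_PENDING];  /* queue, so sparse vectors are fine */
    sim_handler_t handler;                   /* what "running the interrupt" means */
};

/* Record of handled vectors, so the behaviour can be observed. */
static int handled[SIM_MAX_HANDLED];
static int nhandled;

static void
record_handler(int vector)
{
    if (nhandled < SIM_MAX_HANDLED)
        handled[nhandled++] = vector;
}

/* Entering is just an increment; no interrupt disable is needed. */
static void
sim_critical_enter(struct pcpu_sim *pc)
{
    pc->critnest++;
}

/* Leaving the outermost section runs whatever was deferred meanwhile. */
static void
sim_critical_exit(struct pcpu_sim *pc)
{
    if (--pc->critnest > 0)
        return;
    for (int i = 0; i < pc->npending; i++)
        pc->handler(pc->pending[i]);
    pc->npending = 0;
}

/*
 * Interrupt entry: returns 1 if handled immediately, 0 if deferred,
 * -1 if the pending queue is full (a real design must not drop it).
 */
static int
sim_interrupt(struct pcpu_sim *pc, int vector)
{
    if (pc->critnest > 0) {
        if (pc->npending == SIM_MAX_PENDING)
            return -1;
        pc->pending[pc->npending++] = vector;
        return 0;
    }
    pc->handler(vector);
    return 1;
}
```

A queue is used instead of a bitmask because of the objection quoted above that alpha vector numbers are too sparse for a bitmask.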
:> I don't understand what you are saying here. The algorithm I described :> is platform independent (just as our spl mechanism was). What more can :> one ask? It's great.. an API that everyone can depend on performing :> well, exactly as advertised, across a multitude of platforms. : :No, I think your algorithm is very i386 specific to be honest. The spl :mechanisms differed widely on i386 and alpha. It may be that this code really :needs to be MD as the spl code was instead of MI to optimize it, but we don't :need it right now. Your argument is that people will not use critical_enter :because of its implementation details, and I'm saying that people should use :the API because of the service it presents and should not be worried about how :it is implemented. The basic _cpl concept was and still is platform independent. The concept of 'deferring an interrupt' is certainly platform independent. I don't see how this could be any less independent. :No, I do not have half the code locked up by any means. I do have a change I :need to test related to the critical_enter/exit stuff that I want to commit in :a day or so though. As far as changing critical_enter/exit to defer, I'm not :going to work on that right now except to think about it. Specifically, not :only must critical_enter/exit change, but the various MD code that calls :ithread_schedule() needs to defer those calls. Also, there is the problem of :the icu lock, and the problem of figuring a way to store the per-thread state :of triggered interrupts for each arch. If there is a MI way, that would be :best. Well, the embedded list of ithreads might work actually. But that :doesn't fix icu_lock. These are all problems we faced before. I had to deal with all of these issues when I did that first pass on the 386 code when we first started to scrap the _cpl stuff, incorporate Giant, and implement the idle thread. As I said, I can do this. It would take me about a week for I386.
The other architectures would be able to stick with the sti/cli equivalent until their assembly gurus clean them up (if they even care).... what I am proposing is completely compatible. -- In regards to mutex spins, this actually simplifies matters. The scheduler can schedule any pending interrupts while it has the scheduler mutex held, even if both the previous thread and what would be the next thread have a non-zero td_critnest. You can avoid nearly *ALL* hard interrupt disablements -- just one place in the core of the scheduler (that we already have) is all you would need. -Matt Matthew Dillon From owner-freebsd-arch Wed Jan 2 20:13:32 2002 Delivered-To: freebsd-arch@freebsd.org Received: from avocet.prod.itd.earthlink.net (avocet.mail.pas.earthlink.net [207.217.120.50]) by hub.freebsd.org (Postfix) with ESMTP id B45DF37B419; Wed, 2 Jan 2002 20:13:28 -0800 (PST) Received: from pool0067.cvx22-bradley.dialup.earthlink.net ([209.179.198.67] helo=mindspring.com) by avocet.prod.itd.earthlink.net with esmtp (Exim 3.33 #1) id 16LzFv-0004rA-00; Wed, 02 Jan 2002 20:13:27 -0800 Message-ID: <3C33DA68.8E9700D4@mindspring.com> Date: Wed, 02 Jan 2002 20:13:28 -0800 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Matthew Dillon Cc: John Baldwin , Peter Jeremy , Michal Mertl , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG Subject: Re: When to use atomic_ functions?
(was: 64 bit counters) References: <200201030158.g031wWD61364@apollo.backplane.com> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Matthew Dillon wrote: > Let me put this in perspective, to show you how twisted the code > has become. > > * Because preemption can cause a thread to resume on a different cpu, > the per-cpu area pointer is not consistent and per-cpu variables > must be accessed atomically with the pointer. This is sorta bogus in general. CPU migration needs to be treated as an extraordinary event. This is why I suggested per CPU queues for scheduling, oh so long ago: it deals with the issue without causing all the complex cruft that Linux goes through to try to get thread group affinity in the scheduler, and it makes it real easy to do negaffinity to maximize concurrency for a multithreaded program on a multi-CPU box. > * This means that obtaining 'curthread', for example, becomes relatively > non-cacheable in terms of code generation. Every time you get it you > have to re-obtain it. So a great deal of our code does this in order > to work around the lack of optimization: This has to be done anyway, or rather it's implicit anyway. I think this problem will be solving itself, eventually, and that eating the overhead is OK, for now, in the rare cases where it's really necessary. > :No. People need to write algorithms and not assume implementation > :specifics about certain functions. There is a manpage (though it needs a bit > :of updating) for the critical section stuff and it still accurately documents > :what the API is guaranteed to provide: protection from preemption. > > I don't understand what you are saying here. The algorithm I described > is platform independent (just as our spl mechanism was). What more can > one ask? It's great..
an API that everyone can depend on performing > well, exactly as advertised, across a multitude of platforms. I also think it's dangerous to do this. The contention domains are something I would rather be micromanaged, with a *lot* of assumptions about implementation: as many as are necessary to get rid of the contention in all the common code cases -- or mitigate them as much as possible, anyway. PS: As to the "half the core locked up" comment: just implement, and whatever is best is what gets used, even if that means backing stuff out. FreeBSD has a long and glorious history of using this approach. 8-). -- Terry From owner-freebsd-arch Wed Jan 2 20:58:57 2002 Delivered-To: freebsd-arch@freebsd.org Received: from rwcrmhc51.attbi.com (rwcrmhc51.attbi.com [204.127.198.38]) by hub.freebsd.org (Postfix) with ESMTP id BFBE037B41A; Wed, 2 Jan 2002 20:58:53 -0800 (PST) Received: from peter3.wemm.org ([12.232.27.13]) by rwcrmhc51.attbi.com (InterMail vM.4.01.03.27 201-229-121-127-20010626) with ESMTP id <20020103045849.NASN20119.rwcrmhc51.attbi.com@peter3.wemm.org>; Thu, 3 Jan 2002 04:58:49 +0000 Received: from overcee.netplex.com.au (overcee.wemm.org [10.0.0.3]) by peter3.wemm.org (8.11.0/8.11.0) with ESMTP id g034wns31300; Wed, 2 Jan 2002 20:58:49 -0800 (PST) (envelope-from peter@wemm.org) Received: from wemm.org (localhost [127.0.0.1]) by overcee.netplex.com.au (Postfix) with ESMTP id 5B53039EC; Wed, 2 Jan 2002 20:58:49 -0800 (PST) (envelope-from peter@wemm.org) X-Mailer: exmh version 2.5 07/13/2001 with nmh-1.0.4 To: Matthew Dillon Cc: John Baldwin , arch@FreeBSD.ORG, Bernd Walter , Mike Smith , Bruce Evans , Michal Mertl , Peter Jeremy Subject: Re: When to use atomic_ functions?
(was: 64 bit counters) In-Reply-To: <200201030112.g031CnN61108@apollo.backplane.com> Date: Wed, 02 Jan 2002 20:58:49 -0800 From: Peter Wemm Message-Id: <20020103045849.5B53039EC@overcee.netplex.com.au> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Matthew Dillon wrote: > :> expensive. Couldn't we just have a per-cpu critical section count > :> and defer the interrupt? (e.g. like the deferred mechanism we used for > :> spl()s). Then we would have an incredibly cheap mechanism for accessing > :> per-cpu caches (like per-cpu mbuf freelists, for example) which could > :> further be adapted for use by zalloc[i]() and malloc(). > : > :Err, it does maintain a count right now and only does the actual change of > :interrupt state when entering and exiting critical sections at the top-level. > :sti and cli aren't very expensive though. Critical sections are more expensive > :on other archs though. For example, it's a PAL call on alpha. Although, spl's > :were PAL calls on alpha as well. On ia64 it's a single instruction but one that > :requires a stop. > > Most of the critical section code in the system is going to be at > the top level. sti and cli *ARE* expensive, that's why the original > spl code went to such great lengths to avoid calling them. I > believe one or both instructions synchronizes the cpu pipeline, > interrupting instruction flow. It is nasty stuff. Not quite.. We went to extremes to avoid touching the ISA PIC, since that meant going over the 4.77Mhz (or 8Mhz) isa bus.. potentially taking many hundreds of cpu clock ticks per inb/outb. cli/sti is nothing compared to that. The spl code optimized the mask updates on the PIC, not cli/sti as such. The APIC is worse since it could take 40+ IO operations to set the complete mask, although thankfully at FSB speeds (not cpu core speed), not ISA speed.
Cheers, -Peter -- Peter Wemm - peter@FreeBSD.org; peter@yahoo-inc.com; peter@netplex.com.au "All of this is for nothing if we don't go to the stars" - JMS/B5 To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 21:36:10 2002 Delivered-To: freebsd-arch@freebsd.org Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by hub.freebsd.org (Postfix) with ESMTP id 31F0937B417; Wed, 2 Jan 2002 21:36:08 -0800 (PST) Received: (from dillon@localhost) by apollo.backplane.com (8.11.6/8.9.1) id g035Zq062246; Wed, 2 Jan 2002 21:35:52 -0800 (PST) (envelope-from dillon) Date: Wed, 2 Jan 2002 21:35:52 -0800 (PST) From: Matthew Dillon Message-Id: <200201030535.g035Zq062246@apollo.backplane.com> To: Peter Wemm Cc: John Baldwin , arch@FreeBSD.ORG, Bernd Walter , Mike Smith , Bruce Evans , Michal Mertl , Peter Jeremy Subject: Re: When to use atomic_ functions? (was: 64 bit counters) References: <20020103045849.5B53039EC@overcee.netplex.com.au> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG :> Most of the critical section code in the system is going to be at :> the top level. sti and cli *ARE* expensive, that's why the original :> spl code went to such great lengths to avoid calling them. I :> believe one or both instructions synchronizes the cpu pipeline, :> interrupting instruction flow. It is nasty stuff. : :Not quite.. We went to extremes to avoid touching the ISA PIC, since that :meant going over the 4.77Mhz (or 8Mhz) isa bus.. potentially taking many :hundreds of cpu clock ticks per inb/outb. cli/sti is nothing compared to :that. The spl code optimized the mask updates on the PIC, not cli/sti :as such. 
The APIC is worse since it could take 40+ IO operations to set :the complete mask, although thankfully at FSB speeds (not :cpu core speed), not ISA speed. : :Cheers, :-Peter :-- :Peter Wemm - peter@FreeBSD.org; peter@yahoo-inc.com; peter@netplex.com.au An ISA or PCI bus access is certainly expensive, but I don't see how it applies to spl*() calls. The interrupt handler assembly has to manipulate the controller no matter how spl*()'s are implemented and that is where most of the optimizations were made (things like AUTO_EOI and such). Since spl*() calls manipulate multiple interrupt sources I don't think anyone would ever consider actually trying to screw around with the PIC in spl*(), even if the PIC had been fast. -Matt To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 23: 2:24 2002 Delivered-To: freebsd-arch@freebsd.org Received: from niwun.pair.com (niwun.pair.com [209.68.2.70]) by hub.freebsd.org (Postfix) with SMTP id E892837B41B for ; Wed, 2 Jan 2002 23:02:21 -0800 (PST) Received: (qmail 63466 invoked by uid 3193); 3 Jan 2002 07:02:20 -0000 Received: from localhost (sendmail-bs@127.0.0.1) by localhost with SMTP; 3 Jan 2002 07:02:20 -0000 Date: Thu, 3 Jan 2002 02:02:20 -0500 (EST) From: Mike Silbersack X-Sender: To: Subject: Re: DELAY accuracy Re: cvs commit: src/sys/dev/usb uhci.c In-Reply-To: Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Wed, 2 Jan 2002, Mike Silbersack wrote: > This commit log started me wondering about DELAY usage in the kernel. > Looking around, we do have a decent amount of such calls. Most of the > calls seem to be for 100 us or less, with some as short a 1 us. 
These > times seem short, but looking at sys/i386/isa/clock.c, I see that we time > based off the i8254 timer chip, rather than the processor's TSC. As such, > are we actually reaching microsecond accuracy, or is the delay actually > taking longer than expected in many of these cases? > > Thanks, > > Mike "Silby" Silbersack To answer my own question, there seems to be about 8us of slop added to each call to DELAY. This seems irrelevant for calls of 100 or 1000, but changes timing quite a bit on calls between 1 and 10. So, it looks like rewriting DELAY so that it spin-waits on the TSC for delays of less than 100 might be useful. Maybe when I get some time... Mike "Silby" Silbersack To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Wed Jan 2 23:56:37 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16]) by hub.freebsd.org (Postfix) with ESMTP id 5CC2037B419 for ; Wed, 2 Jan 2002 23:56:34 -0800 (PST) Received: from bde.zeta.org.au (bde.zeta.org.au [203.2.228.102]) by mailman.zeta.org.au (8.9.3/8.8.7) with ESMTP id SAA15536; Thu, 3 Jan 2002 18:56:28 +1100 Date: Thu, 3 Jan 2002 20:07:52 +1100 (EST) From: Bruce Evans X-X-Sender: To: Mike Silbersack Cc: Subject: Re: DELAY accuracy Re: cvs commit: src/sys/dev/usb uhci.c In-Reply-To: Message-ID: <20020103194429.T15755-100000@gamplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Thu, 3 Jan 2002, Mike Silbersack wrote: > On Wed, 2 Jan 2002, Mike Silbersack wrote: > > > This commit log started me wondering about DELAY usage in the kernel. > > Looking around, we do have a decent amount of such calls. Most of the > > calls seem to be for 100 us or less, with some as short a 1 us. 
These > > times seem short, but looking at sys/i386/isa/clock.c, I see that we time > > based off the i8254 timer chip, rather than the processor's TSC. As such, > > are we actually reaching microsecond accuracy, or is the delay actually > > taking longer than expected in many of these cases? > > To answer my own question, there seems to be about 8us of slop added to > each call to DELAY. This is very machine-dependent. On i386's, DELAY(1) reads an i8254 counter twice. This takes a minimum of about 2.5 usec per call. The general overhead of DELAY() is quite large on slow machines (IIRC, DELAY(1) takes about 15 usec on a 486/33, half for reading the counter and half for general overhead). > This seems irrelevant for calls of 100 or 1000, but > changes timing quite a bit on calls between 1 and 10. So, it looks like > rewriting DELAY so that it spin-waits on the TSC for delays of less than > 100 might be useful. Maybe when I get some time... Sorry, this wouldn't be useful. As for sleep(3), system activity may lengthen the delay by an indeterminate amount, unless you disable interrupts, which would be bad. Slow machines may take longer than 1 usec just to call DELAY(). Code should be written to not depend on DELAY() being very accurate. OTOH, the i386 DELAY() could be written better using microuptime(undoc). It would then be much simpler (except possibly for complications to make it work at boot time), and more accurate (except for small intervals on slow machines). 
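A delay loop built on an uptime clock, as suggested above, might look roughly like the following userland sketch. It assumes POSIX `clock_gettime(CLOCK_MONOTONIC)` as a stand-in for the kernel's microuptime(); the names `uptime_ns`, `delay_us` and `measure_delay_ns` are hypothetical and none of this is the i386 DELAY() implementation.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Monotonic uptime in nanoseconds; userland stand-in for microuptime(). */
uint64_t
uptime_ns(void)
{
	struct timespec ts;
	int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(rc == 0);
	(void)rc;
	return ((uint64_t)ts.tv_sec * UINT64_C(1000000000) +
	    (uint64_t)ts.tv_nsec);
}

/* Busy-wait until at least `usec` microseconds of uptime have elapsed. */
void
delay_us(uint64_t usec)
{
	uint64_t start = uptime_ns();

	while (uptime_ns() - start < usec * 1000)
		;	/* spin; other system activity can only lengthen this */
}

/* Return how many nanoseconds a delay_us(usec) call really took. */
uint64_t
measure_delay_ns(uint64_t usec)
{
	uint64_t before = uptime_ns();

	delay_us(usec);
	return (uptime_ns() - before);
}
```

Calling `measure_delay_ns(1)` on a given machine gives a rough feel for the fixed per-call overhead discussed above; the loop guarantees a lower bound only, never an upper one.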
Bruce To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 1:49:54 2002 Delivered-To: freebsd-arch@freebsd.org Received: from srv1.cosmo-project.de (srv1.cosmo-project.de [213.83.6.106]) by hub.freebsd.org (Postfix) with ESMTP id 8097037B405; Thu, 3 Jan 2002 01:49:50 -0800 (PST) Received: (from uucp@localhost) by srv1.cosmo-project.de (8.11.6/8.11.6) with UUCP id g039nJm19351; Thu, 3 Jan 2002 10:49:19 +0100 (CET) (envelope-from ticso@cicely9.cicely.de) Received: from mail.cicely.de (cicely20.cicely.de [10.1.1.22]) by cicely5.cicely.de (8.12.1/8.12.1) with ESMTP id g039gjtx049101; Thu, 3 Jan 2002 10:42:45 +0100 (CET)?g (envelope-from ticso@cicely9.cicely.de) Received: from cicely9.cicely.de (cicely9.cicely.de [10.1.7.11]) by mail.cicely.de (8.11.0/8.11.0) with ESMTP id g039gjW13904; Thu, 3 Jan 2002 10:42:45 +0100 (CET) Received: (from ticso@localhost) by cicely9.cicely.de (8.11.6/8.11.6) id g039gd763659; Thu, 3 Jan 2002 10:42:39 +0100 (CET) (envelope-from ticso) Date: Thu, 3 Jan 2002 10:42:39 +0100 From: Bernd Walter To: Terry Lambert Cc: Matthew Dillon , Alfred Perlstein , John Baldwin , arch@FreeBSD.ORG, Bernd Walter , Mike Smith , Bruce Evans , Michal Mertl , Peter Jeremy Subject: Re: When to use atomic_ functions? 
(was: 64 bit counters) Message-ID: <20020103094239.GH53199@cicely9.cicely.de> References: <200201030002.g0302Eo60575@apollo.backplane.com> <20020102180734.A82406@elvis.mu.org> <200201030012.g030Cgp60752@apollo.backplane.com> <3C33D3E3.38E758CA@mindspring.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3C33D3E3.38E758CA@mindspring.com> User-Agent: Mutt/1.3.24i X-Operating-System: FreeBSD cicely9.cicely.de 5.0-CURRENT alpha Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Wed, Jan 02, 2002 at 07:45:39PM -0800, Terry Lambert wrote: > Matthew Dillon wrote: > > If we standardize the mutex used by interface device-driver interrupts > > (if it isn't already done), the collector could obtain the mutex when > > reading the counter, yes? > > IMO, statistics are snapshots in any case, so it doesn't > really matter if they are incredibly accurate on read. > > In any case, multiple reader, single write doesn't need locks > unless it has to be done atomically. Single write means always with the same CPU or thread. If you always write from the same CPU, the others might keep an old value in their caches for an unknown length of time. You are simply trusting that caches get synchronised automatically by regular events like context switching. I'm not sure this is valid. > An easy way to do this atomically, even for fairly large items, > is to toggle between an active and inactive structure of data, > where the pointer to the structure is assigned atomically. Only if you use rel/acq behaviour - atomic alone isn't strong enough. > I used this property for my zero system call gettimeofday(), > for example.
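The active/inactive toggle with release/acquire ordering can be sketched in portable C11 as below. This is an illustration only, using `<stdatomic.h>` rather than the kernel's atomic(9) functions; `struct stats`, `stats_publish` and `stats_snapshot` are hypothetical names. It assumes a single writer, and it assumes readers finish copying before the writer comes around to reuse the same buffer; with only two buffers a slow reader could otherwise see a half-updated structure.

```c
#include <stdatomic.h>
#include <stdint.h>

struct stats {
	uint64_t packets;
	uint64_t bytes;
};

static struct stats bufs[2];
static _Atomic(struct stats *) active = &bufs[0];

/* Single writer: fill the inactive copy, then publish it with release. */
void
stats_publish(uint64_t packets, uint64_t bytes)
{
	struct stats *cur = atomic_load_explicit(&active, memory_order_relaxed);
	struct stats *next = (cur == &bufs[0]) ? &bufs[1] : &bufs[0];

	next->packets = packets;
	next->bytes = bytes;
	/* Release: the stores above become visible before the new pointer. */
	atomic_store_explicit(&active, next, memory_order_release);
}

/* Any reader: acquire the pointer, then copy a consistent snapshot. */
struct stats
stats_snapshot(void)
{
	const struct stats *p =
	    atomic_load_explicit(&active, memory_order_acquire);

	return (*p);
}
```

The release store pairs with the acquire load, so a reader that sees the new pointer also sees both fields written before it; an atomic pointer store without that ordering would not give this guarantee on a weakly ordered machine such as the alpha.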
-- B.Walter COSMO-Project http://www.cosmo-project.de ticso@cicely.de Usergroup info@cosmo-project.de To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 1:59:38 2002 Delivered-To: freebsd-arch@freebsd.org Received: from srv1.cosmo-project.de (srv1.cosmo-project.de [213.83.6.106]) by hub.freebsd.org (Postfix) with ESMTP id B119F37B416; Thu, 3 Jan 2002 01:59:34 -0800 (PST) Received: (from uucp@localhost) by srv1.cosmo-project.de (8.11.6/8.11.6) with UUCP id g039xJc19506; Thu, 3 Jan 2002 10:59:19 +0100 (CET) (envelope-from ticso@cicely9.cicely.de) Received: from mail.cicely.de (cicely20.cicely.de [10.1.1.22]) by cicely5.cicely.de (8.12.1/8.12.1) with ESMTP id g039sZtx049179; Thu, 3 Jan 2002 10:54:35 +0100 (CET)?g (envelope-from ticso@cicely9.cicely.de) Received: from cicely9.cicely.de (cicely9.cicely.de [10.1.7.11]) by mail.cicely.de (8.11.0/8.11.0) with ESMTP id g039sZW13921; Thu, 3 Jan 2002 10:54:35 +0100 (CET) Received: (from ticso@localhost) by cicely9.cicely.de (8.11.6/8.11.6) id g039sYu63685; Thu, 3 Jan 2002 10:54:34 +0100 (CET) (envelope-from ticso) Date: Thu, 3 Jan 2002 10:54:34 +0100 From: Bernd Walter To: Terry Lambert Cc: Matthew Dillon , John Baldwin , arch@FreeBSD.ORG, Bernd Walter , Mike Smith , Bruce Evans , Michal Mertl , Peter Jeremy Subject: Re: When to use atomic_ functions? 
(was: 64 bit counters) Message-ID: <20020103095433.GI53199@cicely9.cicely.de> References: <200201030002.g0302Eo60575@apollo.backplane.com> <20020103003214.GC53199@cicely9.cicely.de> <3C33D580.50B5BCAA@mindspring.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3C33D580.50B5BCAA@mindspring.com> User-Agent: Mutt/1.3.24i X-Operating-System: FreeBSD cicely9.cicely.de 5.0-CURRENT alpha Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Wed, Jan 02, 2002 at 07:52:32PM -0800, Terry Lambert wrote: > Bernd Walter wrote: > > You need to hold the mutex while writing and reading. > > If you hold the mutex only while writing another CPU might still use > > old cached values. > > > Unless there are two counts that MUST remain synchronized for > correct operation, you don't *care* if someone gets the stale > value. > > Ask yourself: what's the worst case failure scenario that would > result? If I ask for a value I may get a recent value x. If I ask from another CPU later I may get a value older than x. Having slightly out-of-date statistics isn't a problem, but statistics going backwards definitely is.
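One way to keep reported statistics from going backwards, whatever value a particular read happens to observe, is to clamp every report against the highest value already handed out. The sketch below is not something proposed in this thread; it is an assumed illustration in C11, and `last_reported` and `report_monotonic` are hypothetical names.

```c
#include <stdatomic.h>
#include <stdint.h>

_Atomic uint64_t last_reported;	/* highest value handed out so far */

/*
 * Return max(observed, last_reported) and record it, so a stale
 * observation is never reported below an earlier report.
 */
uint64_t
report_monotonic(uint64_t observed)
{
	uint64_t prev =
	    atomic_load_explicit(&last_reported, memory_order_relaxed);

	while (prev < observed) {
		if (atomic_compare_exchange_weak_explicit(&last_reported,
		    &prev, observed, memory_order_acq_rel,
		    memory_order_relaxed))
			return (observed);
		/* CAS failed: prev now holds the current stored value. */
	}
	return (prev);
}
```

The cost is a shared variable touched on every read, which is acceptable for an occasional statistics collector but would defeat the purpose if done on the counter's update path.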
-- B.Walter COSMO-Project http://www.cosmo-project.de ticso@cicely.de Usergroup info@cosmo-project.de To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 2:25:48 2002 Delivered-To: freebsd-arch@freebsd.org Received: from rwcrmhc52.attbi.com (rwcrmhc52.attbi.com [216.148.227.88]) by hub.freebsd.org (Postfix) with ESMTP id 4F5CB37B419; Thu, 3 Jan 2002 02:25:44 -0800 (PST) Received: from peter3.wemm.org ([12.232.27.13]) by rwcrmhc52.attbi.com (InterMail vM.4.01.03.27 201-229-121-127-20010626) with ESMTP id <20020103102543.LJHP6450.rwcrmhc52.attbi.com@peter3.wemm.org>; Thu, 3 Jan 2002 10:25:43 +0000 Received: from overcee.netplex.com.au (overcee.wemm.org [10.0.0.3]) by peter3.wemm.org (8.11.0/8.11.0) with ESMTP id g03APhs32262; Thu, 3 Jan 2002 02:25:43 -0800 (PST) (envelope-from peter@wemm.org) Received: from wemm.org (localhost [127.0.0.1]) by overcee.netplex.com.au (Postfix) with ESMTP id 7737039EC; Thu, 3 Jan 2002 02:25:43 -0800 (PST) (envelope-from peter@wemm.org) X-Mailer: exmh version 2.5 07/13/2001 with nmh-1.0.4 To: Matthew Dillon Cc: John Baldwin , arch@FreeBSD.ORG, Bernd Walter , Mike Smith , Bruce Evans , Michal Mertl , Peter Jeremy Subject: Re: When to use atomic_ functions? (was: 64 bit counters) In-Reply-To: <200201030535.g035Zq062246@apollo.backplane.com> Date: Thu, 03 Jan 2002 02:25:43 -0800 From: Peter Wemm Message-Id: <20020103102543.7737039EC@overcee.netplex.com.au> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Matthew Dillon wrote: > :> Most of the critical section code in the system is going to be at > :> the top level. sti and cli *ARE* expensive, that's why the original > :> spl code went to such great lengths to avoid calling them. 
I > :> believe one or both instructions synchronizes the cpu pipeline, > :> interrupting instruction flow. It is nasty stuff. > : > :Not quite.. We went to extremes to avoid touching the ISA PIC, since that > :meant going over the 4.77Mhz (or 8Mhz) isa bus.. potentially taking many > :hundreds of cpu clock ticks per inb/outb. cli/sti is nothing compared to > :that. The spl code optimized the mask updates on the PIC, not cli/sti > :as such. The APIC is worse since it could take 40+ IO operations to set > :the complete mask, although thankfully at FSB speeds (not > :cpu core speed), not ISA speed. > : > :Cheers, > :-Peter > :-- > :Peter Wemm - peter@FreeBSD.org; peter@yahoo-inc.com; peter@netplex.com.au > > An ISA or PCI bus access is certainly expensive, but I don't see how > it applies to spl*() calls. The original spl*() implementation that we inherited caused every spl call to change the PIC masks. BSDI only implemented lazy masking around BSD/OS 3.0 or so. I'm not sure exactly when bde's implementation went in (it may have got into 386bsd before the fork, or it may have been via the patchkit). We get away with simple assignment in RELENG_4 because we absolutely lock all reentrancy that can touch cpl etc with the giant mp_lock. All spl* calls are inside the giant kernel lock. Anyway, this is heading off on a tangent, but my point was that the objective of the original implementation (that we have a descendant of in RELENG_4) was to stop the spl* functions from hitting hardware, much more so than to minimize cli/sti usage. It just so happens that the implementation didn't need cli/sti very much. The places where those are used haven't changed very much over time (ignoring the *horrible* things that were done when the APIC support was added). > The interrupt handler assembly has to > manipulate the controller no matter how spl*()'s are implemented > and that is where most of the optimizations were made (things like > AUTO_EOI and such).
Since spl*() calls manipulate multiple interrupt > sources I don't think anyone would ever consider actually trying to > screw around with the PIC in spl*(), even if the PIC had been fast. It certainly used to. Many SYSV/386 commercial OS's used to up until (relatively) recently as well. This is going well off topic though. Incidently, probably 90%+ of freebsd boxes (all those that run GENERIC or similar) are essentially wire-oring the interrupt masks together due to the slip/ppp drivers in the kernel. On most of them, splanything() pretty much masks all interrupts. Check tty_imask, net_imask, and bio_imask and see for yourself (and check cambio/camnet as well). We *almost* have a boolean "interrupts on or off" state on most of these systems (not quite but almost). Cheers, -Peter -- Peter Wemm - peter@FreeBSD.org; peter@yahoo-inc.com; peter@netplex.com.au "All of this is for nothing if we don't go to the stars" - JMS/B5 To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 3: 2:12 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16]) by hub.freebsd.org (Postfix) with ESMTP id 9C10037B41C; Thu, 3 Jan 2002 03:02:08 -0800 (PST) Received: from bde.zeta.org.au (bde.zeta.org.au [203.2.228.102]) by mailman.zeta.org.au (8.9.3/8.8.7) with ESMTP id WAA30288; Thu, 3 Jan 2002 22:01:58 +1100 Date: Thu, 3 Jan 2002 23:13:23 +1100 (EST) From: Bruce Evans X-X-Sender: To: Peter Wemm Cc: Matthew Dillon , John Baldwin , , Bernd Walter , Mike Smith , Michal Mertl , Peter Jeremy Subject: Re: When to use atomic_ functions? 
(was: 64 bit counters) In-Reply-To: <20020103102543.7737039EC@overcee.netplex.com.au> Message-ID: <20020103224754.G16354-100000@gamplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Thu, 3 Jan 2002, Peter Wemm wrote: > Incidently, probably 90%+ of freebsd boxes (all those that run GENERIC or > similar) are essentially wire-oring the interrupt masks together due to the > slip/ppp drivers in the kernel. On most of them, splanything() pretty much > masks all interrupts. Check tty_imask, net_imask, and bio_imask and see > for yourself (and check cambio/camnet as well). We *almost* have a boolean Er, I think someone named peter fixed this so that it only happens if slip/ppp is actually used. Only RELENG_3 still has the compile-time wiring for slip. > "interrupts on or off" state on most of these systems (not quite but > almost). Almost all except clock and fast interrupts. This may be best (at least for UP). It's more efficient, and strict interrupt prioritisation is rarely important. Bruce To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 3:42:43 2002 Delivered-To: freebsd-arch@freebsd.org Received: from critter.freebsd.dk (esplanaden.cybercity.dk [212.242.40.114]) by hub.freebsd.org (Postfix) with ESMTP id 4AE8137B405 for ; Thu, 3 Jan 2002 03:42:40 -0800 (PST) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.11.6/8.11.6) with ESMTP id g039Sv600833; Thu, 3 Jan 2002 10:28:57 +0100 (CET) (envelope-from phk@critter.freebsd.dk) To: Bruce Evans Cc: Mike Silbersack , freebsd-arch@FreeBSD.ORG Subject: Re: DELAY accuracy Re: cvs commit: src/sys/dev/usb uhci.c In-Reply-To: Your message of "Thu, 03 Jan 2002 20:07:52 +1100." 
<20020103194429.T15755-100000@gamplex.bde.org> Date: Thu, 03 Jan 2002 10:28:57 +0100 Message-ID: <831.1010050137@critter.freebsd.dk> From: Poul-Henning Kamp Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG In message <20020103194429.T15755-100000@gamplex.bde.org>, Bruce Evans writes: >Code should be written to not depend on >DELAY() being very accurate. OTOH, the i386 DELAY() could be written >better using microuptime(undoc). It would then be much simpler (except >possibly for complications to make it work at boot time), and more >accurate (except for small intervals on slow machines). I have actually thought about DELAY() a fair bit. I agree that code shouldn't depend too much on the accuracy of DELAY() but on the other hand I think we can do much better than we do today. Obviously, nanosleep() will need an MD (machine-dependent) part for short delays, but long delays can be handled MI (machine-independently) in timecounter land, since the timecounters already have hold of the hardware. On the other hand, nanosleep() would mostly be for very short intervals, and the changes that for instance the TSC might experience are minor compared to the interval. Summary: a) A lot more can be done to improve things. b) Not doing so properly discourages people from using it. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence.
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 3:43:37 2002 Delivered-To: freebsd-arch@freebsd.org Received: from critter.freebsd.dk (esplanaden.cybercity.dk [212.242.40.114]) by hub.freebsd.org (Postfix) with ESMTP id 600EB37B405; Thu, 3 Jan 2002 03:43:28 -0800 (PST) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.11.6/8.11.6) with ESMTP id g039Gr600685; Thu, 3 Jan 2002 10:16:53 +0100 (CET) (envelope-from phk@critter.freebsd.dk) To: Josef Karthauser Cc: Bosko Milekic , freebsd-arch@FreeBSD.ORG, freebsd-smp@FreeBSD.ORG Subject: Re: SMPng: Interrupt Latency Issues In-Reply-To: Your message of "Wed, 02 Jan 2002 19:02:53 GMT." <20020102190253.B47550@tao.org.uk> Date: Thu, 03 Jan 2002 10:16:53 +0100 Message-ID: <683.1010049413@critter.freebsd.dk> From: Poul-Henning Kamp Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG In message <20020102190253.B47550@tao.org.uk>, Josef Karthauser writes: >> It has become obvious, recently in particular, that some important >> improvements are required in the way we take interrupts in -CURRENT >> SMPng. As previously mentionned, we are experiencing lousy interrupt >> latency in -CURRENT. This comes as no surprise. > >I'm interested in helping with with, although my experience in this >area is zero ;(. I was waiting with bated breath for answers to >this email and am a bit disappointed that no-one's replied. I hope >that doesn't mean that it's not important to anyone. 
(I was hoping >to learn something :) I think it takes more than a few days to answer meaningfully to something that abstract (I'm sure our local superhero will jump in now and explain how he solved all of this back in the mid eighties on his Amiga :-) There are basically two things to getting good interrupt latency: 1. An architecture making it possible. 2. Care in coding all over the place. I think we are on track to #1. My main contribution will probably be in trying to establish a metrology to document the result, but to the extent time allows I will participate and help where I can. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 3:59:41 2002 Delivered-To: freebsd-arch@freebsd.org Received: from rwcrmhc51.attbi.com (rwcrmhc51.attbi.com [204.127.198.38]) by hub.freebsd.org (Postfix) with ESMTP id 1454537B417; Thu, 3 Jan 2002 03:59:38 -0800 (PST) Received: from peter3.wemm.org ([12.232.27.13]) by rwcrmhc51.attbi.com (InterMail vM.4.01.03.27 201-229-121-127-20010626) with ESMTP id <20020103115937.WLND20119.rwcrmhc51.attbi.com@peter3.wemm.org>; Thu, 3 Jan 2002 11:59:37 +0000 Received: from overcee.netplex.com.au (overcee.wemm.org [10.0.0.3]) by peter3.wemm.org (8.11.0/8.11.0) with ESMTP id g03Bxbs32611; Thu, 3 Jan 2002 03:59:37 -0800 (PST) (envelope-from peter@wemm.org) Received: from wemm.org (localhost [127.0.0.1]) by overcee.netplex.com.au (Postfix) with ESMTP id 3CBFB38CC; Thu, 3 Jan 2002 03:59:37 -0800 (PST) (envelope-from peter@wemm.org) X-Mailer: exmh version 2.5 07/13/2001 with nmh-1.0.4 To: Bruce Evans Cc: Matthew Dillon , John Baldwin , arch@FreeBSD.ORG, Bernd Walter , Mike Smith , Michal Mertl , Peter Jeremy Subject: Re: When to use atomic_ functions? 
(was: 64 bit counters) In-Reply-To: <20020103224754.G16354-100000@gamplex.bde.org> Date: Thu, 03 Jan 2002 03:59:37 -0800 From: Peter Wemm Message-Id: <20020103115937.3CBFB38CC@overcee.netplex.com.au> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Bruce Evans wrote: > On Thu, 3 Jan 2002, Peter Wemm wrote: > > > Incidently, probably 90%+ of freebsd boxes (all those that run GENERIC or > > similar) are essentially wire-oring the interrupt masks together due to the > > slip/ppp drivers in the kernel. On most of them, splanything() pretty much > > masks all interrupts. Check tty_imask, net_imask, and bio_imask and see > > for yourself (and check cambio/camnet as well). We *almost* have a boolean > > Er, I think someone named peter fixed this so that it only happens if > slip/ppp is actually used. Only RELENG_3 still has the compile-time > wiring for slip. I just moved them to the modules themselves. slip does it at first use, but ppp still does it at boot (or load). I thought plip did it too, but it seems to ignore the net/tty problem entirely... 
:-/ Cheers, -Peter -- Peter Wemm - peter@FreeBSD.org; peter@yahoo-inc.com; peter@netplex.com.au "All of this is for nothing if we don't go to the stars" - JMS/B5 To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 4:23:56 2002 Delivered-To: freebsd-arch@freebsd.org Received: from anchor-post-33.mail.demon.net (anchor-post-33.mail.demon.net [194.217.242.91]) by hub.freebsd.org (Postfix) with ESMTP id 1DA1C37B416; Thu, 3 Jan 2002 04:23:52 -0800 (PST) Received: from [62.49.251.130] (helo=herring.nlsystems.com) by anchor-post-33.mail.demon.net with esmtp (Exim 2.12 #1) id 16M6uU-000BmM-0X; Thu, 3 Jan 2002 12:23:50 +0000 Received: from herring (herring [10.0.0.2]) by herring.nlsystems.com (8.11.2/8.11.2) with ESMTP id g03CMY970532; Thu, 3 Jan 2002 12:22:34 GMT (envelope-from dfr@nlsystems.com) Date: Thu, 3 Jan 2002 12:22:34 +0000 (GMT) From: Doug Rabson To: Matthew Dillon Cc: John Baldwin , Peter Jeremy , Michal Mertl , Bruce Evans , Mike Smith , Bernd Walter , Subject: Re: When to use atomic_ functions? (was: 64 bit counters) In-Reply-To: <200201030158.g031wWD61364@apollo.backplane.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Wed, 2 Jan 2002, Matthew Dillon wrote: > :> section is dropped. If an interrupt occurs on the cpu while the current > :> thread is in a critical section, it should be deferred just like we > :> did with _cpl in our spl*() code. > : > :Have you even looked at the current implementation? :) > > Yes, I have John. Perhaps you didn't read the part of my email > that indicated most of the critical sections we have in the system > are top-level sections. Optimizing the nesting is all well and fine > but it isn't going to improve peformance. 
> > Let me put this in perspective, to show you how twisted the code > has become. > > * Because preemption can cause a thread to resume on a different cpu, > the per-cpu area pointer is not consistent and per-cpu variables > must be accessed atomically with the pointer. Even per-cpu variables must often be accessed atomically. The original reason for adding the atomic_ apis was to make updates to counters and flags variables safe in the presence of interrupts (not SMP at all). Non x86 architectures cannot update a memory location with a single instruction and it was necessary to use the atomic_ apis to protect against interrupts on a single cpu. Later on, even x86 needed this when the compiler changed to use read-modify-write sequences for updates to memory location. Whatever you do for SMP and per-cpu stuff, you will never be able to safely write '++*p' on non-x86 architectures. -- Doug Rabson Mail: dfr@nlsystems.com Phone: +44 20 8348 6160 To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 5:25:57 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mongrel.pacific.net.au (mongrel.pacific.net.au [61.8.0.107]) by hub.freebsd.org (Postfix) with ESMTP id 2FA0D37B405; Thu, 3 Jan 2002 05:25:54 -0800 (PST) Received: from dungeon.home (ppp175.dyn248.pacific.net.au [203.143.248.175]) by mongrel.pacific.net.au (8.9.3/8.9.3/Debian 8.9.3-21) with ESMTP id AAA15400; Fri, 4 Jan 2002 00:19:45 +1100 X-Authentication-Warning: mongrel.pacific.net.au: Host ppp175.dyn248.pacific.net.au [203.143.248.175] claimed to be dungeon.home Received: from dungeon.home (localhost [127.0.0.1]) by dungeon.home (8.11.3/8.11.1) with ESMTP id g03DTub12339; Thu, 3 Jan 2002 23:29:56 +1000 (EST) (envelope-from mckay) Message-Id: <200201031329.g03DTub12339@dungeon.home> To: Jordan Hubbard Cc: John Baldwin , re@FreeBSD.ORG, arch@FreeBSD.ORG, Alfred Perlstein , Murray Stokely , mckay@thehub.com.au Subject: 
Re: xfree4 by default? References: <88560.1010014865@winston.freebsd.org> In-Reply-To: <88560.1010014865@winston.freebsd.org> from Jordan Hubbard at "Wed, 02 Jan 2002 15:41:05 -0800" Date: Thu, 03 Jan 2002 23:29:56 +1000 From: Stephen McKay Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Wednesday, 2nd January 2002, Jordan Hubbard wrote: >Sorry, we weren't talking about 4.5, we were talking about satisfying the >"let's cut -stable over immediately after release" request. Well, actually, I was talking about 4.5. But if Mr Price can't flip a switch and pull a rabbit out of a hat, then we have plenty of time to make it work for 4.6. I can't imagine hanging out until 5.0. Stephen. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 5:30:59 2002 Delivered-To: freebsd-arch@freebsd.org Received: from flood.ping.uio.no (flood.ping.uio.no [129.240.78.31]) by hub.freebsd.org (Postfix) with ESMTP id E7BF237B41C; Thu, 3 Jan 2002 05:30:56 -0800 (PST) Received: by flood.ping.uio.no (Postfix, from userid 2602) id 2CC8A14C53; Thu, 3 Jan 2002 14:30:55 +0100 (CET) X-URL: http://www.ofug.org/~des/ X-Disclaimer: The views expressed in this message do not necessarily coincide with those of any organisation or company with which I am or have been affiliated. To: Alfred Perlstein Cc: re@freebsd.org, arch@freebsd.org Subject: Re: xfree4 by default? 
References: <20011231174113.O16101@elvis.mu.org> From: Dag-Erling Smorgrav Date: 03 Jan 2002 14:30:54 +0100 In-Reply-To: <20011231174113.O16101@elvis.mu.org> Message-ID: Lines: 12 User-Agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/21.1 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Alfred Perlstein writes: > What is the proceedure one must follow to present xfree4 as the new > default? It's been around for a long time and support a LOT more > chipsets a lot better. Can we go ahead and just pull some switch > or are there more sinister issues involved? A lot of ports (particularly those that depend on Mesa) don't work with XFree86 4. DES -- Dag-Erling Smorgrav - des@ofug.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 5:38:34 2002 Delivered-To: freebsd-arch@freebsd.org Received: from gw.nectar.cc (gw.nectar.cc [208.42.49.153]) by hub.freebsd.org (Postfix) with ESMTP id A4AB837B41D; Thu, 3 Jan 2002 05:38:28 -0800 (PST) Received: from madman.nectar.cc (madman.nectar.cc [10.0.1.111]) by gw.nectar.cc (Postfix) with ESMTP id 19A0734; Thu, 3 Jan 2002 07:38:28 -0600 (CST) Received: (from nectar@localhost) by madman.nectar.cc (8.11.6/8.11.6) id g03DcQd27768; Thu, 3 Jan 2002 07:38:26 -0600 (CST) (envelope-from nectar) Date: Thu, 3 Jan 2002 07:38:26 -0600 From: "Jacques A. Vidrine" To: Dag-Erling Smorgrav Cc: Alfred Perlstein , re@freebsd.org, arch@freebsd.org Subject: Re: xfree4 by default? 
Message-ID: <20020103133826.GA21628@madman.nectar.cc> References: <20011231174113.O16101@elvis.mu.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.3.25i X-Url: http://www.nectar.cc/ Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Thu, Jan 03, 2002 at 02:30:54PM +0100, Dag-Erling Smorgrav wrote: > A lot of ports (particularly those that depend on Mesa) don't work > with XFree86 4. Really? I haven't noticed any problems, other than some xscreensaver things not working (and I've never looked to see why that was). % pkg_info -I XFree86\* XFree86-4.1.0_10 X11R6.4/XFree86 core distribution (complete) % pkg_info -I Mesa\* Mesa-3.4.2_2 A graphics library similar to SGI's OpenGL % pkg_info -I `cat /var/db/pkg/Mesa-3.4.2_2/+REQUIRED_BY` avifile-0.60.20010920 AVI player/converter with numerous codecs, including MPEG-4 gle-3.0.3 A GL Tubing and Extrusion Library glimmer-1.0.8 A full featured code editor for GNOME desktop with many adv gnome-1.4.1b2_1 The "meta-port" for the GNOME integrated X11 desktop gnome-fifth-toe-1.4.1b2 The "meta-port" for the GNOME "Fifth-Toe" extra package set gtkglarea-1.2.2 An OpenGL widget for the GTK+ GUI toolkit mplayer-0.50.0.2_1 High performance media player that supports many formats py-gnome-1.4.1 A set of Python bindings for GNOME py-gtk-0.6.8 A set of Python bindings for GTK qt-2.3.1 A C++ X GUI toolkit sdl-1.2.3 Cross-platform multi-media development API (developm. vers. smpeg-0.4.4 A free MPEG1 video player library with sound support vlc-0.2.90 An X11 MPEG2 client/server solution xmame-0.53.1 UNIX/X11 port of the Multi Arcade Machine Emulator (MAME) xscreensaver-3.34 save your screen while you entertain your cat -- Jacques A. Vidrine http://www.nectar.cc/ NTT/Verio SME . FreeBSD UNIX . Heimdal Kerberos jvidrine@verio.net . 
nectar@FreeBSD.org . nectar@kth.se To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 6: 4:44 2002 Delivered-To: freebsd-arch@freebsd.org Received: from flood.ping.uio.no (flood.ping.uio.no [129.240.78.31]) by hub.freebsd.org (Postfix) with ESMTP id DB9AD37B417; Thu, 3 Jan 2002 06:04:41 -0800 (PST) Received: by flood.ping.uio.no (Postfix, from userid 2602) id 90C5114C53; Thu, 3 Jan 2002 15:04:40 +0100 (CET) X-URL: http://www.ofug.org/~des/ X-Disclaimer: The views expressed in this message do not necessarily coincide with those of any organisation or company with which I am or have been affiliated. To: "Jacques A. Vidrine" Cc: Alfred Perlstein , re@freebsd.org, arch@freebsd.org Subject: Re: xfree4 by default? References: <20011231174113.O16101@elvis.mu.org> <20020103133826.GA21628@madman.nectar.cc> From: Dag-Erling Smorgrav Date: 03 Jan 2002 15:04:39 +0100 In-Reply-To: <20020103133826.GA21628@madman.nectar.cc> Message-ID: Lines: 9 User-Agent: Gnus/5.0808 (Gnus v5.8.8) Emacs/21.1 MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG "Jacques A. Vidrine" writes: > Really? I haven't noticed any problems, other than some xscreensaver > things not working (and I've never looked to see why that was). Tried Blender lately? 
DES -- Dag-Erling Smorgrav - des@ofug.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 6:10:32 2002 Delivered-To: freebsd-arch@freebsd.org Received: from gw.nectar.cc (gw.nectar.cc [208.42.49.153]) by hub.freebsd.org (Postfix) with ESMTP id 4A78137B416; Thu, 3 Jan 2002 06:10:29 -0800 (PST) Received: from madman.nectar.cc (madman.nectar.cc [10.0.1.111]) by gw.nectar.cc (Postfix) with ESMTP id CA95059; Thu, 3 Jan 2002 08:10:28 -0600 (CST) Received: (from nectar@localhost) by madman.nectar.cc (8.11.6/8.11.6) id g03EASC37101; Thu, 3 Jan 2002 08:10:28 -0600 (CST) (envelope-from nectar) Date: Thu, 3 Jan 2002 08:10:28 -0600 From: "Jacques A. Vidrine" To: Dag-Erling Smorgrav Cc: Alfred Perlstein , re@freebsd.org, arch@freebsd.org Subject: Re: xfree4 by default? Message-ID: <20020103141028.GA37073@madman.nectar.cc> References: <20011231174113.O16101@elvis.mu.org> <20020103133826.GA21628@madman.nectar.cc> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.3.25i X-Url: http://www.nectar.cc/ Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Thu, Jan 03, 2002 at 03:04:39PM +0100, Dag-Erling Smorgrav wrote: > "Jacques A. Vidrine" writes: > > Really? I haven't noticed any problems, other than some xscreensaver > > things not working (and I've never looked to see why that was). > > Tried Blender lately? No, I can't say I have. However, `Blender not working with XFree86 4' is a bit different from `lots of ports don't work with XFree86 4'. The latter doesn't appear to be true. I've been using XFree86 4 since it was XFree86 3.9.15 (two years or so?) without running into troubles with the ports I use (which are fairly numerous). Cheers, -- Jacques A. 
Vidrine http://www.nectar.cc/ NTT/Verio SME . FreeBSD UNIX . Heimdal Kerberos jvidrine@verio.net . nectar@FreeBSD.org . nectar@kth.se To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 9:28:30 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mail5.speakeasy.net (mail5.speakeasy.net [216.254.0.205]) by hub.freebsd.org (Postfix) with ESMTP id 0AF9E37B41C for ; Thu, 3 Jan 2002 09:28:22 -0800 (PST) Received: (qmail 6245 invoked from network); 3 Jan 2002 17:28:20 -0000 Received: from unknown (HELO laptop.baldwin.cx) ([64.81.54.73]) (envelope-sender ) by mail5.speakeasy.net (qmail-ldap-1.03) with SMTP for ; 3 Jan 2002 17:28:20 -0000 Message-ID: X-Mailer: XFMail 1.4.0 on FreeBSD X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: <200201031329.g03DTub12339@dungeon.home> Date: Thu, 03 Jan 2002 09:28:06 -0800 (PST) From: John Baldwin To: Stephen McKay Subject: Re: xfree4 by default? Cc: Murray Stokely , Alfred Perlstein , arch@FreeBSD.ORG, re@FreeBSD.ORG, Jordan Hubbard Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On 03-Jan-02 Stephen McKay wrote: > On Wednesday, 2nd January 2002, Jordan Hubbard wrote: > >>Sorry, we weren't talking about 4.5, we were talking about satisfying the >>"let's cut -stable over immediately after release" request. > > Well, actually, I was talking about 4.5. But if Mr Price can't flip a > switch and pull a rabbit out of a hat, then we have plenty of time > to make it work for 4.6. I can't imagine hanging out until 5.0. It's not just ports. sysinstall would also need work and changing sysinstall this close to release has a bad history. :) > Stephen. -- John Baldwin <>< http://www.FreeBSD.org/~jhb/ "Power Users Use the Power to Serve!" 
- http://www.FreeBSD.org/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 9:28:42 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mail5.speakeasy.net (mail5.speakeasy.net [216.254.0.205]) by hub.freebsd.org (Postfix) with ESMTP id 8B35837B417 for ; Thu, 3 Jan 2002 09:28:24 -0800 (PST) Received: (qmail 6273 invoked from network); 3 Jan 2002 17:28:22 -0000 Received: from unknown (HELO laptop.baldwin.cx) ([64.81.54.73]) (envelope-sender ) by mail5.speakeasy.net (qmail-ldap-1.03) with SMTP for ; 3 Jan 2002 17:28:22 -0000 Message-ID: X-Mailer: XFMail 1.4.0 on FreeBSD X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: <200201030410.g034AkP61865@apollo.backplane.com> Date: Thu, 03 Jan 2002 09:28:09 -0800 (PST) From: John Baldwin To: Matthew Dillon Subject: Re: When to use atomic_ functions? (was: 64 bit counters) Cc: Peter Jeremy , Michal Mertl , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On 03-Jan-02 Matthew Dillon wrote: >:> non-cacheable in terms of code generation. Every time you get it you >:> have to re-obtain it. So a great deal of our code does this in order >:> to work around the lack of optimization: >: >:No, this is incorrect. curthread and curpcb are a bit special here. They >:are >:valid even across migration. > > Yes they are... but the code that retrieves them can't be optimized by > GCC to deal with multiple references. That's one of the problems. To be honest, I find 'td' more readable than 'curthread' so for me this isn't that big of a problem. Are you saying you want the compiler to automatically cache the value in a temporary variable on its own? >:> ... use td multiple times ... 
>: >:This is done cause PCPU_GET() is slow. Perhaps that is what you are trying >:to >:say in that our PCPU_GET() macros are slow? On x86 they are one instruction >:for most cases, and on other archs they are 2 or 3 at most. > > No, what I am saying is that they are designed in a manner that > makes it pretty much impossible for GCC or a human to optimize. The > fact that you cannot get a stable pointer to the per-cpu area (without > being in a critical section) and then access the variables via standard > structural indirection has led to all manner of bloat in the API. > It turns what ought to be a simple matter of a cpu-relative access > into a mess of macros and makes it difficult to take advantage of > the per-cpu data area for anything general purpose. Err, PCPU_PTR returns a pointer. The fact that you have to be in a critical section is a separate issue due to the fact that we allow threads preempted by an interrupt to migrate. Well, there are two choices as far as allowing migration as I see it, either 'critical_exit()' is a migration point as well as any setrunqueue()'s, or you just make setrunqueue()'s migration points and pin a thread when it is preempted by critical_exit(). > If we had conceptually 'instant' critical section handling it would not > be as big a deal, but we have a bad combination here... critical sections > are expensive (almost as expensive as getting a mutex), and it is not > possible to optimize general per-cpu structural references. If we had cold fusion life would be peachy keen as well. :-P > The function/macroness is not the biggest problem here. It's > the cli/sti junk and the general overhead in the code that is the > problem. > What should be an extremely light weight function that nobody should have > a second thought about using when they need it isn't even close to > being light weight. 
Our critical section code could be two > (unsynchronized) instructions on nearly all architectures with > only a moderate (and very well known since it would be similar to > the SPL mechanism) amount of assembly. > > We have the same problem with these functions that we had with the spl*() > code before we optimized it. People were uncomfortable with the routines > and the API and often did all sorts of weird things to try to avoid > them due to their perceived overhead. We had (and still have) > unnecessarily long spl'd loops in -stable because of that. The spl code was only optimized on i386. It wasn't on alpha. However, you are blatantly ignoring one issue that I've brought up several times here: if you don't disable interrupts in a critical section, then you can't use locks in the bottom half of the kernel. However, we _need_ to use the icu_lock to disable an interrupt source. Also, if you want to allow other CPU's to snoop interrupts from other CPU's then you need a lock for that. Which must be acquired when writing the data, not just when the other CPU reads it. >:Fair enough. I suppose we could use a per-thread bitmask of pending >:interrupts. >:Hmm, except that that assumes a dense region of vectors which alpha doesn't >:have. Note that on the alpha we did spl's via the x86 equivalent of cli/sti >:for this reason. On the alpha vector numbers are rather sparse and won't >:fit into a bitmask. You can't use an embedded list in struct ithd in case >:multiple CPU's get the same interrupt in a critical section. Also, now we >:have >:the problem that we can't use the spin lock to protect the ICU on SMP >:systems, >:meaning that we will corrupt the ICU masks now when disabling interrupts >:(which >:we _have_ to do if interrupts are level-triggered as they are for PCI >:interrupts are on alpha). 
> > interrupt code == interrupts disabled on entry usually, which means > you can create a queue in the per-cpu data area of pending interrupts > which are to be executed when the critical section is left. And, I will > also add, that this also greatly improves the flexibility we have in > dealing with interrupts. It would even be possible to have an idle cpu > check an active cpu's pending interrupt queue and steal one. But when I exit the interrupt and return back to the kernel interrupts are enabled again. Combine this with a level-triggered interrupt and you get a deadlock. One idea buzzing around my head is to allow one interrupt in a critical section (saves an ithread pointer in the thread) and then disable interrupts in the trapframe so when we return we run with interrupts disabled until we exit the outermost critical_exit(). This could work as it would allow us to defer the ICU lock as well. This does mean that multiple CPU's might schedule the same ithread (or run the same fast interrupt handler) but I think we will be fine with that. Humm, except that fast interrupt handlers want a trapframe to work with (at least clock interrupts do) so this would break clock interrupts. Eww. This we could work around by special saving a mini-clockframe (that just has the bits of a trapframe that a clock interrupt wants to actually use) along with the ithread pointer and passing it to fast interrupt handlers. Basically an intrframe which is much smaller than a trapframe. OTOH, we could just save a full trapframe though that's a bit large IMO but it would work for the first round without overly complicating things. >:I'm afraid the issues here are a bit more complex than it first seems. >:However, the idea of deferring interrupts is a nice one if we can do it. It >:is >:actually very similar to the way preemption will/would defer switches to >:real-time threads until the outermost critical_exit(). > > Of course we can do it. It's how we do SPLs in -stable after all. 
It > is a very well known and well proven mechanism. This is how you do SPL on _i386_ in stable. >:Err, I was going to use separate procedures precisely to avoid having to >:complicate critical_enter/exit. critical_enter/exit_spinlock or some such >:which would only be used in kern_mutex.c and sys/mutex.h in one place in each >:file. > > Yuch. It is a simple matter to standardize the meaning of > td->td_critnest but the full implementation of 'critical_enter' and > 'critical_exit' should not be split up into MD and MI parts. It just > makes the code less readable. They should simply be MD. Incrementing a counter is MD? :) They are split up like this because critical_exit() is going to grow some more MI code for preemption, and it is going to grow some more MI code for this optimization as well. >:> I don't understand what you are saying here. The algorithm I described >:> is platform independant (just as our spl mechanism was). What more can >:> one ask? It's great.. an API that everyone can depend on performing >:> well, exactly as advertised, across a multitude of platforms. >: >:No, I think your algorithm is very i386 specific to be honest. The spl >:mechanisms differend widely on i386 and alpha. It may be that this code >:really >:needs to be MD as the spl code was instead of MI to optimize it, but we don't >:need it right now. Your argument is that people will not use critical_enter >:because of its implementation details, and I'm saying that people should use >:the API because of the service it presents and should not be worried about >:how >:it is implemented. > > The basic _cpl concept was and still is platform independant. The > concept of 'defering an interrupt' is certainly platform independant. > I don't see how this could be any less independant. Err, the optimization of deferring interrupts is very i386 specific. The original spl's when spl's were first used disabled lower-priority interrupts on the PDP. We do the same on the Alpha. 
It's equivalent to cli/sti except _much_ more expensive. Each spl on the alpha involved 1 or 2 PAL calls. The way you are approaching this is very i386 specific. >:the icu lock, and the problem of figuring a way to store the per-thread state >:of triggered interrupts for each arch. If there is a MI way, that would be >:best. Well, the embedded list of ithreads might work actually. But that >:doesn't fix icu_lock. > > These are all problems we faced before. I had to deal with all of > these issues when I did that first pass on the 386 code when we first > started to scrap the _cpl stuff, incorporate Giant, and implement > the idle thread. Uh. The i386 code on SMP used a mplock that it _disabled interrupts_ with in order to work around the problems I've brought up. Using cli/sti or their equivalents is how we handled these problems before. You weren't crying foul then. -- John Baldwin <>< http://www.FreeBSD.org/~jhb/ "Power Users Use the Power to Serve!" - http://www.FreeBSD.org/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 11:20:34 2002 Delivered-To: freebsd-arch@freebsd.org Received: from rwcrmhc51.attbi.com (rwcrmhc51.attbi.com [204.127.198.38]) by hub.freebsd.org (Postfix) with ESMTP id 32F4E37B617; Thu, 3 Jan 2002 11:20:20 -0800 (PST) Received: from InterJet.elischer.org ([12.232.206.8]) by rwcrmhc51.attbi.com (InterMail vM.4.01.03.27 201-229-121-127-20010626) with ESMTP id <20020103192019.HYWF20119.rwcrmhc51.attbi.com@InterJet.elischer.org>; Thu, 3 Jan 2002 19:20:19 +0000 Received: from localhost (localhost.elischer.org [127.0.0.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id LAA24295; Thu, 3 Jan 2002 11:11:58 -0800 (PST) Date: Thu, 3 Jan 2002 11:11:57 -0800 (PST) From: Julian Elischer To: Terry Lambert Cc: Matthew Dillon , John Baldwin , Peter Jeremy , Michal Mertl , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG Subject: Re: When to use 
atomic_ functions? (was: 64 bit counters) In-Reply-To: <3C33DA68.8E9700D4@mindspring.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG It's interesting that the major antagonists here are all within 70 km (Terry, are you still in Foster City?). Why not just get together with a whiteboard somewhere (here?) and hack out the locking strategies over a few cups of coffee? To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 11:55:54 2002 Delivered-To: freebsd-arch@freebsd.org Received: from swan.prod.itd.earthlink.net (swan.mail.pas.earthlink.net [207.217.120.123]) by hub.freebsd.org (Postfix) with ESMTP id 9574437B405 for ; Thu, 3 Jan 2002 11:55:50 -0800 (PST) Received: from pool0238.cvx40-bradley.dialup.earthlink.net ([216.244.42.238] helo=mindspring.com) by swan.prod.itd.earthlink.net with esmtp (Exim 3.33 #1) id 16MDxr-0002W0-00; Thu, 03 Jan 2002 11:55:48 -0800 Message-ID: <3C34B746.85DF32E6@mindspring.com> Date: Thu, 03 Jan 2002 11:55:50 -0800 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Mike Silbersack Cc: freebsd-arch@freebsd.org Subject: Re: DELAY accuracy Re: cvs commit: src/sys/dev/usb uhci.c References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Mike Silbersack wrote: > To answer my own question, there seems to be about 8us of slop added to > each call to DELAY. This seems irrelevant for calls of 100 or 1000, but > changes timing quite a bit on calls between 1 and 10. 
So, it looks like > rewriting DELAY so that it spin-waits on the TSC for delays of less than > 100 might be useful. Maybe when I get some time... It's common practice in software to have timer intervals that mean "at *least* X intervals", and which have a "slop" of 1-3 full intervals, in order to: 1) Read a timer 2) Wait until the timer changes, in order to find an interval boundary in the hardware 3) Read the timer until the delta specified by the user (rounded up to the timer resolution) has passed This would easily explain the "slop" you are seeing. I think what you want is PC hardware to have higher resolution timer hardware that starts calculating the delta from when it's programmed, and causes an interrupt when the delta interval has completed. As it is, PC hardware has some incredibly *low* resolution timers that can interrupt, and then it has *counters*, which you can read. You should consider your application, and probably change the design. The best approach I've been able to come up with for timers is to create reschedulable one-shots that operate using "opportunistic timers" (search Yahoo for this phrase to find the white papers from Rice University in Texas), that use a high resolution counter (the cycle counter). The problem with this is that DELAY() is used a lot during boot and for device probes (actually, you could run some of these in parallel to get a faster boot, if they were using one-shots instead of delay timers), and you have a nice chicken-and-egg problem getting the other clocks synchronized. Probably, the "timecounters" should be the very first things that are initialized in the OS, after the VM is started, and before the devices are probed; this would be the most flexible approach (as well as hard to implement correctly, unfortunately). 
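The three numbered steps above can be sketched in userland C. This is only an illustration under stated assumptions, not the kernel's DELAY(): clock_gettime(CLOCK_MONOTONIC) stands in for a free-running hardware counter, the 1us tick size is arbitrary, and the names clock_ns(), counter_read(), COUNTER_TICK_NS and delay_at_least() are made up for the example.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical counter resolution: one tick per microsecond. */
#define COUNTER_TICK_NS 1000ULL

/* Raw monotonic time in nanoseconds (userland stand-in for hardware). */
static uint64_t
clock_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ((uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec);
}

/* Simulated low-resolution counter: only whole ticks are visible. */
static uint64_t
counter_read(void)
{
	return (clock_ns() / COUNTER_TICK_NS);
}

/*
 * Spin for at least `usec` microseconds and return the time that
 * actually elapsed, in nanoseconds, measured from function entry.
 */
static uint64_t
delay_at_least(unsigned int usec)
{
	uint64_t entry_ns, first, edge, want_ticks;

	entry_ns = clock_ns();

	/* Round the request up to a whole number of counter ticks. */
	want_ticks = ((uint64_t)usec * 1000ULL + COUNTER_TICK_NS - 1) /
	    COUNTER_TICK_NS;

	/* 1) Read the timer. */
	first = counter_read();

	/* 2) Wait until it changes, to find a tick boundary. */
	while ((edge = counter_read()) == first)
		continue;

	/* 3) Poll until the requested delta has passed since that edge. */
	while (counter_read() - edge < want_ticks)
		continue;

	return (clock_ns() - entry_ns);
}
```

Because the wait starts at a tick boundary and ends at a later one, a caller of delay_at_least(10) gets no less than 10us, plus up to roughly one tick of slop and the polling overhead. That "at least" guarantee is the only property callers should rely on.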
-- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 12: 2:27 2002 Delivered-To: freebsd-arch@freebsd.org Received: from swan.prod.itd.earthlink.net (swan.mail.pas.earthlink.net [207.217.120.123]) by hub.freebsd.org (Postfix) with ESMTP id 5CE7037B41D for ; Thu, 3 Jan 2002 12:02:13 -0800 (PST) Received: from pool0238.cvx40-bradley.dialup.earthlink.net ([216.244.42.238] helo=mindspring.com) by swan.prod.itd.earthlink.net with esmtp (Exim 3.33 #1) id 16ME3t-0003nI-00; Thu, 03 Jan 2002 12:02:01 -0800 Message-ID: <3C34B8BB.BAF3DB2B@mindspring.com> Date: Thu, 03 Jan 2002 12:02:03 -0800 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Poul-Henning Kamp Cc: Bruce Evans , Mike Silbersack , freebsd-arch@FreeBSD.ORG Subject: Re: DELAY accuracy Re: cvs commit: src/sys/dev/usb uhci.c References: <831.1010050137@critter.freebsd.dk> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Poul-Henning Kamp wrote: > I have actually thought about DELAY() a fair bit. Go figure. 8-). (Poul is our timecounter guru). > I agree that code shouldn't depend too much on the accuracy of DELAY() > but on the other hand I think we can do much better than we do today. This is the number one take-away: depend *only* on the DELAY() being *AT LEAST* as long as the specified interval. > Obviously, nanosleep() will need a MD part for short delays, but long > delays can be handled MI in timecounter land, since the timecounters > have already hold of the hardware. > > On the other hand, nanosleep() would mostly be for very short intervals, > and the changes that for instance the TSC might experience are minor > compared to the interval. 
Actually, this misses the middle case, where you want a longer interval, but its termination point still requires very high resolution (yes, I know that "+/-3uS after 6 hours" would be weird, but "+/-3uS" after some interval a couple of orders of magnitude larger than "1uS" does make sense -- e.g. floppy disk and other cruddy hardware). > Summary: > a) A lot more can be done to improve things. > b) Not doing so properly discourages people from using it. Both good things... 8-). -- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 12: 2:39 2002 Delivered-To: freebsd-arch@freebsd.org Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by hub.freebsd.org (Postfix) with ESMTP id CC4A337B438; Thu, 3 Jan 2002 12:02:05 -0800 (PST) Received: (from dillon@localhost) by apollo.backplane.com (8.11.6/8.9.1) id g03K25g72318; Thu, 3 Jan 2002 12:02:05 -0800 (PST) (envelope-from dillon) Date: Thu, 3 Jan 2002 12:02:05 -0800 (PST) From: Matthew Dillon Message-Id: <200201032002.g03K25g72318@apollo.backplane.com> To: John Baldwin Cc: Peter Jeremy , Michal Mertl , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG Subject: Re: When to use atomic_ functions? (was: 64 bit counters) References: Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG :To be honest, I find 'td' more readable then 'curthread' so for me this isn't :that big of a problem. Are you saying you want the compiler to automatically :cache the value in a temporary variable on its own? It sure would be nice. We were able to do that for curproc in -stable before the page table changes. :> No, what I am saying is that they are designed in a manner that :> makes it pretty much impossible for GCC or a human to optimize. 
The
:> fact that you cannot get a stable pointer to the per-cpu area (without
:> being in a critical section) and then access the variables via standard
:> structural indirection has led to all manner of bloat in the API.
:> It turns what ought to be a simple matter of a cpu-relative access
:> into a mess of macros and makes it difficult to take advantage of
:> the per-cpu data area for anything general purpose.
:
:Err, PCPU_PTR returns a pointer. The fact that you have to be in a critical
:section is a separate issue due to the fact that we allow threads preempted by
:an interrupt to migrate. Well, there are two choices as far as allowing
:migration as I see it, either 'critical_exit()' is a migration point as well as
:any setrunqueue()'s, or you just make setrunqueue()'s migration points and pin
:a thread when it is preempted by critical_exit().

Maybe I am not being clear. If critical_enter() and critical_exit()
were considered 'very fast' (aka two instructions inlined for
critical_enter), then not otherwise having a stable per-cpu area is
just fine with me.

But at the moment critical_enter() and critical_exit() are too heavy
weight. They are almost as bad as mutex overhead, in fact. If
critical_enter/exit are going to remain slow then I want the per-cpu
area to be stable.

:...
:> We have the same problem with these functions that we had with the spl*()
:> code before we optimized it. People were uncomfortable with the routines
:> and the API and often did all sorts of weird things to try to avoid
:> them due to their perceived overhead. We had (and still have)
:> unnecessarily long spl'd loops in -stable because of that.
:
:The spl code was only optimized on i386. It wasn't on alpha. However, you are
:blatantly ignoring one issue that I've brought up several times here: if you
:don't disable interrupts in a critical section, then you can't use locks in the
:bottom half of the kernel. However, we _need_ to use the icu_lock to disable
:an interrupt source. Also, if you want to allow other CPU's to snoop
:interrupts from other CPU's then you need a lock for that. Which must be
:acquired when writing the data, not just when the other CPU reads it.

The spl code for the alpha was never finished. It was a hack from the
start and, gee, it's still a hack now. That is no excuse for turning
everything into the lowest common denominator.

When an interrupt actually *occurs*, interrupts are disabled by the
processor. So I don't see why there would be any problem obtaining
the ICU lock inside the interrupt itself. Again, it's exactly the same
problem the spl*() code has - that part of the interrupt runs outside
Giant in -stable too you know.

:...
:> also add, that this also greatly improves the flexibility we have in
:> dealing with interrupts. It would even be possible to have an idle cpu
:> check an active cpu's pending interrupt queue and steal one.
:
:But when I exit the interrupt and return back to the kernel interrupts are
:enabled again. Combine this with a level-triggered interrupt and you get a
:deadlock. One idea buzzing around my head is to allow one interrupt in a
:critical section (saves an ithread pointer in the thread) and then disable
:interrupts in the trapframe so when we return we run with interrupts disabled
:until we exit the outermost critical_exit(). This could work as it would allow
:us to defer the ICU lock as well. This does mean that multiple CPU's might
:schedule the same ithread (or run the same fast interrupt handler) but I think
:we will be fine with that. Humm, except that fast interrupt handlers want a
:trapframe to work with (at least clock interrupts do) so this would break clock
:interrupts. Eww. This we could work around by special saving a

Leaving interrupts disabled on the iret is a trick that has been used
(or at least discussed) before. It would work, but again I don't think
it is necessary. The interrupt can be masked in the ICU while in the
interrupt assembly, just like we do in the SPL code.

:>:Err, I was going to use separate procedures precisely to avoid having to
:>:complicate critical_enter/exit. critical_enter/exit_spinlock or some such
:>:which would only be used in kern_mutex.c and sys/mutex.h in one place in each
:>:file.

I still don't see why you have to play with critical_enter/exit inside
the spin code. All you need to do is check to see if any interrupts
have been queued. If one has been and you are inside a critical section,
there is nothing preventing the spin code from scheduling the deferred
interrupts right there and calling setrunqueue().

:> Yuch. It is a simple matter to standardize the meaning of
:> td->td_critnest but the full implementation of 'critical_enter' and
:> 'critical_exit' should not be split up into MD and MI parts. It just
:> makes the code less readable. They should simply be MD.
:
:Incrementing a counter is MD? :)
:
:They are split up like this because critical_exit() is going to grow some
:more MI code for preemption, and it is going to grow some more MI code for this
:optimization as well.

They are about a dozen lines of code; splitting them up into four files
makes no sense. It just makes it difficult to follow the code.

If the routines were larger then, sure, I can see splitting them.
But they aren't. They are *tiny*, and you have strewn them over four
(or is it five?) files when they should only be in two. You have
reduced the potential performance of the code by introducing a
conditional that some architectures (e.g. i386) can do without.

Again, I am willing to clean all this stuff up.

:...
:>:how
:>:it is implemented.
:>
:> The basic _cpl concept was and still is platform independent. The
:> concept of 'deferring an interrupt' is certainly platform independent.
:> I don't see how this could be any less independent.
:
:Err, the optimization of deferring interrupts is very i386 specific. The
:original spl's when spl's were first used disabled lower-priority interrupts on
:the PDP. We do the same on the Alpha. It's equivalent to cli/sti except
:_much_ more expensive. Each spl on the alpha involved 1 or 2 PAL calls. The
:way you are approaching this is very i386 specific.

I disagree. You can implement the same concept on Alpha too if you want.
I am not an Alpha assembly programmer so I can't do it, but I sure as
hell can fix i386. If someone cares about the Alpha enough there is
nothing stopping them from doing the same optimization for the alpha.

Again, the 'lowest common denominator' API concept is just plain a bad
idea. APIs have to be designed to be able to take advantage of the
strengths of the underlying architectures, not make it impossible to
optimize to those strengths.

:> started to scrap the _cpl stuff, incorporate Giant, and implement
:> the idle thread.
:
:Uh. The i386 code on SMP used a mplock that it _disabled interrupts_ with in
:order to work around the problems I've brought up. Using cli/sti or their
:equivalents is how we handled these problems before. You weren't crying foul
:then.
:
:--
:
:John Baldwin <>< http://www.FreeBSD.org/~jhb/

An mplock is just fine *IF* you don't have to screw around with it
in the critical path. The actual interrupt processing code can afford
to be slightly heavier weight if it can turn all the spl*() (or
critical_enter/exit) routines into extremely light weight entities.

If critical_enter() is just ++td->td_critcount or whatever it's called,
and critical_exit() is something like:

    static __inline critical_exit()
    {
        ...
        if (--td->td_critcount == 0 && td->td_critsignal)
            _critical_exit();    /* more complex processing */
    }

Then we get the best of both worlds. We get nice, fast, streamlined
inlines that nobody will have to think twice about using, and we get
incredible flexibility.
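
A minimal user-space sketch of the fast-path/slow-path split described above, for readers who want to see the counting logic in isolation. The field names td_critcount and td_critsignal come from the message; everything else (struct thread as shown, critical_exit_slow() standing in for _critical_exit(), interrupt_arrives(), td_deferred_runs) is a hypothetical stand-in, and the sketch ignores the real atomicity concerns of an interrupt arriving between the decrement and the test.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-thread state named in the message. */
struct thread {
    int  td_critcount;      /* nesting depth of critical sections */
    bool td_critsignal;     /* set when an interrupt was deferred */
    int  td_deferred_runs;  /* illustration only: slow-path invocations */
};

/* Slow path (stands in for _critical_exit()): handle deferred work. */
static void
critical_exit_slow(struct thread *td)
{
    td->td_critsignal = false;
    td->td_deferred_runs++;
}

/* Fast path: entering a critical section is a single increment. */
static inline void
critical_enter(struct thread *td)
{
    td->td_critcount++;
}

/* Fast path: one decrement and one test; slow path only if signalled. */
static inline void
critical_exit(struct thread *td)
{
    assert(td->td_critcount > 0);
    if (--td->td_critcount == 0 && td->td_critsignal)
        critical_exit_slow(td);
}

/*
 * What an interrupt stub might do on arrival: if the thread is inside a
 * critical section, record the interrupt and return true (deferred);
 * otherwise return false so the caller can handle it immediately.
 */
static inline bool
interrupt_arrives(struct thread *td)
{
    if (td->td_critcount > 0) {
        td->td_critsignal = true;
        return true;
    }
    return false;
}
```

The point of the shape is that nested enter/exit pairs never touch the slow path; only the outermost exit does, and only when something was actually deferred.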
For something like the mutex spin we do:

    if (td->td_critsignal)
        critical_process_signals();

Or whatever we want to call it, which will be responsible for scheduling
any queued interrupts and calling setrunqueue(), for example.

					-Matt
					Matthew Dillon

From owner-freebsd-arch Thu Jan 3 12:25:14 2002
Date: Fri, 04 Jan 2002 07:24:48 +1100
From: Peter Jeremy
Subject: Re: SMPng: Interrupt Latency Issues
In-reply-to: <683.1010049413@critter.freebsd.dk>; from phk@critter.freebsd.dk on Thu, Jan 03, 2002 at 10:16:53AM +0100
To: Poul-Henning Kamp
Cc: freebsd-arch@FreeBSD.ORG, freebsd-smp@FreeBSD.ORG
Message-id: <20020104072448.I561@gsmx07.alcatel.com.au>

On 2002-Jan-03 10:16:53 +0100, Poul-Henning Kamp wrote:
>There are basically two things to getting good interrupt latency:
>
> 1. An architecture making it possible.
...
>I think we are on track to #1.

Does this mean FreeBSD is dropping support for PC-compatible hardware? :-)
(For that matter, PALcode also worsens interrupt latency on the Alpha).

>My main contribution will probably be in trying to establish a
>metrology to document the result, but to the extent time allows I
>will participate and help where I can.

This needs to be cross-platform. We also need a way for developers
to measure (at least approximately) interrupt latency without needing
USD1K timer boards - otherwise they can't assess the impact of any
changes they make without involving you.

For example, if you're using an i8254 as the master clock, you can
get a reasonable estimate of interrupt latency by reading the i8254
counter at the start of hardclock(). This has a (fixed) overhead of
a couple of usec, a resolution of ~840nsec and can only measure
clock interrupt latency in specific (though common) configurations.

A more general approach is to read the TSC as early as
possible in the interrupt process and then again at the beginning of
the device-level interrupt handler. This suffers from the vagaries
of TSC frequency and can't measure the delay from interrupt assertion
to the beginning of the software vector, but has reasonable resolution
and will work on any interrupt and most CPUs.

Both these approaches are probably "Good Enuf" for our purposes. I
don't believe the object is a hard real-time kernel so a few usec
slop in the interrupt latency probably doesn't matter. We just need
to avoid latency spikes in the msec region.

As for hardware approaches: A dedicated timer card like the PCI
Pamette is obviously the best, but most pricey, solution. Maybe one
of the hardware gurus would like to design and manufacture a simple
PCI card that is capable of generating an interrupt and lets you read
a 32-bit time-since-interrupt counter. (Or something a bit fancier
that generates an interrupt, accepts an acknowledge signal and keeps
track of the minimum/maximum/mean/median/variance internally). Both
these approaches can also be done more cheaply and less accurately via
the parallel port.

The oscilloscope approach will let you measure average latencies but,
unless you have a fancy digital CRO or logic analyser, isn't going to
measure outliers - which are at least as important.

Peter

From owner-freebsd-arch Thu Jan 3 12:27: 4 2002
In-Reply-To: <200201032002.g03K25g72318@apollo.backplane.com>
Date: Thu, 03 Jan 2002 12:26:07 -0800 (PST)
From: John Baldwin
To: Matthew Dillon
Subject: Re: When to use atomic_ functions? (was: 64 bit counters)
Cc: arch@FreeBSD.ORG, Bernd Walter, Mike Smith, Bruce Evans, Michal Mertl, Peter Jeremy

On 03-Jan-02 Matthew Dillon wrote:
>:The spl code was only optimized on i386. It wasn't on alpha.
However, you
>:are blatantly ignoring one issue that I've brought up several times here:
>:if you don't disable interrupts in a critical section, then you can't use
>:locks in the bottom half of the kernel. However, we _need_ to use the
>:icu_lock to disable an interrupt source. Also, if you want to allow other
>:CPU's to snoop interrupts from other CPU's then you need a lock for that.
>:Which must be acquired when writing the data, not just when the other CPU
>:reads it.
>
> The spl code for the alpha was never finished. It was a hack from the
> start and, gee, it's still a hack now. That is no excuse for turning
> everything into the lowest common denominator.

We need to make sure we don't design things such that we have to make other
archs look like a 386 to work though.

> When an interrupt actually *occurs*, interrupts are disabled by the
> processor. So I don't see why there would be any problem obtaining
> the ICU lock inside the interrupt itself. Again, it's exactly the same
> problem the spl*() code has - that part of the interrupt runs outside
> Giant in -stable too you know.

Err, what if the top-half code you are interrupting already holds the ICU
lock. What then? You either panic because the lock isn't recursive, or you
corrupt your ICU if you let the lock recurse.

>:think
>:we will be fine with that. Humm, except that fast interrupt handlers want a
>:trapframe to work with (at least clock interrupts do) so this would break
>:clock interrupts. Eww. This we could work around by special saving a
>
> Leaving interrupts disabled on the iret is a trick that has been used
> (or at least discussed) before. It would work, but again I don't think
> it is necessary. The interrupt can be masked in the ICU while in the
> interrupt assembly, just like we do in the SPL code.

Not if we have to defer the ICU stuff because we can't use the ICU lock in
the low-level code. If we have to defer the ICU stuff and have a
level-triggered interrupt we will deadlock. Doing a lazy disabling of
interrupts on the first interrupt in a critical section would work around
this problem while avoiding the cost of disabling interrupts except when the
race condition occurs.

>:>:Err, I was going to use separate procedures precisely to avoid having to
>:>:complicate critical_enter/exit. critical_enter/exit_spinlock or some such
>:>:which would only be used in kern_mutex.c and sys/mutex.h in one place in
>:>:each file.
>
> I still don't see why you have to play with critical_enter/exit inside
> the spin code. All you need to do is check to see if any interrupts
> have been queued. If one has been and you are inside a critical section,
> there is nothing preventing the spin code from scheduling the deferred
> interrupts right there and calling setrunqueue().

*sigh* Nevermind. This idea was to only disable interrupts for spinlock
enter/exits and just increment the counter for other enter/exits.

>:> Yuch. It is a simple matter to standardize the meaning of
>:> td->td_critnest but the full implementation of 'critical_enter' and
>:> 'critical_exit' should not be split up into MD and MI parts. It just
>:> makes the code less readable. They should simply be MD.
>:
>:Incrementing a counter is MD? :)
>:
>:They are split up like this because critical_exit() is going to grow some
>:more MI code for preemption, and it is going to grow some more MI code for
>:this optimization as well.
>
> They are about a dozen lines of code; splitting them up into four files
> makes no sense. It just makes it difficult to follow the code.
>
> If the routines were larger then, sure, I can see splitting them.
> But they aren't. They are *tiny*, and you have strewn them over four
> (or is it five?) files when they should only be in two. You have
> reduced the potential performance of the code by introducing a
> conditional that some architectures (e.g. i386) can do without.

Read what I said. The routines will get larger. Also, they are spread across
2 files right now: kern/kern_switch.c and machine/cpufunc.h.

>:>:how
>:>:it is implemented.
>:>
>:> The basic _cpl concept was and still is platform independent. The
>:> concept of 'deferring an interrupt' is certainly platform independent.
>:> I don't see how this could be any less independent.
>:
>:Err, the optimization of deferring interrupts is very i386 specific. The
>:original spl's when spl's were first used disabled lower-priority interrupts
>:on the PDP. We do the same on the Alpha. It's equivalent to cli/sti except
>:_much_ more expensive. Each spl on the alpha involved 1 or 2 PAL calls. The
>:way you are approaching this is very i386 specific.
>
> I disagree. You can implement the same concept on Alpha too if you want.
> I am not an Alpha assembly programmer so I can't do it, but I sure as
> hell can fix i386. If someone cares about the Alpha enough there is
> nothing stopping them from doing the same optimization for the alpha.

You are missing the point. You are looking at the world as if all machines
work like a 386. They _don't_. When you design things, you have to take into
account other architectures.

> Again, the 'lowest common denominator' API concept is just plain a bad
> idea. APIs have to be designed to be able to take advantage of the
> strengths of the underlying architectures, not make it impossible to
> optimize to those strengths.

That comes from someone who only works on or cares about one arch. :)

Anyways, I think we aren't completely opposed to each other. Lazy disabling
of interrupts would work and wouldn't really cost much since by the time we
do it we have already taken an interrupt. The lazy interrupt disabling would
also work on all the platforms we currently support and I think should work
on ones in the future. It also doesn't need any locks in the bottom half
which is a critical feature. Note that not having locks does place some
restrictions: we can't use a list embedded in struct ithd since bad things
can happen if the same interrupt goes to two CPU's at the same time when both
are in critical sections. Thus, we allow one interrupt then leave interrupts
disabled until we exit the section. Secondly, CPU's can't steal pending
interrupts from each other since that would need a lock.

--

John Baldwin <>< http://www.FreeBSD.org/~jhb/
"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/

From owner-freebsd-arch Thu Jan 3 12:40:28 2002
Date: Thu, 3 Jan 2002 12:40:24 -0800 (PST)
From: Matthew Dillon
Message-Id: <200201032040.g03KeOP72634@apollo.backplane.com>
To: John Baldwin
Cc: arch@FreeBSD.org, Bernd Walter, Mike Smith, Bruce Evans, Michal Mertl, Peter Jeremy
Subject: Re: When to use atomic_ functions? (was: 64 bit counters)

:> When an interrupt actually *occurs*, interrupts are disabled by the
:> processor. So I don't see why there would be any problem obtaining
:> the ICU lock inside the interrupt itself. Again, it's exactly the same
:> problem the spl*() code has - that part of the interrupt runs outside
:> Giant in -stable too you know.
:
:Err, what if the top-half code you are interrupting already holds the ICU lock.
:What then? You either panic because the lock isn't recursive, or you corrupt
:your ICU if you let the lock recurse.

Maybe you need to give me an example of a possible real-life scenario.
I just don't see the problem. ICU manipulation does not occur in the
normal code path, so it is no big deal to disable interrupts for real
if we have to occasionally do it outside the interrupt code itself.

:> reduced the potential performance of the code by introducing a
:> conditional that some architectures (e.g. i386) can do without.
:
:Read what I said. The routines will get larger. Also, they are spread across
:2 files right now: kern/kern_switch.c and machine/cpufunc.h.

By about two lines of code. There is also a prototype in sys/systm.h
though I suppose one could ignore that. Ok, two files... but one is MD
and one is MI. The actual function is still split unnecessarily. Even
my implementation would be in two files, since I would want to optimize
with inlines, but they would be entirely within the MD section of the
system.

:>
:> I disagree. You can implement the same concept on Alpha too if you want.
:> I am not an Alpha assembly programmer so I can't do it, but I sure as
:> hell can fix i386. If someone cares about the Alpha enough there is
:> nothing stopping them from doing the same optimization for the alpha.
:
:You are missing the point. You are looking at the world as if all machines
:work like a 386. They _don't_. When you design things, you have to take into
:account other architectures.

No I am not. I am stating that the concept of a deferred interrupt works
on any architecture, and that we can guarantee certain performance
characteristics by designing an API around that concept. It is no
different from the lowest-common-denominator concept you are using except
that it is far more flexible in regards to implementation details and
it can guarantee far better performance across a wider range of platforms
than what you have in there now.
:That comes from someone who only works on or cares about one arch. :)

Sigh. Look, just because I only know I386 (and 68000, and 6502, and
808x, and a few others) doesn't mean that I have my head in the ground.
I am perfectly well aware of how other cpus are designed, I know how other
archs work, and I am easily as qualified as you are in that regard. Just
because I'm not an expert assembly programmer for three major
architectures does not invalidate my opinion.

					-Matt
					Matthew Dillon

:Anyways, I think we aren't completely opposed to each other. Lazy disabling of
:interrupts would work and wouldn't really cost much since by the time we do it
:we have already taken an interrupt. The lazy interrupt disabling would also
:work on all the platforms we currently support and I think should work on ones
:in the future. It also doesn't need any locks in the bottom half which is a
:critical feature. Note that not having locks does place some restrictions: we
:can't use a list embedded in struct ithd since bad things can happen if the
:same interrupt goes to two CPU's at the same time when both are in critical
:sections. Thus, we allow one interrupt then leave interrupts disabled until we
:exit the section. Secondly, CPU's can't steal pending interrupts from each
:other since that would need a lock.
:
:--
:
:John Baldwin <>< http://www.FreeBSD.org/~jhb/
:"Power Users Use the Power to Serve!" - http://www.FreeBSD.org/

From owner-freebsd-arch Thu Jan 3 12:44:21 2002
Message-ID: <3C34C27C.C7FE6713@mindspring.com>
Date: Thu, 03 Jan 2002 12:43:40 -0800
From: Terry Lambert
To: Bernd Walter
Cc: Matthew Dillon, Alfred Perlstein, John Baldwin, arch@FreeBSD.ORG, Bernd Walter, Mike Smith, Bruce Evans, Michal Mertl, Peter Jeremy
Subject: Re: When to use atomic_ functions? (was: 64 bit counters)
References: <200201030002.g0302Eo60575@apollo.backplane.com> <20020102180734.A82406@elvis.mu.org> <200201030012.g030Cgp60752@apollo.backplane.com> <3C33D3E3.38E758CA@mindspring.com> <20020103094239.GH53199@cicely9.cicely.de>

Bernd Walter wrote:
> > In any case, multiple reader, single write doesn't need locks
> > unless it has to be done atomically.
>
> Single write means always with the same CPU or thread.

Yes, of course.

> If you always write with the same CPU the other might keep an
> old value in his cache for an unknown long time.
> You simply trust that caches get synchronised automatically by
> regular events like context switching.
> I'm not sure this is valid.

Actually, I'd suggest putting it in a non-cacheable page, if
multiple CPUs needed to get at it, and it must be 100% accurate
at all times; if it were in a cacheable page, there would be an
Invalidate (MESI, remember?), and the cache line in the other
CPUs that had it cached would be flushed.

Personally, I'd suggest making the algorithms insensitive to
the data, in the neighborhood of 2 times the update latency.

This is what I suggested to Alfred, when I suggested per CPU
run queues (and three years ago, when we had the threads
design meeting at the Whistle facility FreeBSD user group).

---
Example application and use of variable accuracy counts in
a non-locking scheduler with CPU migration, negaffinity,
and variable scheduling policy
---

Basically, you would not need extremely accurate load numbers
in order to make CPU load balancing decisions, if you did the
load balancing "correctly":

1) Per CPU run queues are created

2) Processes do not migrate themselves between run queues

3) A single "figure of merit" is kept on a per CPU basis,
   where other CPUs can see it; their view of this figure
   potentially contains latency in its accuracy

4) An infrequent (relative to the scheduling quantum) timer
   averages "CPUs of interest" (this permits NUMA and cluster
   based migration, in the future, if desired, by keeping per
   CPU set statistics), and stores this, also, as a "figure of
   merit" which is accessible to the other CPUs. It might also
   keep a list of M CPUs, in sorted order of lowest to highest
   values, to make migration CPU target selection easier.

5) When a CPU is ready to pull a task off its run queue,
   perhaps every N iterations of this procedure, it takes its
   "figure of merit" and compares it against the average
   "figure of merit"; if the per CPU number is not some
   configurable "high water mark" over the average, then
   nothing happens: it simply schedules the task

6) If it decides action needs to be taken, then it polls the
   figures of merit for its interest group. The CPU with the
   lowest "figure of merit" is selected; this may be mediated
   by negaffinity; for example, if the task is a thread in a
   thread group (process), it might have a 32 bit value in the
   process structure indicating which CPUs have threads on
   them currently. In that case, only a CPU without its bit
   set in the 32 bit value could be selected, based on
   relative figure of merit.

7) The "scheduler insertion queue" for the selected target
   CPU (a simple linked list) has the task added to it (the
   task is migrated). This requires a lock on the queue;
   note that only one other CPU is tied up in this process,
   in an N CPU system, and the statistical likelihood of a
   collision is very low.

8) Each CPU going into its scheduler looks first at its per
   CPU scheduler insertion queue; this look does not require
   a lock, in most instances, since all it is doing is a
   check to see if a pointer is NULL. If the pointer is not
   NULL, then (and only then) is a lock on the queue asserted
   by the CPU that "owns" it, and the element is dequeued,
   the lock released, and run.

This is only an example algorithm; but note that it operates
completely without locks, except in the migration case, and
the migration case is a relatively rare event.

This approach gets rid of all the Linux and other OS's *mess*,
where they try to ensure CPU affinity for processes by scheduler
selection: the scheduler algorithm need not be nearly as complex,
in order to achieve the same benefits of negaffinity for CPUs for
threads in a single process, etc. (the more you look at it, the
more benefits you will see).

---

The point of this example is that the issue isn't about how much
or how little locks cost, but about designs that don't require
locks in the common case.

So it really doesn't *matter* what the costs are up front, as
long as the costs are incurred so infrequently that their
amortized value is lower than the lowest cost method being hit
much more frequently.
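
Steps 5 through 8 of the example algorithm can be sketched in user space roughly as follows. This is an illustration under stated assumptions, not anyone's actual scheduler: every name here (NCPU, struct cpu_sched, sched_init(), pick_target(), migrate(), take_migrated()) is hypothetical, the average is computed inline rather than by the infrequent timer of step 4, and the lock is a bare C11 atomic_flag spin.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NCPU 4                       /* illustrative CPU count */

struct task {
    struct task *next;
    int id;
};

struct cpu_sched {
    atomic_int merit;                /* step 3: published figure, may be stale */
    _Atomic(struct task *) insert_q; /* step 7: tasks migrated to this CPU */
    atomic_flag insert_lock;         /* held only while migrating/dequeuing */
};

static struct cpu_sched cpus[NCPU];

static void
sched_init(void)
{
    for (int i = 0; i < NCPU; i++) {
        atomic_init(&cpus[i].merit, 0);
        atomic_init(&cpus[i].insert_q, (struct task *)NULL);
        atomic_flag_clear(&cpus[i].insert_lock);
    }
}

static void
queue_lock(atomic_flag *f)
{
    while (atomic_flag_test_and_set_explicit(f, memory_order_acquire))
        ;                            /* spin; contention should be rare */
}

static void
queue_unlock(atomic_flag *f)
{
    atomic_flag_clear_explicit(f, memory_order_release);
}

/*
 * Steps 5 and 6: return -1 if this CPU is not over the high water mark,
 * else the index of the least loaded CPU whose bit is clear in busy_mask
 * (the negaffinity mask), or -1 if every other CPU is masked out.
 */
static int
pick_target(int self, int high_water, unsigned busy_mask)
{
    int sum = 0, best = -1, best_merit = 0;

    for (int i = 0; i < NCPU; i++)
        sum += atomic_load_explicit(&cpus[i].merit, memory_order_relaxed);
    if (atomic_load_explicit(&cpus[self].merit, memory_order_relaxed) <=
        sum / NCPU + high_water)
        return -1;
    for (int i = 0; i < NCPU; i++) {
        if (i == self || (busy_mask & (1u << i)) != 0)
            continue;
        int m = atomic_load_explicit(&cpus[i].merit, memory_order_relaxed);
        if (best < 0 || m < best_merit) {
            best = i;
            best_merit = m;
        }
    }
    return best;
}

/* Step 7: the migrating CPU pushes the task, taking the target's lock. */
static void
migrate(int target, struct task *t)
{
    assert(target >= 0 && target < NCPU);
    struct cpu_sched *c = &cpus[target];

    queue_lock(&c->insert_lock);
    t->next = atomic_load_explicit(&c->insert_q, memory_order_relaxed);
    atomic_store_explicit(&c->insert_q, t, memory_order_release);
    queue_unlock(&c->insert_lock);
}

/* Step 8: unlocked NULL check first; lock only when something is queued. */
static struct task *
take_migrated(int self)
{
    struct cpu_sched *c = &cpus[self];
    struct task *t;

    if (atomic_load_explicit(&c->insert_q, memory_order_relaxed) == NULL)
        return NULL;                 /* common case: no lock taken */
    queue_lock(&c->insert_lock);
    t = atomic_load_explicit(&c->insert_q, memory_order_relaxed);
    if (t != NULL)
        atomic_store_explicit(&c->insert_q, t->next, memory_order_relaxed);
    queue_unlock(&c->insert_lock);
    return t;
}
```

The merit reads are deliberately relaxed: a slightly stale figure only makes a balancing decision slightly suboptimal, which is the tolerance the example relies on.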

In other words: Matt was right, when he said "Maybe we are
looking at this all wrong ... maybe we don't need locks ..."
(paraphrasing).

> > An easy way to do this atomically, even for fairly large items,
> > is to toggle between an active and inactive structure of data,
> > where the pointer to the structure is assigned atomically.
>
> Only if you use rel/acq behaviour - atomic alone isn't strong enough.

It depends on your cycle time on the objects. So long as the
update/read cost is very small relative to the update/read
frequency, you're OK.

Basically, you need N copies, where the update/read frequency
divided by the update/read cost is greater than N*2.

For my example of a zero system call gettimeofday() and time(),
using a page with PG_U set on it to contain a reflection of the
timecounter data, N == 2 (the minimum N permissible). For
something else, N might be a larger value, and then instead of
two pointers for "current" and "previous", you are talking two
pointers for a two-handed clock through a circularly linked
list of structures containing the reflected data (the hands
are both moved on update-forward, so (1) the data is never
incorrect, and (2) the data is always one-behind during the
update process itself).

If N gets larger than 2, in any case, you probably ought to be
reconsidering your choice of algorithms, anyway.
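
A minimal sketch of the N == 2 toggle discussed above, written with C11 atomics so that the pointer is published with release semantics and read with acquire semantics, as the quoted objection asks for. The names struct timedata, bufs, current, publish() and snapshot() are hypothetical, there is exactly one writer, and the sketch relies on the stated timing assumption: a reader must finish copying before the writer comes back around to the same buffer, otherwise it can observe a half-updated structure.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reflected data; a real timecounter holds more fields. */
struct timedata {
    uint64_t sec;
    uint64_t nsec;
};

static struct timedata bufs[2];                       /* N == 2 copies */
static _Atomic(struct timedata *) current = &bufs[0]; /* active copy */

/* Single writer: fill the inactive copy, then publish it atomically. */
static void
publish(uint64_t sec, uint64_t nsec)
{
    struct timedata *cur =
        atomic_load_explicit(&current, memory_order_relaxed);
    struct timedata *next = (cur == &bufs[0]) ? &bufs[1] : &bufs[0];

    next->sec = sec;
    next->nsec = nsec;
    /* Release: the field stores above become visible before the pointer. */
    atomic_store_explicit(&current, next, memory_order_release);
}

/* Any reader: no lock, just an acquire load and a structure copy. */
static struct timedata
snapshot(void)
{
    const struct timedata *p =
        atomic_load_explicit(&current, memory_order_acquire);

    assert(p != NULL);
    return *p;
}
```

With more than two copies the same idea becomes the circular list with two hands described in the message; the pointer swap stays the only synchronization point.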

-- Terry

From owner-freebsd-arch Thu Jan 3 12:49:45 2002
To: Peter Jeremy
Cc: freebsd-arch@FreeBSD.ORG, freebsd-smp@FreeBSD.ORG
Subject: Re: SMPng: Interrupt Latency Issues
In-Reply-To: Your message of "Fri, 04 Jan 2002 07:24:48 +1100." <20020104072448.I561@gsmx07.alcatel.com.au>
Date: Thu, 03 Jan 2002 21:47:14 +0100
Message-ID: <1744.1010090834@critter.freebsd.dk>
From: Poul-Henning Kamp

In message <20020104072448.I561@gsmx07.alcatel.com.au>, Peter Jeremy writes:
>On 2002-Jan-03 10:16:53 +0100, Poul-Henning Kamp wrote:
>>There are basically two things to getting good interrupt latency:
>>
>> 1. An architecture making it possible.
>...
>>I think we are on track to #1.
>
>Does this mean FreeBSD is dropping support for PC-compatible hardware? :-)
>(For that matter, PALcode also worsens interrupt latency on the Alpha).

No, it means that we try to avoid standardizing on the lousy performance
of the de facto hardware in the vain hope that future hardware designs
will improve in this respect :-)

The worst culprit on i386 these days is SMM/APM/ACPI.

>>My main contribution will probably be in trying to establish a
>>metrology to document the result, but to the extent time allows I
>>will participate and help where I can.
>
>This needs to be cross-platform. We also need a way for developers
>to measure (at least approximately) interrupt latency without needing
>USD1K timer boards - otherwise they can't assess the impact of any
>changes they make without involving you.

Well, sorry, that is the cheapest way I know of how to do it in a
way which measures the desired quality. It's a regular PCI card,
so it can work in other arch's as long as they have PCI slots.

>For example, if you're using an i8254 as the master clock, you can
>get a reasonable estimate of interrupt latency by reading the i8254
>counter at the start of hardclock(). This has a (fixed) overhead of
>a couple of usec, a resolution of ~840nsec and can only measure
>clock interrupt latency in specific (though common) configurations.

This proves nothing as the i8254 is integrated in the southbridge
these days and as such seems to get preferential IRQ treatment.

>A more general approach is to read the TSC as early as
>possible in the interrupt process and then again at the beginning of
>the device-level interrupt handler.

This doesn't measure the time from the actual external event raising
the interrupt line until the CPU arrives in the interrupt handler.
This bit is the one which is hard to measure without special hardware.

>Both these approaches are probably "Good Enuf" for our purposes. I
>don't believe the object is a hard real-time kernel so a few usec
>slop in the interrupt latency probably doesn't matter. We just need
>to avoid latency spikes in the msec region.

If we want to document our latency, this is not "Good Enuf", but
for the average hacker to use as a guide it could be acceptable.

>As for hardware approaches: A dedicated timer card like the PCI
>Pamette is obviously the best, but most pricey, solution. Maybe one
>of the hardware gurus would like to design and manufacture a simple
>PCI card that is capable of generating an interrupt and lets you read
>a 32-bit time-since-interrupt counter. (Or something a bit fancier
>that generates an interrupt, accepts an acknowledge signal and keeps
>track of the minimum/maximum/mean/median/variance internally). Both
>these approaches can also be done more cheaply and less accurately via
>the parallel port.

The card I use is the HOT1 from www.vcc.com which is a darn bit cheaper
than the Pamette.

A custom card should contain:

A) A binary counter of at least 24 bits running from a selectable
   external input or PCI bus clock.

B) A latch which triggers on an external input and captures a copy
   of the counter.

C) A chunk of RAM, optionally battery backed, for ktrace, console
   or "prestoserve" like use.

D) Add more features as you like: serial console for remote access,
   output pin for toggling reset/atx-power etc etc.

A modest sized FPGA chip can do all of this and more.

This card would be not only an excellent test/performance tool but
would also be desirable to have in production systems, increasing
the market a fair bit.

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 12:57:39 2002 Delivered-To: freebsd-arch@freebsd.org Received: from pintail.mail.pas.earthlink.net (pintail.mail.pas.earthlink.net [207.217.120.122]) by hub.freebsd.org (Postfix) with ESMTP id 1693D37B41B; Thu, 3 Jan 2002 12:57:34 -0800 (PST) Received: from pool0238.cvx40-bradley.dialup.earthlink.net ([216.244.42.238] helo=mindspring.com) by pintail.mail.pas.earthlink.net with esmtp (Exim 3.33 #1) id 16MEvY-000292-00; Thu, 03 Jan 2002 12:57:28 -0800 Message-ID: <3C34C5B9.F73A224@mindspring.com> Date: Thu, 03 Jan 2002 12:57:29 -0800 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Bernd Walter Cc: Matthew Dillon , John Baldwin , arch@FreeBSD.ORG, Bernd Walter , Mike Smith , Bruce Evans , Michal Mertl , Peter Jeremy Subject: Re: When to use atomic_ functions? (was: 64 bit counters) References: <200201030002.g0302Eo60575@apollo.backplane.com> <20020103003214.GC53199@cicely9.cicely.de> <3C33D580.50B5BCAA@mindspring.com> <20020103095433.GI53199@cicely9.cicely.de> Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Bernd Walter wrote: > > Unless there are two counts that MUST remain synchronized for > > correct operation, you don't *care* if someone gets the stale > > value. > > > > Ask yourself: what's the worst case failure scenario that would > > result? > > If I ask a value I may get a recent value x. > If I ask with another CPU later I may get an older value than x. > > Having slightly out of date statistics isn't a problem, but > statistics going backwards definitely are.
You've added an implied constraint: that the values must be monotonically increasing when examined at two different times between two different CPUs. I'd say your statistics should be stored in a non-cacheable page. This is sure to make you popular. 8-). Really, what matters is *what are you using them for?*; this is what will dictate the properties they need to have; perhaps the property you think they need is not really the property they need? The place where I see this being an issue, for example, is in the interrupt counts by interrupt. I would argue that these counts should be kept by card instance by driver, and not by interrupt, anyway. By redefining the problem, we end up with a statistic that does not have the problems you cite. If someone needs a "by interrupt" statistic, then it can be obtained at the time it is requested, rather than maintained as a strict count on its own. Thus the penalty of the count synchronization overhead is only taken when such a count is requested, rather than "at all times", by summing the counts of the driver instances attached to the interrupt(s) in question. I would argue that "vmstat -i" is a very rare occurrence, as far as system events go, and its means of keeping statistics are actually obfuscating which card is causing the interrupt, and that you get *better* statistics by keeping them on a per card basis than on a per interrupt basis, so it's a net win to keep the statistics a different way.
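The idea of counting per driver instance and deriving the per-interrupt figure only on request can be sketched roughly as below. This is a single-threaded illustration in plain C, not kernel code; `struct dev_instance`, `dev_attach()`, `dev_intr()`, `irq_total()` and `MAX_DEVS` are hypothetical names invented for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DEVS 8   /* hypothetical limit, for the sketch only */

/* Hypothetical per-device-instance state; only its own handler writes intr_count. */
struct dev_instance {
    int      irq;         /* interrupt line this instance is attached to */
    uint64_t intr_count;  /* bumped by this instance's handler only */
};

static struct dev_instance devs[MAX_DEVS];
static size_t ndevs;

/* Attach a device instance to an interrupt line; returns its index. */
static size_t dev_attach(int irq)
{
    assert(ndevs < MAX_DEVS);
    devs[ndevs].irq = irq;
    devs[ndevs].intr_count = 0;
    return ndevs++;
}

/* Hot path, called from the instance's handler: no shared per-IRQ counter. */
static void dev_intr(size_t idx)
{
    devs[idx].intr_count++;
}

/* Rare path (think "vmstat -i"): derive the per-interrupt total on request. */
static uint64_t irq_total(int irq)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < ndevs; i++)
        if (devs[i].irq == irq)
            sum += devs[i].intr_count;
    return sum;
}
```

The cost of combining counts is paid only in `irq_total()`, and the per-instance numbers also show which card on a shared line is generating the interrupts. On a real SMP system the summed value could be slightly stale, which is the trade-off argued for above.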
-- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 13: 0:33 2002 Delivered-To: freebsd-arch@freebsd.org Received: from pintail.mail.pas.earthlink.net (pintail.mail.pas.earthlink.net [207.217.120.122]) by hub.freebsd.org (Postfix) with ESMTP id C641437B419; Thu, 3 Jan 2002 13:00:29 -0800 (PST) Received: from pool0238.cvx40-bradley.dialup.earthlink.net ([216.244.42.238] helo=mindspring.com) by pintail.mail.pas.earthlink.net with esmtp (Exim 3.33 #1) id 16MEyI-00064T-00; Thu, 03 Jan 2002 13:00:19 -0800 Message-ID: <3C34C664.653465B0@mindspring.com> Date: Thu, 03 Jan 2002 13:00:20 -0800 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Doug Rabson Cc: Matthew Dillon , John Baldwin , Peter Jeremy , Michal Mertl , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG Subject: Re: When to use atomic_ functions? (was: 64 bit counters) References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Doug Rabson wrote: > Whatever you do for SMP and per-cpu stuff, you will never be able to > safely write '++*p' on non-x86 architectures. Which hopefully puts a big honking nail in that coffin... 
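To make the `++*p` point concrete: on a load/store architecture the increment becomes a separate load, add and store, so an interrupt or another CPU can slip in between them. A minimal sketch of a safe increment follows, written with C11 `<stdatomic.h>`, which postdates this thread and is not the FreeBSD atomic(9) interface; `counter`, `counter_inc()` and `counter_read()` are hypothetical names. Relaxed ordering is used on the assumption that a pure statistics counter needs no memory barriers.

```c
#include <stdatomic.h>
#include <stdint.h>

static _Atomic uint32_t counter;

/*
 * Retry loop in the spirit of load-locked/store-conditional: the new
 * value is stored only if nobody changed the counter since we read it.
 */
static void counter_inc(void)
{
    uint32_t old = atomic_load_explicit(&counter, memory_order_relaxed);
    while (!atomic_compare_exchange_weak_explicit(&counter, &old, old + 1,
               memory_order_relaxed, memory_order_relaxed))
        ;   /* 'old' was refreshed by the failed exchange; try again */
}

/* Readers may see a slightly stale value, but never a torn one. */
static uint32_t counter_read(void)
{
    return atomic_load_explicit(&counter, memory_order_relaxed);
}
```

In practice `atomic_fetch_add_explicit()` would do the same job in one call; the explicit loop is shown only to make the read-modify-write window visible.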
-- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 13: 8:36 2002 Delivered-To: freebsd-arch@freebsd.org Received: from pintail.mail.pas.earthlink.net (pintail.mail.pas.earthlink.net [207.217.120.122]) by hub.freebsd.org (Postfix) with ESMTP id 363F437B43B; Thu, 3 Jan 2002 13:08:18 -0800 (PST) Received: from pool0238.cvx40-bradley.dialup.earthlink.net ([216.244.42.238] helo=mindspring.com) by pintail.mail.pas.earthlink.net with esmtp (Exim 3.33 #1) id 16MF5y-0001Zc-00; Thu, 03 Jan 2002 13:08:14 -0800 Message-ID: <3C34C840.6754F85A@mindspring.com> Date: Thu, 03 Jan 2002 13:08:16 -0800 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Julian Elischer Cc: Matthew Dillon , John Baldwin , Peter Jeremy , Michal Mertl , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG Subject: Re: When to use atomic_ functions? (was: 64 bit counters) References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Julian Elischer wrote: > It's interesting that the major antagonists here are all within 70 km > (Terry are you still in Foster City?) Yes... Of course, we are all *pro*tagonists, not *an*tagonists... > why not just get together with a whiteboard somewhere (here?) > and hack out the locking strategies over a few cups of coffee? Like the whole threads thing? I'm hesitant, since so many good ideas come from outside the circle you describe; we really need better virtual presence capability. I count 4 people on the "Cc:" of your (and this) email who would not be able to attend in person, unless they are on holiday in the U.S. right now...
-- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 13:35:48 2002 Delivered-To: freebsd-arch@freebsd.org Received: from niwun.pair.com (niwun.pair.com [209.68.2.70]) by hub.freebsd.org (Postfix) with SMTP id 418FA37B41A for ; Thu, 3 Jan 2002 13:35:42 -0800 (PST) Received: (qmail 72591 invoked by uid 3193); 3 Jan 2002 21:35:40 -0000 Received: from localhost (sendmail-bs@127.0.0.1) by localhost with SMTP; 3 Jan 2002 21:35:40 -0000 Date: Thu, 3 Jan 2002 16:35:40 -0500 (EST) From: Mike Silbersack X-Sender: To: Bruce Evans Cc: Subject: Re: DELAY accuracy Re: cvs commit: src/sys/dev/usb uhci.c In-Reply-To: <20020103194429.T15755-100000@gamplex.bde.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Thu, 3 Jan 2002, Bruce Evans wrote: > On Thu, 3 Jan 2002, Mike Silbersack wrote: > > > This seems irrelevant for calls of 100 or 1000, but > > changes timing quite a bit on calls between 1 and 10. So, it looks like > > rewriting DELAY so that it spin-waits on the TSC for delays of less than > > 100 might be useful. Maybe when I get some time... > > Sorry, this wouldn't be useful. As for sleep(3), system activity may > lengthen the delay by an indeterminate amount, unless you disable > interrupts, which would be bad. Slow machines may take longer than 1 > usec just to call DELAY(). Code should be written to not depend on > DELAY() being very accurate. OTOH, the i386 DELAY() could be written > better using microuptime(undoc). It would then be much simpler (except > possibly for complications to make it work at boot time), and more > accurate (except for small intervals on slow machines). > > Bruce Yeah, that does sound like a nice way to get more accuracy. 
I think that the boot-time considerations could be worked around by polling microuptime only if ticks > hz and using the old routine otherwise. I'm not actually concerned about DELAY() calls being accurate for the sake of the device drivers using such calls, but rather for the sake of the rest of the OS; if a call to DELAY(1) in some commonly called interrupt handler is really taking 8us, that's 7us of extra interrupt latency that we could save. As for the users of large DELAY values (1000 or greater), I wonder if more drastic measures should be taken, such as changing such calls to use tsleep / timeout. It seems like a bad idea to stop the kernel for such periods of time. (Most of these usages seem to be used during device startup and error handling cases. While startup probably occurs only once, I could see how one device driver delaying for a long period of time due to an underrun/overrun could cause another device to do the same.) Mike "Silby" Silbersack To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 13:51:20 2002 Delivered-To: freebsd-arch@freebsd.org Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by hub.freebsd.org (Postfix) with ESMTP id 5641F37B419 for ; Thu, 3 Jan 2002 13:51:17 -0800 (PST) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.11.6/8.11.6) with ESMTP id g03LmoY02973; Thu, 3 Jan 2002 22:48:50 +0100 (CET) (envelope-from phk@critter.freebsd.dk) To: Mike Silbersack Cc: Bruce Evans , freebsd-arch@FreeBSD.ORG Subject: Re: DELAY accuracy Re: cvs commit: src/sys/dev/usb uhci.c In-Reply-To: Your message of "Thu, 03 Jan 2002 16:35:40 EST."
Date: Thu, 03 Jan 2002 22:48:50 +0100 Message-ID: <2971.1010094530@critter.freebsd.dk> From: Poul-Henning Kamp Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG In message , Mike Silbersack writes: >I'm not actually concerned about DELAY() calls being accurate for the sake >of the device drivers using such calls, but rather for the sake of rest of >the OS; if a call to DELAY(1) in some commonly called interrupt handler is >really taking 8us, that's 7us of extra interrupt latency that we could >save. If we look at DELAY(1), which is a very common value, considering the typical use, I suspect it may actually be specified not for the delay as much for various "things to happen", things which might be better provoked by memory barriers or similar. Either way, in i386 I think DELAY(1) would be best implemented as inb(0x80) Arguments for DELAY of 1msec and higher should be converted to tsleep() + HZ=1000. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence.
To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 14:38:11 2002 Delivered-To: freebsd-arch@freebsd.org Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by hub.freebsd.org (Postfix) with ESMTP id 6C53137B41B; Thu, 3 Jan 2002 14:38:09 -0800 (PST) Received: (from dillon@localhost) by apollo.backplane.com (8.11.6/8.9.1) id g03Mbj573076; Thu, 3 Jan 2002 14:37:45 -0800 (PST) (envelope-from dillon) Date: Thu, 3 Jan 2002 14:37:45 -0800 (PST) From: Matthew Dillon Message-Id: <200201032237.g03Mbj573076@apollo.backplane.com> To: Terry Lambert Cc: Bernd Walter , Alfred Perlstein , John Baldwin , arch@FreeBSD.ORG, Bernd Walter , Mike Smith , Bruce Evans , Michal Mertl , Peter Jeremy Subject: Re: When to use atomic_ functions? (was: 64 bit counters) References: <200201030002.g0302Eo60575@apollo.backplane.com> <20020102180734.A82406@elvis.mu.org> <200201030012.g030Cgp60752@apollo.backplane.com> <3C33D3E3.38E758CA@mindspring.com> <20020103094239.GH53199@cicely9.cicely.de> <3C34C27C.C7FE6713@mindspring.com> Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG I would not recommend putting anything in an uncacheable page unless you had no other choice. 
-Matt To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 14:48:50 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16]) by hub.freebsd.org (Postfix) with ESMTP id 5459737B419 for ; Thu, 3 Jan 2002 14:48:47 -0800 (PST) Received: from bde.zeta.org.au (bde.zeta.org.au [203.2.228.102]) by mailman.zeta.org.au (8.9.3/8.8.7) with ESMTP id JAA04081; Fri, 4 Jan 2002 09:48:37 +1100 Date: Fri, 4 Jan 2002 09:49:03 +1100 (EST) From: Bruce Evans X-X-Sender: To: Poul-Henning Kamp Cc: Mike Silbersack , Subject: Re: DELAY accuracy Re: cvs commit: src/sys/dev/usb uhci.c In-Reply-To: <2971.1010094530@critter.freebsd.dk> Message-ID: <20020104094446.N18171-100000@gamplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Thu, 3 Jan 2002, Poul-Henning Kamp wrote: > If we look at DELAY(1), which is a very common value, considering > the typical use, I suspect it may actually be specified not for the > delay as much for various "things to happen", things which might be > better provoked by memory barriers or similar. > > Either way, in i386 I think DELAY(1) would be best implemented as > inb(0x80) This mistake has been made before. inb(0x80) is too fast on some machines. 
Bruce To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 14:53:37 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16]) by hub.freebsd.org (Postfix) with ESMTP id 69A9237B405 for ; Thu, 3 Jan 2002 14:53:34 -0800 (PST) Received: from bde.zeta.org.au (bde.zeta.org.au [203.2.228.102]) by mailman.zeta.org.au (8.9.3/8.8.7) with ESMTP id JAA04543; Fri, 4 Jan 2002 09:53:27 +1100 Date: Fri, 4 Jan 2002 09:53:52 +1100 (EST) From: Bruce Evans X-X-Sender: To: Poul-Henning Kamp Cc: Mike Silbersack , Subject: Re: DELAY accuracy Re: cvs commit: src/sys/dev/usb uhci.c In-Reply-To: <831.1010050137@critter.freebsd.dk> Message-ID: <20020104094951.K18194-100000@gamplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Thu, 3 Jan 2002, Poul-Henning Kamp wrote: > I agree that code shouldn't depend too much on the accuracy of DELAY() > but on the other hand I think we can do much better than we do today. > > Obviously, nanosleep() will need a MD part for short delays, but long > delays can be handled MI in timecounter land, since the timecounters > have already hold of the hardware. > > On the other hand, nanosleep() would mostly be for very short intervals, > and the changes that for instance the TSC might experience are minor > compared to the interval. > > Summary: > a) A lot more can be done to improve things. > b) Not doing so properly discourages people from using it. It is usually a mistake to use it, so nothing (apart from deleting it) should be done to improve it. The same hardware speedups that allow DELAY(1) to be implemented relatively accurately have made 1 usec a relatively long time. 
Bruce To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 14:57:42 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mail12.speakeasy.net (mail12.speakeasy.net [216.254.0.212]) by hub.freebsd.org (Postfix) with ESMTP id 5EDFC37B405 for ; Thu, 3 Jan 2002 14:57:19 -0800 (PST) Received: (qmail 24967 invoked from network); 3 Jan 2002 22:57:17 -0000 Received: from unknown (HELO laptop.baldwin.cx) ([64.81.54.73]) (envelope-sender ) by mail12.speakeasy.net (qmail-ldap-1.03) with SMTP for ; 3 Jan 2002 22:57:17 -0000 Message-ID: X-Mailer: XFMail 1.4.0 on FreeBSD X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: <3C34C27C.C7FE6713@mindspring.com> Date: Thu, 03 Jan 2002 14:57:04 -0800 (PST) From: John Baldwin To: Terry Lambert Subject: Re: When to use atomic_ functions? (was: 64 bit counters) Cc: Peter Jeremy , Michal Mertl , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG, Alfred Perlstein , Matthew Dillon , Bernd Walter Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On 03-Jan-02 Terry Lambert wrote: > --- > Example application and use of variable accuracy counts in > a non-locking scheduler with CPU migration, negaffinity, > and variable scheduling policy > --- > > Basically, you would not need extremely accurate load numbers > in order to make CPU load balancing decisions, if you did the > load balancing "correctly": > > 1) Per CPU run queues are created > 2) Processes do not migrate themselves between run queues > 3) A single "figure of merit" is kept on a per CPU basis, > where other CPUs can see it; their view of this figure > potentially contains latency in its accuracy > 4) An infrequent (relative to the scheduling quantum) > timer averages "CPUs of 
interest" (this permits NUMA > and cluster based migration, in the future, if desired, > by keeping per CPU set statistics, if desired), and > stores this, also, as a "figure of merit" which is > accessible to the other CPUs. It might also keep a > list of M CPUs, in sorted order of lowest to highest > values, to make migration CPU target selection easier. > 5) When a CPU is ready to pull a task off its run queue, > perhaps every N iterations of this procedure, it takes > its "figure of merit" and compares it against the average > "figure of merit"; if the per CPU number is not some > configurable "high water mark" over the average, then > nothing happens: it simply schedules the task > 6) If it decides action needs to be taken, then it polls > the figures of merit for its interest group. The CPU > with the lowest "figure of merit" is selected; this > may be mediated by negaffinity; for example, if the > task is a thread in a thread group (process), it might > have a 32 bit value in the process structure indicating > which CPUs have threads on them currently. In that case, > only a CPU without its bit set in the 32 bit value could > be selected, based on relative figure of merit. > 7) The "scheduler insertion queue" for the selected target > CPU (a simple linked list) has the task added to it (the > task is migrated). This requires a lock on the queue; > note that only one other CPU is tied up in this process, > in an N CPU system, and the statistical likelihood of a > collision is very low. > 8) Each CPU going into its scheduler looks first at its per > CPU scheduler insertion queue; this look does not require > a lock, in most instances, since all it is doing is a > check to see if a pointer is NULL. If the pointer is not > NULL, then (and only then) is a lock on the queue > asserted by the CPU that "owns" it, and the element is > dequeued, the lock released, and run.
Actually, if another CPU can write to this queue, which 7) seems to indicate, then you do need a lock. Well, if you can guarantee that the entire word is always consistent then you may be able to get away with this, but you can read a stale value (you get NULL when another CPU has written to it) and you will miss a task. I suppose that is a losable race as you will run the local task on the next switch. When you do grab a task off your local queue you _will_ need a lock however. Otherwise another CPU could be giving you a task at the same time and depending on how this worked you could end up with a task that gets "lost" and never gets to run. Or worse, you could corrupt the list. Thus, you do always need at least one lock when getting the next task to run. However, that lock may be a per-CPU lock which will be contended for less than a lock on a system-wide queue. > The point of this example is that the issue isn't about how much > or how little locks cost, but about designs that don't require > locks in the common case. Yes, optimizing is good, but getting the design right first is much more important. Optimizations can wait until you have the design closer to finalized. Premature optimization can lock you into bad design decisions that you end up regretting later on. -- John Baldwin <>< http://www.FreeBSD.org/~jhb/ "Power Users Use the Power to Serve!"
- http://www.FreeBSD.org/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 14:58:41 2002 Delivered-To: freebsd-arch@freebsd.org Received: from critter.freebsd.dk (critter.freebsd.dk [212.242.86.163]) by hub.freebsd.org (Postfix) with ESMTP id 8FF6C37B405 for ; Thu, 3 Jan 2002 14:58:38 -0800 (PST) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.11.6/8.11.6) with ESMTP id g03MuJY04586; Thu, 3 Jan 2002 23:56:19 +0100 (CET) (envelope-from phk@critter.freebsd.dk) To: Bruce Evans Cc: Mike Silbersack , freebsd-arch@FreeBSD.ORG Subject: Re: DELAY accuracy Re: cvs commit: src/sys/dev/usb uhci.c In-Reply-To: Your message of "Fri, 04 Jan 2002 09:49:03 +1100." <20020104094446.N18171-100000@gamplex.bde.org> Date: Thu, 03 Jan 2002 23:56:19 +0100 Message-ID: <4584.1010098579@critter.freebsd.dk> From: Poul-Henning Kamp Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG In message <20020104094446.N18171-100000@gamplex.bde.org>, Bruce Evans writes: >On Thu, 3 Jan 2002, Poul-Henning Kamp wrote: > >> If we look at DELAY(1), which is a very common value, considering >> the typical use, I suspect it may actually be specified not for the >> delay as much for various "things to happen", things which might be >> better provoked by memory barriers or similar. >> >> Either way, in i386 I think DELAY(1) would be best implemented as >> inb(0x80) > >This mistake has been made before. inb(0x80) is too fast on some machines. Are you sure ? I have yet to see a machine where 0x80 isn't routed to hardware since it is the "magic" bios-post address... 
-- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 15:16:45 2002 Delivered-To: freebsd-arch@freebsd.org Received: from niwun.pair.com (niwun.pair.com [209.68.2.70]) by hub.freebsd.org (Postfix) with SMTP id 003B237B417 for ; Thu, 3 Jan 2002 15:16:42 -0800 (PST) Received: (qmail 95587 invoked by uid 3193); 3 Jan 2002 23:16:41 -0000 Received: from localhost (sendmail-bs@127.0.0.1) by localhost with SMTP; 3 Jan 2002 23:16:41 -0000 Date: Thu, 3 Jan 2002 18:16:41 -0500 (EST) From: Mike Silbersack X-Sender: To: Poul-Henning Kamp Cc: Bruce Evans , Subject: Re: DELAY accuracy Re: cvs commit: src/sys/dev/usb uhci.c In-Reply-To: <2971.1010094530@critter.freebsd.dk> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Thu, 3 Jan 2002, Poul-Henning Kamp wrote: > If we look at DELAY(1), which is a very common value, considering > the typical use, I suspect it may actually be specified not for the > delay as much for various "things to happen", things which might be > better provoked by memory barriers or similar. Sounds about right. Few of the calls using multiples of ten explained why they were sleeping for the appropriate amount of time. Those with an explicit purpose seem to state so (and are not multiples of ten). > Either way, in i386 I think DELAY(1) would be best implemented as > inb(0x80) Maybe a new call could be added instead? waitforpipelineswritebackspcibuffersandstufftoclear() sounds good. > Arguments for DELAY of 1msec and higher should be converted to > tsleep() + HZ=1000.
Sounds like a good junior kernel hacker task. Mike "Silby" Silbersack To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 15:20:21 2002 Delivered-To: freebsd-arch@freebsd.org Received: from apollo.backplane.com (apollo.backplane.com [216.240.41.2]) by hub.freebsd.org (Postfix) with ESMTP id 51D4437B419; Thu, 3 Jan 2002 15:20:14 -0800 (PST) Received: (from dillon@localhost) by apollo.backplane.com (8.11.6/8.9.1) id g03NKEE73347; Thu, 3 Jan 2002 15:20:14 -0800 (PST) (envelope-from dillon) Date: Thu, 3 Jan 2002 15:20:14 -0800 (PST) From: Matthew Dillon Message-Id: <200201032320.g03NKEE73347@apollo.backplane.com> To: John Baldwin Cc: Terry Lambert , Peter Jeremy , Michal Mertl , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG, Alfred Perlstein , Bernd Walter Subject: Re: When to use atomic_ functions? (was: 64 bit counters) References: Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG

There are a number of ways to do queue management without the use of mutexes or locks or critical sections. The easiest is to use a fixed length FIFO with a separate read and write index.

    struct {
        struct foo *fifo[64];
        int read_index;
        int filler1[(cacheline calculation)];
        int write_index;
        int filler2[(cacheline calculation)];
    } Foo;

If there is only one reader and one writer, only lazy synchronization is necessary and no locks or mutexes are necessary at all. If there are multiple readers or multiple writers then it is possible to use cmpxchg type instructions and still avoid any locks or mutexes, though in this case it is easier to simply use a mutex.
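The single-reader, single-writer case described above could look roughly like the following. This is a sketch in C11 with `<stdatomic.h>`, which postdates the thread; the 64-byte `CACHELINE` value, the `struct foo` payload and the names `fifo_put()` and `fifo_get()` are assumptions made for illustration, and acquire/release ordering stands in for the "lazy synchronization" mentioned in the message.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define FIFO_SIZE 64    /* as in the struct above; must be a power of two */
#define CACHELINE 64    /* assumed cache line size */

struct foo { int id; };  /* placeholder payload type */

struct foo_fifo {
    struct foo *fifo[FIFO_SIZE];
    atomic_uint read_index;                        /* written only by the reader */
    char pad1[CACHELINE - sizeof(atomic_uint)];    /* keep the indices on separate lines */
    atomic_uint write_index;                       /* written only by the writer */
    char pad2[CACHELINE - sizeof(atomic_uint)];
};

/* Writer side only. Returns 0 on success, -1 if the FIFO is full. */
static int fifo_put(struct foo_fifo *q, struct foo *item)
{
    assert(item != NULL);   /* NULL is reserved to mean "empty" in fifo_get() */
    unsigned w = atomic_load_explicit(&q->write_index, memory_order_relaxed);
    unsigned r = atomic_load_explicit(&q->read_index, memory_order_acquire);
    if (w - r == FIFO_SIZE)
        return -1;
    q->fifo[w % FIFO_SIZE] = item;
    /* Publish the slot before the reader can observe the new index. */
    atomic_store_explicit(&q->write_index, w + 1, memory_order_release);
    return 0;
}

/* Reader side only. Returns NULL if the FIFO is empty. */
static struct foo *fifo_get(struct foo_fifo *q)
{
    unsigned r = atomic_load_explicit(&q->read_index, memory_order_relaxed);
    unsigned w = atomic_load_explicit(&q->write_index, memory_order_acquire);
    if (r == w)
        return NULL;
    struct foo *item = q->fifo[r % FIFO_SIZE];
    /* Hand the slot back to the writer only after it has been read. */
    atomic_store_explicit(&q->read_index, r + 1, memory_order_release);
    return item;
}
```

Each index has exactly one writer, so neither side needs a lock. The indices run freely and are reduced modulo `FIFO_SIZE` on access, which lets full (`w - r == FIFO_SIZE`) be told apart from empty (`r == w`) without a spare slot.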
For example, if we have a per-cpu queue of pending interrupts our interrupt handler can 'write' to the queue without any sort of synchronization, mutexes, or locks, and other (idle or in-scheduler) CPUs may compete to read from the queue by obtaining a mutex. -Matt To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 17:10:54 2002 Delivered-To: freebsd-arch@freebsd.org Received: from fledge.watson.org (fledge.watson.org [204.156.12.50]) by hub.freebsd.org (Postfix) with ESMTP id D63B837B41B for ; Thu, 3 Jan 2002 17:10:49 -0800 (PST) Received: from localhost (arr@localhost) by fledge.watson.org (8.11.6/8.11.5) with SMTP id g041Ajg54093; Thu, 3 Jan 2002 20:10:46 -0500 (EST) (envelope-from arr@FreeBSD.org) X-Authentication-Warning: fledge.watson.org: arr owned process doing -bs Date: Thu, 3 Jan 2002 20:10:44 -0500 (EST) From: "Andrew R. Reiter" X-Sender: arr@fledge.watson.org To: Terry Lambert Cc: arch@FreeBSD.org Subject: Re: When to use atomic_ functions? (was: 64 bit counters) In-Reply-To: <3C34C840.6754F85A@mindspring.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG agreed. On Thu, 3 Jan 2002, Terry Lambert wrote: :Julian Elischer wrote: :> It's interesting that the major antagonists here are all within 70 km :> (Terry are you still in Foster City?) : :Yes... : :Of course, we are all *pro*tagonists, not *an*tagonists... : : :> why not just get together with a whiteboard somewhere (here?) :> and hack out the locking strategies over a few cups of coffee? : :Like the whole threads thing? : :I'm hesitant, since so many good ideas come from outside the :circle you describe; we really need better virtual presence :capability.
I count 4 people on the "Cc:" of your (and this) :email who would not be able to attend in person, unless they :are on holiday in the U.S. right now... : :-- Terry : :To Unsubscribe: send mail to majordomo@FreeBSD.org :with "unsubscribe freebsd-arch" in the body of the message : -- Andrew R. Reiter arr@watson.org arr@FreeBSD.org To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 17:33:55 2002 Delivered-To: freebsd-arch@freebsd.org Received: from mailman.zeta.org.au (mailman.zeta.org.au [203.26.10.16]) by hub.freebsd.org (Postfix) with ESMTP id B409C37B417 for ; Thu, 3 Jan 2002 17:33:52 -0800 (PST) Received: from bde.zeta.org.au (bde.zeta.org.au [203.2.228.102]) by mailman.zeta.org.au (8.9.3/8.8.7) with ESMTP id MAA20115; Fri, 4 Jan 2002 12:33:42 +1100 Date: Fri, 4 Jan 2002 12:34:08 +1100 (EST) From: Bruce Evans X-X-Sender: To: Poul-Henning Kamp Cc: Mike Silbersack , Subject: Re: DELAY accuracy Re: cvs commit: src/sys/dev/usb uhci.c In-Reply-To: <4584.1010098579@critter.freebsd.dk> Message-ID: <20020104122618.P18879-100000@gamplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Thu, 3 Jan 2002, Poul-Henning Kamp wrote: > In message <20020104094446.N18171-100000@gamplex.bde.org>, Bruce Evans writes: > >On Thu, 3 Jan 2002, Poul-Henning Kamp wrote: > >> Either way, in i386 I think DELAY(1) would be best implemented as > >> inb(0x80) > > > >This mistake has been made before. inb(0x80) is too fast on some machines. > > Are you sure ? I have yet to see a machine where 0x80 isn't routed > to hardware since it is the "magic" bios-post address... I haven't seen one either, but this behaviour was reported for old machines. 
Perhaps it was actually for 0x84, which was used for "FASTER_NOP" in FreeBSD-1. Support for historical kludges is more standard now, so I wouldn't expect new machines to optimize this. OTOH, the timing for accesses to ordinary "ISA" ports is very machine dependent. Bruce To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 17:57:30 2002 Delivered-To: freebsd-arch@freebsd.org Received: from harrier.prod.itd.earthlink.net (harrier.mail.pas.earthlink.net [207.217.120.12]) by hub.freebsd.org (Postfix) with ESMTP id 8130D37B405; Thu, 3 Jan 2002 17:57:27 -0800 (PST) Received: from pool0602.cvx21-bradley.dialup.earthlink.net ([209.179.194.92] helo=mindspring.com) by harrier.prod.itd.earthlink.net with esmtp (Exim 3.33 #1) id 16MJbh-0004XD-00; Thu, 03 Jan 2002 17:57:17 -0800 Message-ID: <3C350BFB.63EC59C0@mindspring.com> Date: Thu, 03 Jan 2002 17:57:15 -0800 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: John Baldwin Cc: Peter Jeremy , Michal Mertl , Bruce Evans , Mike Smith , Bernd Walter , arch@FreeBSD.ORG, Alfred Perlstein , Matthew Dillon , Bernd Walter Subject: Re: When to use atomic_ functions? (was: 64 bit counters) References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG John Baldwin wrote: > > 7) The "scheduler insertion queue" for the selected target > > CPU (a simple linked list) has the task added to it (the > > task is migrated). This requires a lock on the queue; > > note that only one other CPU is tied up in this process, > > in an N CPU system, and the statistical likelihood of a > > collision is very low. 
> > 8) Each CPU going into its scheduler looks first at its per > > CPU scheduler insertion queue; this look does not require > > a lock, in most instances, since all it is doing is a > > check to see if a pointer is NULL. If the pointer is not > > NULL, then (and only then) is a lock on the queue > > asserted by the CPU that "owns" it, and the element is > > dequeued, the lock released, and run. > > Actually, if another CPU can write to this queue which 7) seems to indicate, > then you do need a lock. Well, if you can guarantee that the entire word is > always consistent then you may be able to get away with this, but you can read > a stale value (you get NULL when another CPU has written to it) and you will > miss a task. I suppose that is a losable race as you will run the local task > on the next switch. Yes, it's a losable race. You care about presence. > When you do grab a task off your local queue you _will_ need a lock > however. Otherwise another CPU could be giving you a task at the > same time and depending on how this worked you could end up with a task that > gets "lost" and never gets to run. Or worse, you could corrupt the list. > Thus, you do always need at least one lock when getting the next task to run. > However, that lock may be a per-CPU lock which will be contended for less than > a lock on a system-wide queue. Yes. See #8... I call that lock out explicitly. And it's only contended when you are "gifted" a new task. > > The point of this example is that the issue isn't about how much > > or how little locks cost, but about designs that don't require > > locks in the common case. > > Yes, optimizing is good, but getting the design right first is much more > important. Optimizations can wait until you have the design closer to > finalized. Premature optimization can lock you into bad design decisions that > you end up regretting later on. I think trying to make the locks very fast is just such a premature optimization. 8-).
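For illustration only, steps 7 and 8 might look roughly like this in portable C11. It is a sketch under stated assumptions, not code from any FreeBSD tree: struct task, struct insertq and the insertq_* functions are hypothetical names, a trivial spin lock stands in for whatever lock the kernel would really use, and the list is pushed and popped at the head (LIFO) to keep it short.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical task and per-CPU insertion queue; names are illustrative. */
struct task {
    struct task *next;
    int id;
};

struct insertq {
    atomic_flag lock;               /* stand-in spin lock for the queue */
    _Atomic(struct task *) head;    /* NULL when nothing has been gifted */
};

static void insertq_init(struct insertq *q)
{
    atomic_flag_clear(&q->lock);
    atomic_init(&q->head, NULL);
}

static void insertq_lock(struct insertq *q)
{
    while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
        ;                           /* spin until the flag was clear */
}

static void insertq_unlock(struct insertq *q)
{
    atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Step 7: another CPU migrates a task to this queue; always locks. */
static void insertq_give(struct insertq *q, struct task *t)
{
    assert(t != NULL);
    insertq_lock(q);
    t->next = atomic_load_explicit(&q->head, memory_order_relaxed);
    atomic_store_explicit(&q->head, t, memory_order_relaxed);
    insertq_unlock(q);
}

/* Step 8: the owning CPU peeks without the lock and only locks when
 * the pointer is non-NULL. A stale NULL is the losable race: the task
 * is simply picked up on a later pass through the scheduler. */
static struct task *insertq_take(struct insertq *q)
{
    struct task *t;

    if (atomic_load_explicit(&q->head, memory_order_relaxed) == NULL)
        return NULL;
    insertq_lock(q);
    t = atomic_load_explicit(&q->head, memory_order_relaxed);
    if (t != NULL) {
        atomic_store_explicit(&q->head, t->next, memory_order_relaxed);
        t->next = NULL;
    }
    insertq_unlock(q);
    return t;
}
```

The unlocked peek only decides whether to bother taking the lock; the list itself is never modified outside the lock, which is what keeps it from being corrupted.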
-- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Thu Jan 3 20:38:22 2002 Delivered-To: freebsd-arch@freebsd.org Received: from rover.village.org (rover.bsdimp.com [204.144.255.66]) by hub.freebsd.org (Postfix) with ESMTP id A49BC37B405 for ; Thu, 3 Jan 2002 20:38:19 -0800 (PST) Received: from harmony.village.org (harmony.village.org [10.0.0.6]) by rover.village.org (8.11.3/8.11.3) with ESMTP id g044cIl50736; Thu, 3 Jan 2002 21:38:18 -0700 (MST) (envelope-from imp@village.org) Received: from localhost (warner@rover2.village.org [10.0.0.1]) by harmony.village.org (8.11.6/8.11.6) with ESMTP id g044cGx04988; Thu, 3 Jan 2002 21:38:16 -0700 (MST) (envelope-from imp@village.org) Date: Thu, 03 Jan 2002 21:38:08 -0700 (MST) Message-Id: <20020103.213808.117908981.imp@village.org> To: silby@silby.com Cc: bde@zeta.org.au, freebsd-arch@FreeBSD.ORG Subject: Re: DELAY accuracy Re: cvs commit: src/sys/dev/usb uhci.c From: "M. Warner Losh" In-Reply-To: References: <20020103194429.T15755-100000@gamplex.bde.org> X-Mailer: Mew version 2.1 on Emacs 21.1 / Mule 5.0 (SAKAKI) Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG In message: Mike Silbersack writes: : As for the users of large DELAY values (1000 or greater), I wonder if more : drastic measures should be taken, such as changing such calls to use : tsleep / timeout. It seems like a bad idea to stop the kernel for such : periods of time. (Most of these usages seem to be used during device : startup and error handling cases. While startup probably occurs only : once, I could see how one device driver delaying for a long period of time : due to an underrun/overrun could cause another device to do the same.) 
We have a routine in our code called tsc_delay (tsc == company name, not pentium counter) that will DELAY if the interrupts aren't running yet (or if the delay is < 1hz), and tsleep if they are. Not universally useful (bad in interrupt handlers :-), but good for some code that has to run either at load time or during the boot process... Warner To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Jan 4 1:49:36 2002 Delivered-To: freebsd-arch@freebsd.org Received: from srv1.cosmo-project.de (srv1.cosmo-project.de [213.83.6.106]) by hub.freebsd.org (Postfix) with ESMTP id 9673D37B405; Fri, 4 Jan 2002 01:49:31 -0800 (PST) Received: (from uucp@localhost) by srv1.cosmo-project.de (8.11.6/8.11.6) with UUCP id g049nIq50937; Fri, 4 Jan 2002 10:49:18 +0100 (CET) (envelope-from ticso@cicely8.cicely.de) Received: from mail.cicely.de (cicely20.cicely.de [10.1.1.22]) by cicely5.cicely.de (8.12.1/8.12.1) with ESMTP id g049iGtx056967; Fri, 4 Jan 2002 10:44:17 +0100 (CET)?g (envelope-from ticso@cicely8.cicely.de) Received: from cicely8.cicely.de (cicely8.cicely.de [10.1.2.10]) by mail.cicely.de (8.11.0/8.11.0) with ESMTP id g049iFW15638; Fri, 4 Jan 2002 10:44:15 +0100 (CET) Received: (from ticso@localhost) by cicely8.cicely.de (8.11.6/8.11.6) id g049i8b08251; Fri, 4 Jan 2002 10:44:08 +0100 (CET) (envelope-from ticso) Date: Fri, 4 Jan 2002 10:44:07 +0100 From: Bernd Walter To: Matthew Dillon Cc: John Baldwin , Terry Lambert , Peter Jeremy , Michal Mertl , Bruce Evans , Mike Smith , arch@FreeBSD.ORG, Alfred Perlstein , Bernd Walter Subject: Re: When to use atomic_ functions? 
(was: 64 bit counters) Message-ID: <20020104104407.D5294@cicely8.cicely.de> References: <200201032320.g03NKEE73347@apollo.backplane.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200201032320.g03NKEE73347@apollo.backplane.com> User-Agent: Mutt/1.3.23i X-Operating-System: FreeBSD cicely8.cicely.de 5.0-CURRENT i386 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Thu, Jan 03, 2002 at 03:20:14PM -0800, Matthew Dillon wrote: > There are a number of ways to do queue management without the use of > mutexes or locks or critical sections. The easiest is to use a fixed > length FIFO with a separate read and write index. > > struct { > struct foo *fifo[64]; > int read_index; > int filler[(cacheline calculation)]; > int write_index; > int filler[(cacheline calculation)]; > } Foo; > > If there is only one reader and one writer, only lazy synchronization > is necessary and no locks or mutexes are necessary at all. This isn't safe on some SMP architectures - e.g. alpha. CPU-A writes fifo and updates write_index. Out of your control, CPU-A now pushes the write_index to memory, but not yet the new data in fifo. CPU-B now sees the new write_index and fetches leftover garbage from the fifo. Even if you use atomic_ calls for write_index this doesn't necessarily influence the fifo data. You have to put a barrier between updating fifo and write_index. And another barrier after reading write_index just before reading fifo. You also create a latency problem as you never know when the reader sees the updated write_index. If you update write_index with an atomic_ call, you request a coherent view of the complete cacheline and the next read request gets the latest value.
> If there are multiple readers or multiple writers then it is possible > to use cmpxchg type instructions and still avoid any locks or mutexes, > though in this case it is easier to simply use a mutex. No. A writer does: read the write_index; write the fifo using the information from the write_index; update the write_index. If these steps aren't protected we end up writing the fifo at an already used location. > For example, if we have a per-cpu queue of pending interrupts our > interrupt handler can 'write' to the queue without any sort of > synchronization, mutexes, or locks, and other (idle or in-scheduler) > cpu's may compete to read from the queue by obtaining a mutex. But you shouldn't forget the barrier in the writer, otherwise the same problem as in the first example arises. And you still never know when another CPU actually sees the change. But you write to the queue in interrupt context. What happens if you write to the fifo and, before you have written the index, a higher-priority interrupt gets handled? Effectively you have multiple writers as in the example before, but can't use mutexes because you are in interrupt context. What is missing here is blocking other interrupts from using the same queue.
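As a rough sketch of where the two barriers go in the single-reader, single-writer case, here is the FIFO written with C11 atomics. The names struct fifo, fifo_put and fifo_get are made up for this example, the cacheline padding from the quoted struct is left out, and release/acquire ordering is used in place of explicit memory barrier calls. It does not address the nested-interrupt case, which still needs interrupts blocked or some other exclusion between writers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define FIFO_SIZE 64u   /* power of two, matching fifo[64] in the quote */

/* Hypothetical single-producer/single-consumer FIFO. */
struct fifo {
    void *slot[FIFO_SIZE];
    _Atomic unsigned read_index;    /* advanced only by the reader */
    _Atomic unsigned write_index;   /* advanced only by the writer */
};

/* Writer side: returns 1 on success, 0 if the FIFO is full. */
static int fifo_put(struct fifo *f, void *item)
{
    unsigned w = atomic_load_explicit(&f->write_index, memory_order_relaxed);
    unsigned r = atomic_load_explicit(&f->read_index, memory_order_acquire);

    if (w - r == FIFO_SIZE)
        return 0;
    f->slot[w % FIFO_SIZE] = item;
    /* Release: the slot contents become visible before the new index. */
    atomic_store_explicit(&f->write_index, w + 1, memory_order_release);
    return 1;
}

/* Reader side: returns the oldest item, or NULL if the FIFO is empty. */
static void *fifo_get(struct fifo *f)
{
    unsigned r = atomic_load_explicit(&f->read_index, memory_order_relaxed);
    /* Acquire: the slot is read only after the index that covers it. */
    unsigned w = atomic_load_explicit(&f->write_index, memory_order_acquire);
    void *item;

    if (r == w)
        return NULL;
    item = f->slot[r % FIFO_SIZE];
    /* Release: the slot is not handed back to the writer too early. */
    atomic_store_explicit(&f->read_index, r + 1, memory_order_release);
    return item;
}
```

The indices are free-running unsigned counters, so full and empty are told apart by their difference instead of by wasting a slot.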
-- B.Walter COSMO-Project http://www.cosmo-project.de ticso@cicely.de Usergroup info@cosmo-project.de To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Jan 4 3:19:50 2002 Delivered-To: freebsd-arch@freebsd.org Received: from srv1.cosmo-project.de (srv1.cosmo-project.de [213.83.6.106]) by hub.freebsd.org (Postfix) with ESMTP id DDF3937B405; Fri, 4 Jan 2002 03:19:43 -0800 (PST) Received: (from uucp@localhost) by srv1.cosmo-project.de (8.11.6/8.11.6) with UUCP id g04BJG452612; Fri, 4 Jan 2002 12:19:16 +0100 (CET) (envelope-from ticso@cicely8.cicely.de) Received: from mail.cicely.de (cicely20.cicely.de [10.1.1.22]) by cicely5.cicely.de (8.12.1/8.12.1) with ESMTP id g04BF6tx057433; Fri, 4 Jan 2002 12:15:07 +0100 (CET)?g (envelope-from ticso@cicely8.cicely.de) Received: from cicely8.cicely.de (cicely8.cicely.de [10.1.2.10]) by mail.cicely.de (8.11.0/8.11.0) with ESMTP id g04BF6W15714; Fri, 4 Jan 2002 12:15:06 +0100 (CET) Received: (from ticso@localhost) by cicely8.cicely.de (8.11.6/8.11.6) id g04BF1q08440; Fri, 4 Jan 2002 12:15:01 +0100 (CET) (envelope-from ticso) Date: Fri, 4 Jan 2002 12:15:01 +0100 From: Bernd Walter To: Terry Lambert Cc: Bernd Walter , Matthew Dillon , Alfred Perlstein , John Baldwin , arch@FreeBSD.ORG, Mike Smith , Bruce Evans , Michal Mertl , Peter Jeremy Subject: Re: When to use atomic_ functions? 
(was: 64 bit counters) Message-ID: <20020104121500.E5294@cicely8.cicely.de> References: <200201030002.g0302Eo60575@apollo.backplane.com> <20020102180734.A82406@elvis.mu.org> <200201030012.g030Cgp60752@apollo.backplane.com> <3C33D3E3.38E758CA@mindspring.com> <20020103094239.GH53199@cicely9.cicely.de> <3C34C27C.C7FE6713@mindspring.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3C34C27C.C7FE6713@mindspring.com> User-Agent: Mutt/1.3.23i X-Operating-System: FreeBSD cicely8.cicely.de 5.0-CURRENT i386 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Thu, Jan 03, 2002 at 12:43:40PM -0800, Terry Lambert wrote: > Bernd Walter wrote: > > > In any case, multiple reader, single write doesn't need locks > > > unless it has to be done atomically. > > > > Single write means always with the same CPU or thread. > > Yes, of course. > > > If you always write with the same CPU the other might keep an > > old value in his cache for an unknown long time. > > You simply trust that caches gets syncronised automaticaly by > > regular events like context switching. > > I'm not shure this is valid. > > Actually, I'd suggest putting it in a non-cacheable page, if > multiple CPUs needed to get at it, and it must be 100% accurate > at all times; if it were in a cacheable page, there would be an > Invalidate (MESI, remember?), and the cache line in the other > CPUs that had it cached would be flushed. Only if you have a coherent cache design. On alphas you explicitly have to request that either based on the cacheline, what happens with ldx_l/stx_c, or globaly. Using non-cacheable is worse than using syncronisation primitives. > Personally, I'd suggest making the algorithms insensitive to > the data, in the neighborhood of 2 times the update latency. 
> > This is what I suggested to Alfred, when I suggested per CPU > run queues (and three years ago, when we had the threads > design meeting at the Whistle facility FreeBSD user group). I'm not commenting on your example, because I'm far away from knowing the current alternative. > --- > > The point of this example is that the issue isn't about how much > or how little locks cost, but about designs that don't require > locks in the common case. > > So it really doesn't *matter* what the costs are up front, as > long as the costs are incurred so infrequently that their > amortized value is lower than the lowest cost method being hit > much more frequently. Right - as long as the design is still correct. > In other words: Matt was right, when he said "Maybe we are looking > at this all wrong ... maybe we don't need locks ..." (paraphrasing). We have address 0x0 with initial value 1 and address 0x58 with initial value 2. Remember that this scenario is possible: CPU-A: write value 3 to 0x0, then write value 4 to 0x58. CPU-B: read value 4 from 0x58, then read value 1 from 0x0. > > > An easy way to do this atomically, even for fairly large items, > > > is to toggle between an active and inactive structure of data, > > > where the pointer to the structure is assigned atomically. > > > > Only if you use rel/acq behaviour - atomic alone isn't strong enough. > > It depends on your cycle time on the objects. So long as the > update/read cost is very small relative to the update/read > frequency, you're OK. On alpha you only have special cost for the first access in a given cache-line. You can safely access it as often as you want as long as it's still locked in the cache without any performance loss other than strict ordering of the locked/conditional instructions. What you really want to avoid is the cache-line bouncing between CPUs. But unless you 100% exclude another CPU accessing your values in any way you still need to care about synchronisation.
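To make the rel/acq point concrete, here is a minimal sketch of the active/inactive toggle in C11. Everything in it is hypothetical: struct timedata, copies, active, publish and snapshot are names invented for the example, and it assumes a single writer. With only two copies a reader that stalls across two updates can still see a copy being rewritten, which is the reason for the "N copies" discussion in the quoted text.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical data that must be seen as one consistent unit. */
struct timedata {
    uint64_t sec;
    uint64_t frac;
};

static struct timedata copies[2];
static _Atomic(struct timedata *) active = &copies[0];

/* Single writer: fill the inactive copy, then switch the pointer. */
static void publish(uint64_t sec, uint64_t frac)
{
    struct timedata *cur = atomic_load_explicit(&active, memory_order_relaxed);
    struct timedata *next = (cur == &copies[0]) ? &copies[1] : &copies[0];

    next->sec = sec;
    next->frac = frac;
    /* Release: both fields are visible before the pointer changes. */
    atomic_store_explicit(&active, next, memory_order_release);
}

/* Any reader: load the pointer with acquire, then copy the contents. */
static struct timedata snapshot(void)
{
    struct timedata *p = atomic_load_explicit(&active, memory_order_acquire);

    return *p;
}
```

Without the release/acquire pair, the 0x0/0x58 scenario above applies to the pointer and the fields it points at: a reader could see the new pointer and still read old field values.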
> Basically, you need N copies, where the update/read frequency > divided by the update/read cost is greater than N*2. For my > example of a zero system call gettimeofday() and time(), using > a page with PG_U set on it to contain a reflection of the > timecounter data, N == 2 (the minimum N permissable). For > something else, N might be a larger value, and then instead of > two pointers for "current" and "previous", you are talking two > pointers for a two-handed clock through a circularly linked > list of structures containing the reflected data (the hands > are both moved on update-forward, so (1) the data is never > incorrect, and (2) the data is always one-behind during the > update process itself). Sorry - I can't follow you here enough to even see if it would at least work correct. -- B.Walter COSMO-Project http://www.cosmo-project.de ticso@cicely.de Usergroup info@cosmo-project.de To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Jan 4 4:58:35 2002 Delivered-To: freebsd-arch@freebsd.org Received: from prg.traveller.cz (prg.traveller.cz [193.85.2.77]) by hub.freebsd.org (Postfix) with ESMTP id 2EB0037B417 for ; Fri, 4 Jan 2002 04:58:31 -0800 (PST) Received: from prg.traveller.cz (localhost [127.0.0.1]) by prg.traveller.cz (8.12.1[KQ-CZ](1)/8.12.1/pukvis) with ESMTP id g04CwTlk041836 for ; Fri, 4 Jan 2002 13:58:29 +0100 (CET) Received: from localhost (mime@localhost) by prg.traveller.cz (8.12.1[KQ-CZ](1)/pukvis) with ESMTP id g04CwTQR041833 for ; Fri, 4 Jan 2002 13:58:29 +0100 (CET) Date: Fri, 4 Jan 2002 13:58:29 +0100 (CET) From: Michal Mertl To: arch@freebsd.org Subject: back to STABLE 64 bit counters Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG I can't say much about 
the recent discussions about solving the problem on current but my focus is primarily on production machines running 4.X. I'd like to ask some questions. Talking _STABLE_ only. 4.X runs UP on i386 and Alpha. SMP is available on >=i586. Right? Current interface and protocol counters are defined as unsigned int or unsigned long (64 bit on Alpha?) and accessed with regular C operations. I won't talk much about Alpha because I don't know anything about that. But because it's 64 bit, it shouldn't make much difference if we use 64 bits instead of 32 bits AFAIK. Should i386 work with the current STABLE implementation even in SMP? If yes, why? Is it because of some big locking? I think the compiler can sometimes generate read/modify/write operations for i+=j. If that is so, I don't think there will be much difference even when using the compiler's builtin long long. Even if there were a bigger chance of u_int64_t operations (e.g. addl&adcl) being interrupted, I think that with small changes of values even multiple operations taking place at the same time shouldn't result in a bad value (in the case of addl&adcl). To ensure only small changes are happening we could have functions like counter_add(u_int64_t *, u_int32_t). I'll try to apply my patch (the one using lock;cmpxchg8b) to a STABLE SMP machine, see if it gets noticeably slower or has higher cpu usage with lots of interrupts, and tell you the results. If I don't see the slowdown I think we could really use atomic ops (and I suppose on UP Alpha just normal addition). If there's a good chance the talks about CURRENT can lead to a working implementation on STABLE in the foreseeable future I don't care that much, but otherwise I vote for implementing these on STABLE differently.
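A counter_add() of the shape suggested above could be sketched with a compare-and-swap loop. This is not the patch mentioned in this mail; it is an illustration in C11, where counter64_t and counter_read are assumed names and the compiler is left to pick the instruction (on i586 and later one would typically expect lock;cmpxchg8b, but that is an assumption about the toolchain). Relaxed ordering is used on the view that a statistics counter does not need to order any other memory accesses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical 64-bit statistics counter type. */
typedef _Atomic uint64_t counter64_t;

/* Add a 32-bit delta to a 64-bit counter with a CAS retry loop. */
static void counter_add(counter64_t *c, uint32_t delta)
{
    uint64_t old = atomic_load_explicit(c, memory_order_relaxed);

    /* On failure 'old' is reloaded with the current value and the sum
     * is recomputed, so concurrent increments are never lost. */
    while (!atomic_compare_exchange_weak_explicit(c, &old, old + delta,
        memory_order_relaxed, memory_order_relaxed))
        ;
}

/* Read the whole 64-bit value in one atomic operation. */
static uint64_t counter_read(counter64_t *c)
{
    return atomic_load_explicit(c, memory_order_relaxed);
}
```

The point of the loop is that the carry from the low 32 bits into the high 32 bits happens inside one atomic update, which a separate addl/adcl pair cannot guarantee on SMP.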
-- Michal Mertl mime@traveller.cz To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Jan 4 18:49:33 2002 Delivered-To: freebsd-arch@freebsd.org Received: from rwcrmhc53.attbi.com (rwcrmhc53.attbi.com [204.127.198.39]) by hub.freebsd.org (Postfix) with ESMTP id F41BD37C05B for ; Fri, 4 Jan 2002 18:40:08 -0800 (PST) Received: from InterJet.elischer.org ([12.232.206.8]) by rwcrmhc53.attbi.com (InterMail vM.4.01.03.27 201-229-121-127-20010626) with ESMTP id <20020105024007.NABW20122.rwcrmhc53.attbi.com@InterJet.elischer.org> for ; Sat, 5 Jan 2002 02:40:07 +0000 Received: from localhost (localhost.elischer.org [127.0.0.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id SAA30686 for ; Fri, 4 Jan 2002 18:30:44 -0800 (PST) Date: Fri, 4 Jan 2002 18:30:43 -0800 (PST) From: Julian Elischer To: arch@freebsd.org Subject: freeing thread structures. Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Well I have the KSE kernel (with a lot of changes needed for multiple threads per process) up into single user mode. But here I hit a roadblock. It's been visible for a while but I had hoped that by the time I got here, I'd have seen an "nice" solution. When we free process structures after an exit(), we do it from the 'wait()'. There will always be one wait() for each exit(), even if it has to be done by init. But now the tricky question: where can we safely free items owned by a thread when a thread exits (including the last one freed by exit()). Here is a qhick rundown on the problem. When a thread (in the kernel) exits, the last thing it does is call thread_exit(), which calls cpu_throw() which is an alias for the second half of switch(). 
(It doesn't bother to save the context of the dying thread, just loads a new one). Obviously however the resources of the thread cannot be freed until the new context has been loaded, as it's still running on the old one until then. In mi_switch AFTER the switch to the new thread has been completed, (which is one possibility) the sched lock is held, so freeing any resource that wants to own a normal sleeping mutex would seem to be an instant panic. ANother possibility is to simply queue it for later freeing by another entity, but I REALLY would rather avoid having a reaper thread (or init) do it. John Baldwin has been talking abotu the possibility of puting it in a function called switch_exit() that can be called after switch has completed. Looking at that idea I think it has basically the same problem. I can't see it being able to release the sched lock. As a case in point the thread has a pointer to the processes ucred and this needs to do a crfree(). This in turn tries to get the mutex in the ucred. The thread itself is allocated from a zone (or maybe some other system in the future) which will inevitably have a mutex. In addition there is the kv space allocated to the stack and pcb, which probably will also require some such mutex safety. The question is is there some place on the outgoing side of mi_switch where we can guarantee that we do not have the sched lock? it doesn't look like it to me, but I might be wrong. Does anyone have any real cool ideas? BTW current state if I can get around this is that I have individual scheduling for threads and there is no more code in the scheduler that assumes that there is only one thread per proc. (well, very little code...). I'm about ready to add the syscalls to allow system calls to become asynchronous and to allow the upcalls when that happens. The code to actually DO the upcalls is in place as is the code to copy out the results of the async syscalls. 
(of course it's all untested, and some of it looks a bit "green".) To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Jan 4 19: 8:32 2002 Delivered-To: freebsd-arch@freebsd.org Received: from elvis.mu.org (elvis.mu.org [216.33.66.196]) by hub.freebsd.org (Postfix) with ESMTP id B2E1D37B405 for ; Fri, 4 Jan 2002 19:08:29 -0800 (PST) Received: by elvis.mu.org (Postfix, from userid 1192) id 45B1A81E0F; Fri, 4 Jan 2002 21:08:29 -0600 (CST) Date: Fri, 4 Jan 2002 21:08:29 -0600 From: Alfred Perlstein To: Julian Elischer Cc: arch@freebsd.org Subject: Re: freeing thread structures. Message-ID: <20020104210829.T82406@elvis.mu.org> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: ; from julian@elischer.org on Fri, Jan 04, 2002 at 06:30:43PM -0800 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG * Julian Elischer [020104 20:57] wrote: > > In mi_switch AFTER the switch to the new thread has been completed, > (which is one possibility) the sched lock is held, so freeing any resource > that wants to own a normal sleeping mutex would seem to be an instant > panic. ANother possibility is to simply queue it for later freeing by > another entity, but I REALLY would rather avoid having a reaper thread (or > init) do it. Please just use a reaper thread for now, it will not stop you from optimizing it later. Haven't we gone over this? 
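One way to picture the "queue it for later freeing" option is a lock-free list of dead threads that a reaper drains later. The sketch below is an illustration in C11 and not kernel code: struct kthread, zombies, thread_defer_free and thread_reap are hypothetical names, free() stands in for crfree() and the zone allocator, and it assumes thread_defer_free() is only called once the CPU is already running on the new context, so the dead thread's stack is no longer in use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a thread structure awaiting cleanup. */
struct kthread {
    struct kthread *next;
    void *cred;             /* stands in for the ucred reference */
};

static _Atomic(struct kthread *) zombies;   /* starts out NULL */

/* Exit path: push without taking any lock, so it can run where a
 * sleeping mutex must not be acquired. */
static void thread_defer_free(struct kthread *td)
{
    struct kthread *head = atomic_load_explicit(&zombies, memory_order_relaxed);

    do {
        td->next = head;
    } while (!atomic_compare_exchange_weak_explicit(&zombies, &head, td,
        memory_order_release, memory_order_relaxed));
}

/* Reaper (or any context that may sleep): detach the whole list in one
 * exchange, then free each entry. Returns the number reaped. */
static int thread_reap(void)
{
    struct kthread *td = atomic_exchange_explicit(&zombies, NULL,
        memory_order_acquire);
    int n = 0;

    while (td != NULL) {
        struct kthread *next = td->next;

        free(td->cred);     /* where crfree() would go */
        free(td);
        td = next;
        n++;
    }
    return n;
}
```

Because the reaper takes the entire list at once instead of popping single entries, the push side needs nothing stronger than the compare-and-swap shown.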
-Alfred To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Fri Jan 4 22:50:57 2002 Delivered-To: freebsd-arch@freebsd.org Received: from midget.dons.net.au (daniel.lnk.telstra.net [139.130.137.70]) by hub.freebsd.org (Postfix) with ESMTP id A1F8637B41A; Fri, 4 Jan 2002 22:50:53 -0800 (PST) Received: from chowder.localdomain (root@localhost [127.0.0.1]) by midget.dons.net.au (8.11.6/8.11.6) with ESMTP id g056oiw49500; Sat, 5 Jan 2002 17:20:45 +1030 (CST) (envelope-from doconnor@gsoft.com.au) Message-ID: X-Mailer: XFMail 1.5.0 on FreeBSD X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: <20011231174113.O16101@elvis.mu.org> Date: Sat, 05 Jan 2002 17:20:44 +1030 (CST) From: "Daniel O'Connor" To: Alfred Perlstein Subject: RE: xfree4 by default? Cc: arch@freebsd.org, re@freebsd.org Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On 31-Dec-2001 Alfred Perlstein wrote: > What is the proceedure one must follow to present xfree4 as the new > default? It's been around for a long time and support a LOT more > chipsets a lot better. Can we go ahead and just pull some switch > or are there more sinister issues involved? What about the nice graphical configurator? I didn't think it worked with X4 (or rather, hadn't been updated yet). X -configure is nice but not terribly polished. --- Daniel O'Connor software and network engineer for Genesis Software - http://www.gsoft.com.au "The nice thing about standards is that there are so many of them to choose from." 
-- Andrew Tanenbaum To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Sat Jan 5 11:20:15 2002 Delivered-To: freebsd-arch@freebsd.org Received: from rwcrmhc51.attbi.com (rwcrmhc51.attbi.com [204.127.198.38]) by hub.freebsd.org (Postfix) with ESMTP id 0DEC737B416 for ; Sat, 5 Jan 2002 11:20:09 -0800 (PST) Received: from InterJet.elischer.org ([12.232.206.8]) by rwcrmhc51.attbi.com (InterMail vM.4.01.03.27 201-229-121-127-20010626) with ESMTP id <20020105192008.NPKN288.rwcrmhc51.attbi.com@InterJet.elischer.org>; Sat, 5 Jan 2002 19:20:08 +0000 Received: from localhost (localhost.elischer.org [127.0.0.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id LAA34122; Sat, 5 Jan 2002 11:17:59 -0800 (PST) Date: Sat, 5 Jan 2002 11:17:57 -0800 (PST) From: Julian Elischer To: Alfred Perlstein Cc: arch@freebsd.org Subject: Re: freeing thread structures. In-Reply-To: <20020104210829.T82406@elvis.mu.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Fri, 4 Jan 2002, Alfred Perlstein wrote: > * Julian Elischer [020104 20:57] wrote: > > > > In mi_switch AFTER the switch to the new thread has been completed, > > (which is one possibility) the sched lock is held, so freeing any resource > > that wants to own a normal sleeping mutex would seem to be an instant > > panic. ANother possibility is to simply queue it for later freeing by > > another entity, but I REALLY would rather avoid having a reaper thread (or > > init) do it. > > Please just use a reaper thread for now, it will not stop you from > optimizing it later. Haven't we gone over this? Yes but there must be a good synchronous way of doing this.. 
An async method will always lead to a lag in the number of threads actually available, as it will.... hmmmm I was going to say that the availability of threads will be limited because there will be actually free threads that have not yet been properly freed, by which I mean that they have not had their ucreds crfree()'d yet etc. However it just occurs to me that maybe if we ran short of threads, we are probably in a situation to free them and make them available to ourselves.. Possibly the answer is to have an async free-er PLUS the allocator can call the free-er if it looks like its pool is empty. Maybe, on the other hand, just the allocator and a few other critical points may be enough on their own.. Say, the allocator, the entry point for the userland thread_exit(), and maybe some other place that will happen even if we are not doing those two things.. Probably in the upcall code.. The ready-to-free thread structures don't have anything except for the ucred in them that really requires freeing before they are put into the thread cache, so it's not a lot of waste to let them gather up a bit... (They'd just be in the cache anyhow, unless it was full in which case the stack pages would be dissociated and freed) > > -Alfred > To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Sat Jan 5 11:53:26 2002 Delivered-To: freebsd-arch@freebsd.org Received: from elvis.mu.org (elvis.mu.org [216.33.66.196]) by hub.freebsd.org (Postfix) with ESMTP id B0C4937B419 for ; Sat, 5 Jan 2002 11:53:23 -0800 (PST) Received: by elvis.mu.org (Postfix, from userid 1192) id 2586681D01; Sat, 5 Jan 2002 13:53:18 -0600 (CST) Date: Sat, 5 Jan 2002 13:53:18 -0600 From: Alfred Perlstein To: Julian Elischer Cc: arch@freebsd.org Subject: Re: freeing thread structures.
Message-ID: <20020105135318.X82406@elvis.mu.org> References: <20020104210829.T82406@elvis.mu.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: ; from julian@elischer.org on Sat, Jan 05, 2002 at 11:17:57AM -0800 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG * Julian Elischer [020105 13:20] wrote: > Is there anything about a reaper thread that won't work? So far it seems as if it would work but you're afraid of the effeciency issues. Trust me there will be more than just effeciency issues to iron out, mostly bugs and the userland side retooling of libc_r. Holding off on these patches is doing more harm than good. -Alfred To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Sat Jan 5 12:48:55 2002 Delivered-To: freebsd-arch@freebsd.org Received: from snipe.prod.itd.earthlink.net (snipe.mail.pas.earthlink.net [207.217.120.62]) by hub.freebsd.org (Postfix) with ESMTP id F366037B432 for ; Sat, 5 Jan 2002 12:48:32 -0800 (PST) Received: from pool0211.cvx21-bradley.dialup.earthlink.net ([209.179.192.211] helo=mindspring.com) by snipe.prod.itd.earthlink.net with esmtp (Exim 3.33 #1) id 16Mxjt-00013A-00; Sat, 05 Jan 2002 12:48:25 -0800 Message-ID: <3C376690.6B14328@mindspring.com> Date: Sat, 05 Jan 2002 12:48:16 -0800 From: Terry Lambert X-Mailer: Mozilla 4.7 [en]C-CCK-MCD {Sony} (Win98; U) X-Accept-Language: en MIME-Version: 1.0 To: Julian Elischer Cc: Alfred Perlstein , arch@freebsd.org Subject: Re: freeing thread structures. 
References: Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG Julian Elischer wrote: > Yes but there must be a good synchronous way of doing this.. > An async method will always lead to a lag in teh number of threads > actually available, as it will.... hmmmm > > I was going to say that the availability of threads will be > limited because there will e actually free threads that have not yet been > properly freed, by which I mean that they have not had their ucred's > crfree()'d yet etc. However it just occurs to me that > maybe if we ran short of threads, we are probably in a situation to > free them and make them available to ourselves.. Uh, if the problem is creds, under what circumstances in POSIX is it permissable to have multiple threads in a process with different credentials? In other words, aren't you asking the wrong question here? I think the question is not "how can I free creds in thread_exit?", it's "why am I storing creds per thread in the first place?". > The ready-to-free thread structureas don't have anything except for > the ucred in them that really requires freeing before they are put into > the thread cache. so it's not a lot of waste to let them gather up a > bit... (They'd just be in the cache anyhow, unless it was full > in which case the stackpages would be dissociated and freed) Yes, exactly. And referencing the cred via the thread doesn't make sense anyway (IMO)... 
-- Terry To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Sat Jan 5 12:55:45 2002 Delivered-To: freebsd-arch@freebsd.org Received: from elvis.mu.org (elvis.mu.org [216.33.66.196]) by hub.freebsd.org (Postfix) with ESMTP id 8022C37B41A for ; Sat, 5 Jan 2002 12:55:40 -0800 (PST) Received: by elvis.mu.org (Postfix, from userid 1192) id 0825881D01; Sat, 5 Jan 2002 14:55:35 -0600 (CST) Date: Sat, 5 Jan 2002 14:55:35 -0600 From: Alfred Perlstein To: Terry Lambert Cc: arch@freebsd.org Subject: Re: freeing thread structures. Message-ID: <20020105145534.Y82406@elvis.mu.org> References: <3C376690.6B14328@mindspring.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5i In-Reply-To: <3C376690.6B14328@mindspring.com>; from tlambert2@mindspring.com on Sat, Jan 05, 2002 at 12:48:16PM -0800 Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG * Terry Lambert [020105 14:48] wrote: > Julian Elischer wrote: > > Yes but there must be a good synchronous way of doing this.. > > An async method will always lead to a lag in the number of threads > > actually available, as it will.... hmmmm > > > > I was going to say that the availability of threads will be > > limited because there will be actually free threads that have not yet been > > properly freed, by which I mean that they have not had their ucreds > > crfree()'d yet etc. However it just occurs to me that > > maybe if we ran short of threads, we are probably in a situation to > > free them and make them available to ourselves.. > > Uh, if the problem is creds, under what circumstances in POSIX > is it permissible to have multiple threads in a process with > different credentials? > > In other words, aren't you asking the wrong question here?
I > think the question is not "how can I free creds in thread_exit?", > it's "why am I storing creds per thread in the first place?". > > > The ready-to-free thread structures don't have anything except for > > the ucred in them that really requires freeing before they are put into > > the thread cache, so it's not a lot of waste to let them gather up a > > bit... (They'd just be in the cache anyhow, unless it was full > > in which case the stack pages would be dissociated and freed) > > Yes, exactly. And referencing the cred via the thread doesn't > make sense anyway (IMO)... Well... Actually I always thought it was suboptimal the way one had to setuid back and forth when running as another user. If one could "setthreaduid()", it would allow somewhat efficient multiplexing of separate credentials through an authentication server. Think of ftpd without needing to fork() to handle non-anonymous connections. I'm sure there's an even smarter way, but it's just something that I thought might be cool. -- -Alfred Perlstein [alfred@freebsd.org] 'Instead of asking why a piece of software is using "1970s technology," start asking why software is ignoring 30 years of accumulated wisdom.'
Tax deductible donations for FreeBSD: http://www.freebsdfoundation.org/ To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Sat Jan 5 18:21: 5 2002 Delivered-To: freebsd-arch@freebsd.org Received: from fledge.watson.org (fledge.watson.org [204.156.12.50]) by hub.freebsd.org (Postfix) with ESMTP id A745937B400 for ; Sat, 5 Jan 2002 18:21:02 -0800 (PST) Received: from fledge.watson.org (robert@fledge.pr.watson.org [192.0.2.3]) by fledge.watson.org (8.11.6/8.11.5) with SMTP id g062KqD85843; Sat, 5 Jan 2002 21:20:52 -0500 (EST) (envelope-from robert@fledge.watson.org) Date: Sat, 5 Jan 2002 21:20:52 -0500 (EST) From: Robert Watson X-Sender: robert@fledge.watson.org To: Alfred Perlstein Cc: Terry Lambert , arch@freebsd.org Subject: Re: freeing thread structures. In-Reply-To: <20020105145534.Y82406@elvis.mu.org> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG On Sat, 5 Jan 2002, Alfred Perlstein wrote: > * Terry Lambert [020105 14:48] wrote: > Actually I always thought it was suboptimal the way one had to setuid > back and forth when running as another user. If one could > "setthreaduid()", it would allow somewhat efficient multiplexing of > separate credentials through an authentication server. > > Think of ftpd without needing to fork() to handle non-anonymous > connections. > > I'm sure there's an even smarter way, but it's just something that > I thought might be cool. I used to think that idea was cool, and now I just think it's evil. The reason to have per-thread credentials in the kernel is to reduce the requirement for locking and to allow consistent use of credentials while in the kernel. Exposing multiple credentials in userland will introduce a world of unnecessary suffering.
And I think it's probably fairly straightforward to argue that POSIX doesn't like it either, since POSIX appears to consider credentials an attribute of a process, not a thread. The risks here are also great: consider cases where a process downgrades its credentials to act on behalf of a user: normally, by-process events such as signals are limited due to the use of the per-process P_SUGID flag. This prevents the downgrade of credentials from leaking privilege in undesirable ways. Finally, look at the case where multiple credentials are supported: if you have any model where kernel schedulable entities don't map into user threads, your credential state now becomes part of the context you have to save/restore at each userland context switch. Also, threaded applications would behave differently using different threading libraries (userland vs. kse ...). Now suppose we implement AIO by using additional threads in userland... much the same will apply. This type of change will substantially change the model, quite possibly violate POSIX, not to mention user expectations, and make it *much* harder to implement things like Mandatory Access Control. Count my vote as a firm vote not to walk down this path of great evil. :-) It's cool, and you could do cute things like neato userland NFS servers, but the opportunities for suffering are excessive.
Robert N M Watson FreeBSD Core Team, TrustedBSD Project robert@fledge.watson.org NAI Labs, Safeport Network Services To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Sat Jan 5 20:40:12 2002 Delivered-To: freebsd-arch@freebsd.org Received: from rwcrmhc51.attbi.com (rwcrmhc51.attbi.com [204.127.198.38]) by hub.freebsd.org (Postfix) with ESMTP id 9738237B419 for ; Sat, 5 Jan 2002 20:40:08 -0800 (PST) Received: from InterJet.elischer.org ([12.232.206.8]) by rwcrmhc51.attbi.com (InterMail vM.4.01.03.27 201-229-121-127-20010626) with ESMTP id <20020106044008.KTH288.rwcrmhc51.attbi.com@InterJet.elischer.org>; Sun, 6 Jan 2002 04:40:08 +0000 Received: from localhost (localhost.elischer.org [127.0.0.1]) by InterJet.elischer.org (8.9.1a/8.9.1) with ESMTP id UAA35866; Sat, 5 Jan 2002 20:30:21 -0800 (PST) Date: Sat, 5 Jan 2002 20:30:20 -0800 (PST) From: Julian Elischer To: Terry Lambert Cc: Alfred Perlstein , arch@freebsd.org Subject: Re: freeing thread structures. In-Reply-To: <3C376690.6B14328@mindspring.com> Message-ID: MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe: List-Unsubscribe: X-Loop: FreeBSD.ORG You are "partially" confused. On Sat, 5 Jan 2002, Terry Lambert wrote: > Julian Elischer wrote: > > Yes but there must be a good synchronous way of doing this.. > > An async method will always lead to a lag in the number of threads > > actually available, as it will.... hmmmm > > > > I was going to say that the availability of threads will be > > limited because there will be actually free threads that have not yet been > > properly freed, by which I mean that they have not had their ucreds > > crfree()'d yet etc.
However it just occurs to me that > > maybe if we ran short of threads, we are probably in a situation to > > free them and make them available to ourselves.. > > Uh, if the problem is creds, under what circumstances in POSIX > is it permissible to have multiple threads in a process with > different credentials? We don't do this; see later. > > In other words, aren't you asking the wrong question here? I > think the question is not "how can I free creds in thread_exit?", > it's "why am I storing creds per thread in the first place?". See later. > > > The ready-to-free thread structures don't have anything except for > > the ucred in them that really requires freeing before they are put into > > the thread cache, so it's not a lot of waste to let them gather up a > > bit... (They'd just be in the cache anyhow, unless it was full > > in which case the stack pages would be dissociated and freed) > > Yes, exactly. And referencing the cred via the thread doesn't > make sense anyway (IMO)... Remember that threads are short lived in the kernel. They exist only as long as a user thread 'dips' into the kernel, and are reaped when control passes back into userland. This is because a single userland may spawn a virtually unlimited number of syscalls, as they are now asynchronous. Now, if they all use the same credential (which is shared between processes that have not moved to change them, with a lazily evaluated copy-on-write semantic), then if one syscall changes the cred at exactly the instant that another uses it, the second thread can reference an inconsistent cred, leaving room in the future for some sort of hack. In the scheme we have now, each thread, as it is assigned to a process, takes a reference to the process's ucred as it starts. That ucred will never change. When it is reaped, the reference is dropped again.
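As an illustration only, the scheme this message describes (take a reference on entry, never modify a published ucred, replace it rather than edit it, drop the reference when the thread is reaped) might be sketched in userland C roughly as follows. This is not kernel code: the struct layouts and the names cred_new, cred_hold, cred_drop, thread_enter, thread_reap and proc_change_uid are hypothetical, and the locking a real kernel would need around the process's cred pointer is assumed away.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct ucred {
    atomic_int cr_ref;  /* reference count */
    uid_t      cr_uid;  /* never modified once the cred is in use */
};
struct proc   { struct ucred *p_ucred; };
struct thread { struct ucred *td_ucred; };

/* Allocate a cred with a reference count of 1. */
static struct ucred *cred_new(uid_t uid)
{
    struct ucred *cr = malloc(sizeof(*cr));
    assert(cr != NULL);
    atomic_init(&cr->cr_ref, 1);
    cr->cr_uid = uid;
    return cr;
}

static void cred_hold(struct ucred *cr)
{
    atomic_fetch_add(&cr->cr_ref, 1);
}

/* Drop one reference; returns 1 if it was the last and cr was freed. */
static int cred_drop(struct ucred *cr)
{
    if (atomic_fetch_sub(&cr->cr_ref, 1) == 1) {
        free(cr);
        return 1;
    }
    return 0;
}

/* Thread starts its dip into the kernel: snapshot the process cred. */
static void thread_enter(struct thread *td, struct proc *p)
{
    cred_hold(p->p_ucred);
    td->td_ucred = p->p_ucred;
}

/* Thread is reaped: give the snapshot back. */
static int thread_reap(struct thread *td)
{
    int freed = cred_drop(td->td_ucred);
    td->td_ucred = NULL;
    return freed;
}

/* Changing the process cred installs a NEW cred; the old one is
 * left untouched for any thread that still holds a reference. */
static void proc_change_uid(struct proc *p, uid_t uid)
{
    struct ucred *old = p->p_ucred;
    p->p_ucred = cred_new(uid);
    cred_drop(old);
}
```

A thread that entered before proc_change_uid() keeps reading the old cred without any lock, and the old cred is freed only when the last such thread is reaped.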
If in the meanwhile another thread has applied to change the process's creds, then it will allocate a NEW ucred with a reference count of 1, unreference the old one and substitute in the new one. All threads continue to work with the ucred that was in effect when they started their dip into the kernel. When they complete they return control to the userland scheduler and do a thread_exit(), which will decrement the ucred reference. If the process changed ucreds in the meanwhile, it is conceivable that the reference count may go to 0 and the ucred be freed. The rule is that a ucred is read-only, and thus requires no lock to access it from any thread that has it. It is the credential that was in effect when the syscall was started. There is NO syscall to change the ucred of a running thread, so Alfred's idea is not a goer at this time :-) > > -- Terry > > To Unsubscribe: send mail to majordomo@FreeBSD.org > with "unsubscribe freebsd-arch" in the body of the message > To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message From owner-freebsd-arch Sat Jan 5 21:36:39 2002 Delivered-To: freebsd-arch@freebsd.org Received: from pcnet1.pcnet.com (pcnet1.pcnet.com [204.213.232.3]) by hub.freebsd.org (Postfix) with ESMTP id D494837B402 for ; Sat, 5 Jan 2002 21:36:37 -0800 (PST) Received: from vigrid.com (pm3-pt5.pcnet.net [206.105.29.79]) by pcnet1.pcnet.com (8.12.1/8.12.1) with ESMTP id g065ZUrU017702; Sun, 6 Jan 2002 00:35:31 -0500 (EST) Message-ID: <3C37E559.B011DF29@vigrid.com> Date: Sun, 06 Jan 2002 00:49:13 -0500 From: Dan Eischen X-Mailer: Mozilla 4.74 [en] (X11; U; FreeBSD 5.0-CURRENT i386) X-Accept-Language: en MIME-Version: 1.0 To: arch@freebsd.org Subject: Request for review: getcontext, setcontext, etc Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Sender: owner-freebsd-arch@FreeBSD.ORG Precedence: bulk List-ID: List-Archive: (Web Archive) List-Help: (List Instructions) List-Subscribe:
List-Unsubscribe: X-Loop: FreeBSD.ORG I've got getcontext, setcontext, makecontext, and swapcontext implemented and would like to add them to libc. I've made the patch available at: http://people.freebsd.org/~deischen/ucontext/uc-libc-sys.diffs These are library level versions of these functions. Solaris has getcontext and setcontext as system calls. My intent is to replace libc_r's use of setjmp/longjmp and jmp_buf munging with these functions, so I don't want them as system calls. I also want to use them with the KSE-enabled threads library. Is there a reason that getcontext and setcontext need to be system calls? For those not familiar with these functions, see: http://www.opengroup.org/onlinepubs/007908799/xsh/ucontext.h.html -- Dan Eischen To Unsubscribe: send mail to majordomo@FreeBSD.org with "unsubscribe freebsd-arch" in the body of the message
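For readers not familiar with the context functions named in Dan Eischen's request for review, the following is a minimal sketch of how the standard <ucontext.h> interface is typically used to switch between two flows of control, which is the kind of job a userland threads library would use it for. It shows only the documented API, not the contents of the patch; the names worker, run_demo and trace, and the 64 KB stack size, are assumptions made for the example.

```c
#include <assert.h>
#include <ucontext.h>

static ucontext_t main_ctx, worker_ctx;
static char worker_stack[64 * 1024];   /* stack for the second context */
static int trace[4];                   /* records the order of execution */
static int ntrace;

static void worker(void)
{
    trace[ntrace++] = 1;
    /* Yield: save our state in worker_ctx, resume main_ctx. */
    swapcontext(&worker_ctx, &main_ctx);
    trace[ntrace++] = 3;
    /* Returning resumes the context named by uc_link (main_ctx). */
}

static int run_demo(void)
{
    ntrace = 0;

    /* Initialise worker_ctx, then point it at its own stack and entry. */
    int rc = getcontext(&worker_ctx);
    assert(rc == 0);
    worker_ctx.uc_stack.ss_sp = worker_stack;
    worker_ctx.uc_stack.ss_size = sizeof(worker_stack);
    worker_ctx.uc_link = &main_ctx;
    makecontext(&worker_ctx, worker, 0);

    swapcontext(&main_ctx, &worker_ctx);  /* run worker until it yields */
    trace[ntrace++] = 2;
    swapcontext(&main_ctx, &worker_ctx);  /* resume worker until it returns */
    trace[ntrace++] = 4;
    return ntrace;
}
```

Calling run_demo() switches into worker() and back twice, so the two flows interleave without setjmp/longjmp or any jmp_buf manipulation.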