From owner-freebsd-arch@FreeBSD.ORG Mon Mar 12 16:16:24 2007 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 54AA916A40B for ; Mon, 12 Mar 2007 16:16:24 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from server.baldwin.cx (66-23-211-162.clients.speedfactory.net [66.23.211.162]) by mx1.freebsd.org (Postfix) with ESMTP id E095213C448 for ; Mon, 12 Mar 2007 16:16:23 +0000 (UTC) (envelope-from jhb@freebsd.org) Received: from mutex.atlanta.corp.yahoo.com (nat-outside.atlanta.corp.yahoo.com [63.172.193.57]) (authenticated bits=0) by server.baldwin.cx (8.13.8/8.13.8) with ESMTP id l2CGG494062817; Mon, 12 Mar 2007 11:16:05 -0500 (EST) (envelope-from jhb@freebsd.org) From: John Baldwin To: LI Xin Date: Mon, 12 Mar 2007 10:17:25 -0400 User-Agent: KMail/1.9.1 References: <45F2C2CB.5000204@delphij.net> In-Reply-To: <45F2C2CB.5000204@delphij.net> MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-15" Content-Transfer-Encoding: 7bit Content-Disposition: inline Message-Id: <200703121017.25782.jhb@freebsd.org> X-Greylist: Sender succeeded SMTP AUTH authentication, not delayed by milter-greylist-2.0.2 (server.baldwin.cx [66.23.211.162]); Mon, 12 Mar 2007 11:16:05 -0500 (EST) X-Virus-Scanned: ClamAV 0.88.3/2823/Mon Mar 12 04:55:20 2007 on server.baldwin.cx X-Virus-Status: Clean X-Spam-Status: No, score=-3.5 required=4.2 tests=AWL,BAYES_00 autolearn=ham version=3.1.3 X-Spam-Checker-Version: SpamAssassin 3.1.3 (2006-06-01) on server.baldwin.cx Cc: MingyanGuo , freebsd-arch@freebsd.org Subject: Re: locking reasoning within fork1() X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 12 Mar 2007 16:16:24 -0000 On Saturday 10 March 2007 09:38, LI Xin wrote: > Hi, > > During the AsiaBSDCon DevSummit we have go through the current KSE and > some userland threading code, and I think that brings me back to the > fork1() vs others races. > > The current logic, especially the locking order found in fork1() looks > not very ideal according to my read. I have pursued some code from > other BSDs, and I think we might want to address the following problems: > > - At which point we should consider that a process really exists? At > this point, there is no clear point that we can call a process as > "really born". It looks to me that PRS_NEW just indicate that a process > is not "fully initialized", but it does not provide information about > "how much initialization did we done". This would make several > operation very questionable, and is more error-prone. As Guo (cc'ed) > pointed out, there are chances that kill(0, ..) and kill(-1, ..) would > not cover PRS_NEW processes, there might be also some other places where > should take care of. This is why I had advocated using a sleep so that consumers either ignore PRS_NEW processes or wait until they are completely initalized and PRS_NORMAL. > - The locking scheme does not look pretty. We grab and release locks > again and again, and it might be more optimal to collapse some work > together, and re-consider synchornization with other parts of the kernel. To a large extent this reorganization has already been done where possible. > - Certain parts of struct proc is mostly not accessed frequently. For > the sake of better exploit of cache, we may want to consider to move > certain parts out from the struct. You mean to the bottom of the struct maybe? I'm not sure the overhead of having separately allocated structures and extra pointer indirections will do anything but hurt. > - The PID allocation is somewhat expensive when there are a lot of > processes. This might not be a very big deal, though, but given that it > requires to hold a sx_xlock, our scalability could be limited due to > this. tjr@ has a proposed hash based PID allocation patch in his p4 > branch, and NetBSD have an O(1) algorithm that may worth to have a look at. This has been brought up before, and when tjr's stuff was tested it didn't help IIRC. Part of the issue here is that pid space is not just a simple walk of processes, but also of process groups and sessions. If you didn't want to walk all the data structures you'd have to have some sort of PID reference counting for the 3 possible references on a pid: process, pgrp, and session. -- John Baldwin From owner-freebsd-arch@FreeBSD.ORG Tue Mar 13 09:01:56 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id DE10316A405 for ; Tue, 13 Mar 2007 09:01:56 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.freebsd.org (Postfix) with ESMTP id AB8C813C4AE for ; Tue, 13 Mar 2007 09:01:56 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (critter.freebsd.dk [192.168.48.2]) by phk.freebsd.dk (Postfix) with ESMTP id 2796D170DC for ; Tue, 13 Mar 2007 09:01:55 +0000 (UTC) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.13.8/8.13.8) with ESMTP id l2D91sBT039969 for ; Tue, 13 Mar 2007 09:01:54 GMT (envelope-from phk@critter.freebsd.dk) To: arch@freebsd.org From: Poul-Henning Kamp Date: Tue, 13 Mar 2007 09:01:54 +0000 Message-ID: <39968.1173776514@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: Subject: bikeshed proposal X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Mar 2007 09:01:56 -0000 It has always bothered me that some of the TAILQ macros need to know the struct name of the header type. Many times where I could have avoided naming that structure, I had to, only to satisfy this weird craving of . needs the struct name TAILQ_LAST and TAILQ_PREV casts, what is potentially a pointer to an entry, to the header type, before it dereferences the backwards pointer. The validity of this cast depends critically on the layout of the entry struct and the header struct being identical with respect to the two pointers. So, if we have already assumed, that they have the same layout, why do the cast in the first place ? The following proof of concept patch shows, that you can implement these two macros without the cast to the header struct type. If we did this, we could eliminate the "headname" argument to TAILQ_FOREACH_REVERSE TAILQ_FOREACH_REVERS_SAFE TAILQ_LAST TAILQ_PREV Obviously this is bikeshed fodder, but given how big a help is programming-wise, and given that those four macros are comparatively seldomly used, I will propose to remove this wart from under the banner of computer science in general and suffer the minor backwards compatibility issues it will cause. Poul-Henning Index: queue.h =================================================================== RCS file: /home/ncvs/src/sys/sys/queue.h,v retrieving revision 1.68 diff -u -r1.68 queue.h --- queue.h 24 Oct 2006 11:20:29 -0000 1.68 +++ queue.h 13 Mar 2007 08:51:51 -0000 @@ -546,12 +546,12 @@ } while (0) #define TAILQ_LAST(head, headname) \ - (*(((struct headname *)((head)->tqh_last))->tqh_last)) + (*((head)->tqh_last->tqe_prev) #define TAILQ_NEXT(elm, field) ((elm)->field.tqe_next) #define TAILQ_PREV(elm, headname, field) \ - (*(((struct headname *)((elm)->field.tqe_prev))->tqh_last)) + (*((elm)->field.tqe_prev->tqe_prev)) #define TAILQ_REMOVE(head, elm, field) do { \ QMD_TAILQ_CHECK_NEXT(elm, field); \ -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From owner-freebsd-arch@FreeBSD.ORG Tue Mar 13 09:38:08 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 002F316A408 for ; Tue, 13 Mar 2007 09:38:07 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.freebsd.org (Postfix) with ESMTP id C0C7613C4B0 for ; Tue, 13 Mar 2007 09:38:07 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (critter.freebsd.dk [192.168.48.2]) by phk.freebsd.dk (Postfix) with ESMTP id 4DB5A170DC for ; Tue, 13 Mar 2007 09:38:06 +0000 (UTC) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.13.8/8.13.8) with ESMTP id l2D9c5Wv040442 for ; Tue, 13 Mar 2007 09:38:05 GMT (envelope-from phk@critter.freebsd.dk) From: "Poul-Henning Kamp" In-Reply-To: Your message of "Tue, 13 Mar 2007 09:01:54 GMT." <39968.1173776514@critter.freebsd.dk> Date: Tue, 13 Mar 2007 09:38:05 +0000 Message-ID: <40441.1173778685@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: arch@freebsd.org Subject: Re: bikeshed proposal X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Mar 2007 09:38:08 -0000 In message <39968.1173776514@critter.freebsd.dk>, Poul-Henning Kamp writes: > >It has always bothered me that some of the TAILQ macros need to >know the struct name of the header type. And it still does, my patch was bogus, I was sleepy, forget it. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From owner-freebsd-arch@FreeBSD.ORG Tue Mar 13 09:45:35 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 0642C16A400 for ; Tue, 13 Mar 2007 09:45:35 +0000 (UTC) (envelope-from des@des.no) Received: from tim.des.no (tim.des.no [194.63.250.121]) by mx1.freebsd.org (Postfix) with ESMTP id C4B3B13C455 for ; Tue, 13 Mar 2007 09:45:34 +0000 (UTC) (envelope-from des@des.no) Received: from tim.des.no (localhost [127.0.0.1]) by spam.des.no (Postfix) with ESMTP id BD1472085; Tue, 13 Mar 2007 10:21:26 +0100 (CET) X-Spam-Tests: AWL X-Spam-Learn: disabled X-Spam-Score: 0.0/3.0 X-Spam-Checker-Version: SpamAssassin 3.1.7 (2006-10-05) on tim.des.no Received: from dwp.des.no (des.no [80.203.243.180]) by tim.des.no (Postfix) with ESMTP id AC46A207E; Tue, 13 Mar 2007 10:21:26 +0100 (CET) Received: by dwp.des.no (Postfix, from userid 1001) id 25376B88E; Tue, 13 Mar 2007 10:21:26 +0100 (CET) From: des@des.no (Dag-Erling =?iso-8859-1?Q?Sm=F8rgrav?=) To: Poul-Henning Kamp References: <39968.1173776514@critter.freebsd.dk> Date: Tue, 13 Mar 2007 10:21:26 +0100 In-Reply-To: <39968.1173776514@critter.freebsd.dk> (Poul-Henning Kamp's message of "Tue, 13 Mar 2007 09:01:54 +0000") Message-ID: <864pop8pnt.fsf@dwp.des.no> User-Agent: Gnus/5.110006 (No Gnus v0.6) Emacs/21.3 (berkeley-unix) MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-1 Content-Transfer-Encoding: quoted-printable Cc: arch@freebsd.org Subject: Re: bikeshed proposal X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Mar 2007 09:45:35 -0000 Poul-Henning Kamp writes: > It has always bothered me that some of the TAILQ macros need to > know the struct name of the header type. > [...] > Obviously this is bikeshed fodder, but given how big a help > is programming-wise, and given that those four macros are comparatively > seldomly used, I will propose to remove this wart from > under the banner of computer science in general and suffer the minor > backwards compatibility issues it will cause. The backwards compatibility issues may be minor, but the compatibility-with-other-BSDs issues aren't... DES --=20 Dag-Erling Sm=F8rgrav - des@des.no From owner-freebsd-arch@FreeBSD.ORG Tue Mar 13 10:01:46 2007 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8AFA716A401 for ; Tue, 13 Mar 2007 10:01:46 +0000 (UTC) (envelope-from xdivac02@stud.fit.vutbr.cz) Received: from eva.fit.vutbr.cz (eva.fit.vutbr.cz [147.229.176.14]) by mx1.freebsd.org (Postfix) with ESMTP id 2715E13C45A for ; Tue, 13 Mar 2007 10:01:45 +0000 (UTC) (envelope-from xdivac02@stud.fit.vutbr.cz) Received: from eva.fit.vutbr.cz (localhost [127.0.0.1]) by eva.fit.vutbr.cz (envelope-from xdivac02@eva.fit.vutbr.cz) (8.13.8/8.13.7) with ESMTP id l2D9j5n1041641 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Tue, 13 Mar 2007 10:45:05 +0100 (CET) Received: (from xdivac02@localhost) by eva.fit.vutbr.cz (8.13.8/8.13.3/Submit) id l2D9j5be041639; Tue, 13 Mar 2007 10:45:05 +0100 (CET) Date: Tue, 13 Mar 2007 10:45:05 +0100 From: Divacky Roman To: LI Xin Message-ID: <20070313094505.GA40816@stud.fit.vutbr.cz> References: <45F2C2CB.5000204@delphij.net> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <45F2C2CB.5000204@delphij.net> User-Agent: Mutt/1.4.2.2i X-Scanned-By: MIMEDefang 2.57 on 147.229.176.14 Cc: MingyanGuo , freebsd-arch@freebsd.org Subject: Re: locking reasoning within fork1() X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Mar 2007 10:01:46 -0000 > - The PID allocation is somewhat expensive when there are a lot of > processes. This might not be a very big deal, though, but given that it > requires to hold a sx_xlock, our scalability could be limited due to > this. tjr@ has a proposed hash based PID allocation patch in his p4 > branch, and NetBSD have an O(1) algorithm that may worth to have a look at. beside that.. what is RFHIGHPID good for? Its supposed to allocate pid > 10 which comment says its used during boot. But it only seems to be used for some kthreads (not all). No other BSD seems to use this... can we safely remove that? also.. would __predict_true/__predict_false usage help anything? netbsd seems to use it in their fork1() routine. especially the pid allocation could benefit from this.. comments? roman From owner-freebsd-arch@FreeBSD.ORG Tue Mar 13 18:52:20 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 75AEA16A403 for ; Tue, 13 Mar 2007 18:52:20 +0000 (UTC) (envelope-from julian@elischer.org) Received: from outT.internet-mail-service.net (outT.internet-mail-service.net [216.240.47.243]) by mx1.freebsd.org (Postfix) with ESMTP id 4D5DC13C46A for ; Tue, 13 Mar 2007 18:52:20 +0000 (UTC) (envelope-from julian@elischer.org) Received: from mx0.idiom.com (HELO idiom.com) (216.240.32.160) by out.internet-mail-service.net (qpsmtpd/0.32) with ESMTP; Tue, 13 Mar 2007 11:25:20 -0700 Received: from [192.168.2.5] (home.elischer.org [216.240.48.38]) by idiom.com (Postfix) with ESMTP id 7683C1260C2; Tue, 13 Mar 2007 11:43:05 -0700 (PDT) Message-ID: <45F6F0B9.8080702@elischer.org> Date: Tue, 13 Mar 2007 11:43:05 -0700 From: Julian Elischer User-Agent: Thunderbird 1.5.0.10 (Macintosh/20070221) MIME-Version: 1.0 To: =?ISO-8859-1?Q?Dag-Erling_Sm=F8rgrav?= References: <39968.1173776514@critter.freebsd.dk> <864pop8pnt.fsf@dwp.des.no> In-Reply-To: <864pop8pnt.fsf@dwp.des.no> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 8bit Cc: arch@freebsd.org, Poul-Henning Kamp Subject: Re: bikeshed proposal X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 13 Mar 2007 18:52:20 -0000 Dag-Erling Smørgrav wrote: > Poul-Henning Kamp writes: >> It has always bothered me that some of the TAILQ macros need to >> know the struct name of the header type. >> [...] >> Obviously this is bikeshed fodder, but given how big a help >> is programming-wise, and given that those four macros are comparatively >> seldomly used, I will propose to remove this wart from >> under the banner of computer science in general and suffer the minor >> backwards compatibility issues it will cause. > > The backwards compatibility issues may be minor, but the > compatibility-with-other-BSDs issues aren't... > > DES please discuss this with Kirk as my memory is that he had a hand in the original implementation (or at least knew the story). From owner-freebsd-arch@FreeBSD.ORG Thu Mar 15 11:54:23 2007 Return-Path: X-Original-To: freebsd-arch@FreeBSD.org Delivered-To: freebsd-arch@FreeBSD.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 1379B16A405 for ; Thu, 15 Mar 2007 11:54:23 +0000 (UTC) (envelope-from yar@comp.chem.msu.su) Received: from comp.chem.msu.su (comp.chem.msu.su [158.250.32.97]) by mx1.freebsd.org (Postfix) with ESMTP id 925F513C459 for ; Thu, 15 Mar 2007 11:54:19 +0000 (UTC) (envelope-from yar@comp.chem.msu.su) Received: from comp.chem.msu.su (localhost [127.0.0.1]) by comp.chem.msu.su (8.13.4/8.13.4) with ESMTP id l2FBE1Ps030670; Thu, 15 Mar 2007 14:14:01 +0300 (MSK) (envelope-from yar@comp.chem.msu.su) Received: (from yar@localhost) by comp.chem.msu.su (8.13.4/8.13.4/Submit) id l2FBE11w030669; Thu, 15 Mar 2007 14:14:01 +0300 (MSK) (envelope-from yar) Date: Thu, 15 Mar 2007 14:14:00 +0300 From: Yar Tikhiy To: "Bruce M. Simpson" Message-ID: <20070315111400.GC28354@comp.chem.msu.su> Mail-Followup-To: Yar Tikhiy , "Bruce M. Simpson" , freebsd-arch@FreeBSD.org References: <20070314102023.GB1766@comp.chem.msu.su> <45F7EF84.5070700@FreeBSD.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <45F7EF84.5070700@FreeBSD.org> User-Agent: Mutt/1.5.9i Cc: freebsd-net@FreeBSD.org, freebsd-arch@FreeBSD.org Subject: Re: Generic ioctl and ether_ioctl don't agree X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2007 11:54:23 -0000 On Wed, Mar 14, 2007 at 12:50:12PM +0000, Bruce M. Simpson wrote: > Yar Tikhiy wrote: > >Hi folks, > > > >Quite a while ago I noticed that our ioctl handlers get the ioctl > >command via u_long, but ether_ioctl()'s command argument is int. > >This disarray dates back to 1998, when ioctl functions started to > >take u_long as the command, but ether_ioctl() was never fixed. > >Fortunately, our ioctl command coding still fits in 32 bits, or > >else we would've got problems on 64-bit arch'es already. I'd like > >to fix this long-standing bug some day after RELENG_7 is branched. > >Of course, this will break ABI to network modules on all 64-bit > >arch'es. BTW, the same applies to other L2 layers, such as firewire, > >which seems to have been cloned from if_ethersubr.c. > > > This is one of those annoying things which breaks compatibility with > external modules. > > I'm not sure about this, though. I was getting sign extension warnings > on amd64 last week when I was testing the IGMPv3 aware mtest(8). Perhaps > if we're fixing these ABIs, we should commit to an explicit C99 type > with known bit width, i.e. uint32_t. > > I would be much happier if we began using C99 types in the code. This is a point to discuss in -arch as it's closely related to the generic ioctl interface. Let's move this thread to -arch. It's been a vague issue to me whether to use a fixed-width type or a basic type in particular kernel code. Of course, it's better to use a fixed-width type when it comes to network packets or hardware registers. OTOH, errno is int. But not all cases are that simple. Do we have a guideline for that? -- Yar From owner-freebsd-arch@FreeBSD.ORG Thu Mar 15 12:12:13 2007 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 76E6416A402; Thu, 15 Mar 2007 12:12:13 +0000 (UTC) (envelope-from dds@aueb.gr) Received: from mx-out.forthnet.gr (mx-out.forthnet.gr [193.92.150.103]) by mx1.freebsd.org (Postfix) with ESMTP id E6F5213C44B; Thu, 15 Mar 2007 12:12:12 +0000 (UTC) (envelope-from dds@aueb.gr) Received: from mx-av-03.forthnet.gr (mx-av.forthnet.gr [193.92.150.27]) by mx-out-02.forthnet.gr (8.14.0/8.14.0) with ESMTP id l2F8gYKC012269; Thu, 15 Mar 2007 10:42:34 +0200 Received: from MX-IN-01.forthnet.gr (mx-in-01.forthnet.gr [193.92.150.23]) by mx-av-03.forthnet.gr (8.13.7/8.13.7) with ESMTP id l2F8gUEr031596; Thu, 15 Mar 2007 10:42:34 +0200 Received: from [192.168.136.22] (ppp97-184.adsl.forthnet.gr [212.54.207.184]) by MX-IN-01.forthnet.gr (8.14.0/8.14.0) with ESMTP id l2F8gPi8003629; Thu, 15 Mar 2007 10:42:25 +0200 Authentication-Results: MX-IN-01.forthnet.gr from=dds@aueb.gr; sender-id=neutral; spf=neutral Message-ID: <45F906ED.8070100@aueb.gr> Date: Thu, 15 Mar 2007 10:42:21 +0200 From: Diomidis Spinellis User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.9) Gecko/20061211 SeaMonkey/1.0.7 MIME-Version: 1.0 To: freebsd-arch@freebsd.org, freebsd-threads@freebsd.org Content-Type: text/plain; charset=ISO-8859-7; format=flowed Content-Transfer-Encoding: 7bit Cc: Subject: Multithreaded qsort(3) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2007 12:12:13 -0000 Greetings, I've added multhread support to our qsort implementation. You can see the code at and some performance figures at the end of this email. I propose to add it to our libc. I need your opinions on the interface, and also any comments you may have on the code. I see the following design dimensions: 1. Naming and availability -------------------------- - Option 1.1: Replace qsort with this implementation * Pros: Programs gain performance without any source code change; abstraction (number of processors/cores should be invisible, just as the CPU architecture or disk drive interface) * Cons: May confuse multithreaded programs; programs require linking with pthreads library; programs whose compare function is not thread safe will break - Option 1.2: Name it qsort_mt * Pros: POLA; programs can tune the call according to their need * Cons: Programs require source code changes; we violate abstraction; namespace polution My proposal: option 1.2: name it qsort_mt and make it available in a separate library (libc_mt). 2. Interface ------------ - Option 2.1: qsort compatible * Pros: Drop in replacement; doesn't need programmer understanding; compatible with option 1.1 * Cons: Can't be tuned for a specific program or workload - Option 2.2: Add nthreads parameter, specifying the maximum number of threads * Pros: Multithreaded programs can tune load balancing; all programs can optimise nthreads according to the workload characteristics. * Cons: Libc routines are generaly expected to Do he Right Thing, without handholding; must agree on mechanism for determining the number of threads; the benefits of tuning are unlikely to be dramatic. - Option 2.3: Add forkelem, specifying minimum number of elements for a new thread (below forkelem, I use single-threaded qsort). * Pros: programs can optimise nthreads according to the workload characteristics. * Cons: Libc routines are generaly expected to Do he Right Thing, without handholding; the benefits of tuning are unlikely to be dramatic. - Option 2.4: Have the library tune itsself for a given system or individual programs by measuring different values * Pros: higher performance when the tuned values are reused * Cons: Lack of prior art; performance drop when determining tuning values; security considerations if values are cached; complexity (Options 2.2 and 2.3 are orthogonal; options 2.1 and 2.4 are orthogonal) My proposal: go for 2.1, based on the way C library routines work. 3. Determining the number of threads ------------------------------------ - Option 3.1: NPROCS environment variable * Pros: prior art: CrayOS, BSD/OS; users can tune it. * Cons: May be too blunt; requires setting by user or administrator - Option 3.[234]: hw.ncpu or kern.smp.cpus sysctl, or pmc_ncpu() * Pros: automatic, right for the underlying hardware * Cons: even blunter - Option 3.5: Add library interface for the above as int nthreads_mt() * Pros: we abstract policy; programs can replace it; reusability; extensibility. * Cons: namespace polution; it is not clear that a single number is correct for all uses of a program. (All the above options are orthogonal) My proposal: Add library interface using NPROCS and falling back to a sysctl variable. This interface can be extended in the future. Performance figures ------------------- Reported times are elapsed, user, and system seconds. $ uname -a FreeBSD sledge.freebsd.org 7.0-CURRENT FreeBSD 7.0-CURRENT #924: Wed Mar 14 14:25:07 UTC 2007 root@sledge.freebsd.org:/h/src/sys/amd64/compile/SLEDGE amd64 $ qsort_mt -? qsort_mt: option requires an argument -- h usage: qsort_mt [-stv] [-f forkelements] [-h threads] [-n elements] -l Run the libc version of qsort -s Test with 20-byte strings, instead of integers -t Print timing results -v Verify the integer results Defaults are 1e7 elements, 2 threads, 100 fork elements $ gcc -DTEST -O qsort_mt.c -o qsort_mt -lthr -lpmc $ qsort_mt -t -h 2 -n 100000000 -l # Integers; qsort(3) 46.5 46.2 0.314 $ ./qsort_mt -t -h 2 -n 100000000 # Integers; qsort_mt 27.5 46.3 0.301 $ qsort_mt -t -h 2 -n 10000000 -s -l # Strings; qsort(3) 41.3 40.1 1.2 $ ./qsort_mt -t -h 2 -n 10000000 -s # Strings; qsort_mt 26.7 40.1 1.16 $ gcc -DTEST -O qsort_mt.c -o qsort_mt -lpthread -lpmc $ qsort_mt -t -h 2 -n 100000000 # Integers; qsort_mt 28 47.1 0.368 $ qsort_mt -t -h 2 -n 10000000 -s # Strings; qsort_mt 27.1 40.6 1.06 Thanks, Diomidis - dds@ http://www.spinellis.gr From owner-freebsd-arch@FreeBSD.ORG Thu Mar 15 12:26:32 2007 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id CF19C16A402 for ; Thu, 15 Mar 2007 12:26:32 +0000 (UTC) (envelope-from freebsd-arch@m.gmane.org) Received: from ciao.gmane.org (main.gmane.org [80.91.229.2]) by mx1.freebsd.org (Postfix) with ESMTP id 8E30E13C459 for ; Thu, 15 Mar 2007 12:26:32 +0000 (UTC) (envelope-from freebsd-arch@m.gmane.org) Received: from list by ciao.gmane.org with local (Exim 4.43) id 1HRp2A-0007C6-Gn for freebsd-arch@freebsd.org; Thu, 15 Mar 2007 13:26:19 +0100 Received: from lara.cc.fer.hr ([161.53.72.113]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Thu, 15 Mar 2007 13:26:18 +0100 Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Thu, 15 Mar 2007 13:26:18 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-arch@freebsd.org From: Ivan Voras Date: Thu, 15 Mar 2007 13:26:03 +0100 Lines: 8 Message-ID: References: <45F906ED.8070100@aueb.gr> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@sea.gmane.org X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr User-Agent: Thunderbird 1.5.0.10 (X11/20060911) In-Reply-To: <45F906ED.8070100@aueb.gr> Sender: news Cc: freebsd-threads@freebsd.org Subject: Re: Multithreaded qsort(3) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2007 12:26:32 -0000 Diomidis Spinellis wrote: > Greetings, > > I've added multhread support to our qsort implementation. You can see > the code at and some Interesting idea. I remember talks about OpenMP in gcc 4.2 - maybe it would be better to do it with OpenMP? From owner-freebsd-arch@FreeBSD.ORG Thu Mar 15 13:34:11 2007 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 1E2E416A403; Thu, 15 Mar 2007 13:34:11 +0000 (UTC) (envelope-from mezz7@cox.net) Received: from eastrmmtai104.cox.net (eastrmmtai104.cox.net [68.230.240.11]) by mx1.freebsd.org (Postfix) with ESMTP id AFC5613C45A; Thu, 15 Mar 2007 13:34:10 +0000 (UTC) (envelope-from mezz7@cox.net) Received: from eastrmimpo02.cox.net ([68.1.16.120]) by eastrmmtao105.cox.net (InterMail vM.7.05.02.00 201-2174-114-20060621) with ESMTP id <20070315124215.FFRR16421.eastrmmtao105.cox.net@eastrmimpo02.cox.net>; Thu, 15 Mar 2007 08:42:15 -0400 Received: from mezz.mezzweb.com ([24.255.149.218]) by eastrmimpo02.cox.net with bizsmtp id b0iD1W0044iy4EG0000000; Thu, 15 Mar 2007 08:42:14 -0400 Date: Thu, 15 Mar 2007 07:44:03 -0500 To: "Diomidis Spinellis" From: "Jeremy Messenger" Content-Type: text/plain; format=flowed; delsp=yes; charset=us-ascii MIME-Version: 1.0 References: <45F906ED.8070100@aueb.gr> Content-Transfer-Encoding: 7bit Message-ID: In-Reply-To: <45F906ED.8070100@aueb.gr> User-Agent: Opera Mail/9.10 (Linux) Cc: freebsd-threads@freebsd.org, freebsd-arch@freebsd.org Subject: Re: Multithreaded qsort(3) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2007 13:34:11 -0000 On Thu, 15 Mar 2007 03:42:21 -0500, Diomidis Spinellis wrote: > Greetings, > > I've added multhread support to our qsort implementation. You can see > the code at and some > performance figures at the end of this email. I propose to add it to > our libc. I need your opinions on the interface, and also any comments > you may have on the code. I see the following design dimensions: That remind me other days when I have readed in one of blog that have some good info about sorts. I will posting a few of links in here in case if anyone is insteresting to read. ;-) Quick Sort (qsort()): http://jeffreystedfast.blogspot.com/2007/03/quick-sort.html 25 Byte Sort Routine: http://jeffreystedfast.blogspot.com/2007/03/25-byte-sort-routine.html Merge Sort: http://jeffreystedfast.blogspot.com/2007/02/merge-sort.html A lot of more sorts: Shell Sort Binary Insertion Sort Insertion Sort Bubble Sort http://jeffreystedfast.blogspot.com/search/label/sorting Cheers, Mezz > 1. Naming and availability > -------------------------- > - Option 1.1: Replace qsort with this implementation > * Pros: Programs gain performance without any source code change; > abstraction (number of processors/cores should be invisible, just as the > CPU architecture or disk drive interface) > * Cons: May confuse multithreaded programs; programs require linking > with pthreads library; programs whose compare function is not thread > safe will break > > - Option 1.2: Name it qsort_mt > * Pros: POLA; programs can tune the call according to their need > * Cons: Programs require source code changes; we violate abstraction; > namespace polution > > My proposal: option 1.2: name it qsort_mt and make it available in a > separate library (libc_mt). > > > 2. Interface > ------------ > - Option 2.1: qsort compatible > * Pros: Drop in replacement; doesn't need programmer understanding; > compatible with option 1.1 > * Cons: Can't be tuned for a specific program or workload > > - Option 2.2: Add nthreads parameter, specifying the maximum number of > threads > * Pros: Multithreaded programs can tune load balancing; all programs can > optimise nthreads according to the workload characteristics. > * Cons: Libc routines are generaly expected to Do he Right Thing, > without handholding; must agree on mechanism for determining the number > of threads; the benefits of tuning are unlikely to be dramatic. > > - Option 2.3: Add forkelem, specifying minimum number of elements for a > new thread (below forkelem, I use single-threaded qsort). > * Pros: programs can optimise nthreads according to the workload > characteristics. > * Cons: Libc routines are generaly expected to Do he Right Thing, > without handholding; the benefits of tuning are unlikely to be dramatic. > > - Option 2.4: Have the library tune itsself for a given system or > individual programs by measuring different values > * Pros: higher performance when the tuned values are reused > * Cons: Lack of prior art; performance drop when determining tuning > values; security considerations if values are cached; complexity > > (Options 2.2 and 2.3 are orthogonal; options 2.1 and 2.4 are orthogonal) > > My proposal: go for 2.1, based on the way C library routines work. > > 3. Determining the number of threads > ------------------------------------ > - Option 3.1: NPROCS environment variable > * Pros: prior art: CrayOS, BSD/OS; users can tune it. > * Cons: May be too blunt; requires setting by user or administrator > > - Option 3.[234]: hw.ncpu or kern.smp.cpus sysctl, or pmc_ncpu() > * Pros: automatic, right for the underlying hardware > * Cons: even blunter > > - Option 3.5: Add library interface for the above as int nthreads_mt() > * Pros: we abstract policy; programs can replace it; reusability; > extensibility. > * Cons: namespace polution; it is not clear that a single number is > correct for all uses of a program. > > (All the above options are orthogonal) > > My proposal: Add library interface using NPROCS and falling back to a > sysctl variable. This interface can be extended in the future. > > Performance figures > ------------------- > Reported times are elapsed, user, and system seconds. > > $ uname -a > FreeBSD sledge.freebsd.org 7.0-CURRENT FreeBSD 7.0-CURRENT #924: Wed Mar > 14 14:25:07 UTC 2007 > root@sledge.freebsd.org:/h/src/sys/amd64/compile/SLEDGE amd64 > > $ qsort_mt -? > qsort_mt: option requires an argument -- h > usage: qsort_mt [-stv] [-f forkelements] [-h threads] [-n elements] > -l Run the libc version of qsort > -s Test with 20-byte strings, instead of integers > -t Print timing results > -v Verify the integer results > Defaults are 1e7 elements, 2 threads, 100 fork elements > > > $ gcc -DTEST -O qsort_mt.c -o qsort_mt -lthr -lpmc > > $ qsort_mt -t -h 2 -n 100000000 -l # Integers; qsort(3) > 46.5 46.2 0.314 > $ ./qsort_mt -t -h 2 -n 100000000 # Integers; qsort_mt > 27.5 46.3 0.301 > $ qsort_mt -t -h 2 -n 10000000 -s -l # Strings; qsort(3) > 41.3 40.1 1.2 > $ ./qsort_mt -t -h 2 -n 10000000 -s # Strings; qsort_mt > 26.7 40.1 1.16 > > $ gcc -DTEST -O qsort_mt.c -o qsort_mt -lpthread -lpmc > $ qsort_mt -t -h 2 -n 100000000 # Integers; qsort_mt > 28 47.1 0.368 > $ qsort_mt -t -h 2 -n 10000000 -s # Strings; qsort_mt > 27.1 40.6 1.06 > > > Thanks, > > Diomidis - dds@ > http://www.spinellis.gr -- mezz7@cox.net - mezz@FreeBSD.org FreeBSD GNOME Team - FreeBSD Multimedia Hat (ports, not src) http://www.FreeBSD.org/gnome/ - gnome@FreeBSD.org http://wiki.freebsd.org/multimedia - multimedia@FreeBSD.org From owner-freebsd-arch@FreeBSD.ORG Thu Mar 15 13:56:26 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id C98B716A469 for ; Thu, 15 Mar 2007 13:56:26 +0000 (UTC) (envelope-from yar@comp.chem.msu.su) Received: from comp.chem.msu.su (comp.chem.msu.su [158.250.32.97]) by mx1.freebsd.org (Postfix) with ESMTP id 34BC413C48A for ; Thu, 15 Mar 2007 13:56:25 +0000 (UTC) (envelope-from yar@comp.chem.msu.su) Received: from comp.chem.msu.su (localhost [127.0.0.1]) by comp.chem.msu.su (8.13.4/8.13.4) with ESMTP id l2FDh5B5032428; Thu, 15 Mar 2007 16:43:05 +0300 (MSK) (envelope-from yar@comp.chem.msu.su) Received: (from yar@localhost) by comp.chem.msu.su (8.13.4/8.13.4/Submit) id l2FDh1NY032420; Thu, 15 Mar 2007 16:43:01 +0300 (MSK) (envelope-from yar) Date: Thu, 15 Mar 2007 16:43:00 +0300 From: Yar Tikhiy To: Poul-Henning Kamp Message-ID: <20070315134300.GE28354@comp.chem.msu.su> References: <39968.1173776514@critter.freebsd.dk> <40441.1173778685@critter.freebsd.dk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <40441.1173778685@critter.freebsd.dk> User-Agent: Mutt/1.5.9i Cc: arch@freebsd.org Subject: Re: bikeshed proposal X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2007 13:56:26 -0000 On Tue, Mar 13, 2007 at 09:38:05AM +0000, Poul-Henning Kamp wrote: > In message <39968.1173776514@critter.freebsd.dk>, Poul-Henning Kamp writes: > > > >It has always bothered me that some of the TAILQ macros need to > >know the struct name of the header type. Yeah, can present a challenge in understanding an implementation of basic data structures and related algos. :-) You thought that tqe_prev points to the whole entry structure when making the patch, didn't you? Personally, I cannot explain to myself why in the double-linked structs the prev member points to the next member in the previous list element and not to the previous list element itself. Could anybody with CS education explain merits of the current approach? I can only see that now we have to go to the element before the previous one for a pointer to the latter. I'm not going to dispute the current way of things, just curious. -- Yar From owner-freebsd-arch@FreeBSD.ORG Thu Mar 15 14:05:27 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 8164316A400 for ; Thu, 15 Mar 2007 14:05:27 +0000 (UTC) (envelope-from wollman@khavrinen.csail.mit.edu) Received: from khavrinen.csail.mit.edu (khavrinen.csail.mit.edu [128.30.28.20]) by mx1.freebsd.org (Postfix) with ESMTP id 2FBC613C46E for ; Thu, 15 Mar 2007 14:05:27 +0000 (UTC) (envelope-from wollman@khavrinen.csail.mit.edu) Received: from khavrinen.csail.mit.edu (localhost.csail.mit.edu [127.0.0.1]) by khavrinen.csail.mit.edu (8.13.6/8.13.6) with ESMTP id l2FDYDLU024196 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK CN=khavrinen.csail.mit.edu issuer=Client+20CA); Thu, 15 Mar 2007 09:34:13 -0400 (EDT) (envelope-from wollman@khavrinen.csail.mit.edu) Received: (from wollman@localhost) by khavrinen.csail.mit.edu (8.13.6/8.13.6/Submit) id l2FDYAfo024194; Thu, 15 Mar 2007 09:34:10 -0400 (EDT) (envelope-from wollman) Date: Thu, 15 Mar 2007 09:34:10 -0400 (EDT) From: Garrett Wollman Message-Id: <200703151334.l2FDYAfo024194@khavrinen.csail.mit.edu> To: dds@aueb.gr In-Reply-To: <45F906ED.8070100@aueb.gr> Organization: MIT Computer Science & Artificial Intelligence Laboratory X-Greylist: Sender DNS name whitelisted, not delayed by milter-greylist-3.0 (khavrinen.csail.mit.edu [127.0.0.1]); Thu, 15 Mar 2007 08:34:13 -0500 (EST) Cc: arch@freebsd.org Subject: Re: Multithreaded qsort(3) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2007 14:05:27 -0000 In <45F906ED.8070100@aueb.gr> you write: >$ qsort_mt -t -h 2 -n 100000000 -l # Integers; qsort(3) >46.5 46.2 0.314 >$ ./qsort_mt -t -h 2 -n 100000000 # Integers; qsort_mt >27.5 46.3 0.301 "fancy algorithms have large constants, and N is usually small". Do you have any reason to believe that N is large with sufficient frequency to justify the overhead? -GAWollman From owner-freebsd-arch@FreeBSD.ORG Thu Mar 15 14:48:08 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 6209416A400 for ; Thu, 15 Mar 2007 14:48:08 +0000 (UTC) (envelope-from dds@aueb.gr) Received: from blue.servers.aueb.gr (blue.servers.aueb.gr [195.251.255.132]) by mx1.freebsd.org (Postfix) with ESMTP id 1EB6C13C45E for ; Thu, 15 Mar 2007 14:48:07 +0000 (UTC) (envelope-from dds@aueb.gr) Received: from [195.251.254.141] ([::ffff:195.251.254.141]) by blue.servers.aueb.gr with esmtp; Thu, 15 Mar 2007 16:37:57 +0200 id 000D1214.45F95A45.00002DAD Message-ID: <45F95A42.7080104@aueb.gr> Date: Thu, 15 Mar 2007 16:37:54 +0200 From: Diomidis Spinellis User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.9) Gecko/20061211 SeaMonkey/1.0.7 MIME-Version: 1.0 To: Garrett Wollman References: <200703151334.l2FDYAfo024194@khavrinen.csail.mit.edu> In-Reply-To: <200703151334.l2FDYAfo024194@khavrinen.csail.mit.edu> Content-Type: text/plain; charset=ISO-8859-7; format=flowed Content-Transfer-Encoding: 7bit Cc: arch@freebsd.org Subject: Re: Multithreaded qsort(3) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2007 14:48:08 -0000 Garrett Wollman wrote: > In <45F906ED.8070100@aueb.gr> you write: > >> $ qsort_mt -t -h 2 -n 100000000 -l # Integers; qsort(3) >> 46.5 46.2 0.314 >> $ ./qsort_mt -t -h 2 -n 100000000 # Integers; qsort_mt >> 27.5 46.3 0.301 > > "fancy algorithms have large constants, and N is usually small". > > Do you have any reason to believe that N is large with sufficient > frequency to justify the overhead? My proposed implementation (separate routine) doesn't add any overhead, unless you explicitly link it in, in which case I presume you know what you're doing. If I was implementing a BSD version of sort(1) (I will, one day) or using qsort in an RDBMS engine, calling qsort_mt would certainly be worthwhile. Memory and disk sizes are growing while CPU speeds are stagnant. This means larger data sizes without corresponding serial horsepower to tame them. Therefore, N is increasing in absolute and in relative terms. You do have a point though, in that parallelizing the C library isn't by itself going to save us. I run dtrace on some builds (a task I'd like to see going faster), and the time spent in libc was below 10%. Diomidis From owner-freebsd-arch@FreeBSD.ORG Thu Mar 15 15:01:58 2007 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id CA2FF16A400; Thu, 15 Mar 2007 15:01:58 +0000 (UTC) (envelope-from dds@aueb.gr) Received: from blue.servers.aueb.gr (blue.servers.aueb.gr [195.251.255.132]) by mx1.freebsd.org (Postfix) with ESMTP id 8616413C448; Thu, 15 Mar 2007 15:01:58 +0000 (UTC) (envelope-from dds@aueb.gr) Received: from [195.251.254.141] ([::ffff:195.251.254.141]) by blue.servers.aueb.gr with esmtp; Thu, 15 Mar 2007 16:51:54 +0200 id 000D1229.45F95D8A.00002F17 Message-ID: <45F95D87.9070108@aueb.gr> Date: Thu, 15 Mar 2007 16:51:51 +0200 From: Diomidis Spinellis User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.9) Gecko/20061211 SeaMonkey/1.0.7 MIME-Version: 1.0 To: Ivan Voras References: <45F906ED.8070100@aueb.gr> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-threads@freebsd.org, freebsd-arch@freebsd.org Subject: Re: Multithreaded qsort(3) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2007 15:01:58 -0000 Ivan Voras wrote: > Diomidis Spinellis wrote: >> I've added multhread support to our qsort implementation. You can see >> the code at and some > > Interesting idea. I remember talks about OpenMP in gcc 4.2 - maybe it > would be better to do it with OpenMP? I'm not sure that OpenMP is mature and flexible enough for this. My understanding is that the performance of a recursive algorithm, like qsort, would depend on the compiler's support for what is termed "nested parallelism". It would certainly save me effort, but given that I've written it, this is now a moot point. I compared my implementation against the qsort available in OmpSCR (OpenMP Source Code Repository) on a Sun Niagara (Sun-Fire-T200 - 4 processors with 4 cores each). As you can see the speedup of the OpenMP implementation above two threads is nonexistent. Even for two threads, my implementation gets a speedup of 34% and the OpenMP one 26%. Of course, this can well be a problem of the specific qsort implementation. Sun C 5.8 Patch 121015-04 2007/01/10 OmpSCR c_qsort.par compiled with -xopenmp=parallel # THREADS SIZE STEPS TIME (secs.) 1 1000000 10 46.919788 2 1000000 10 34.391163 4 1000000 10 34.465052 8 1000000 10 34.427057 qsort_mt compiles with cc -O qsort_mt.c -DTEST -xc99=all -o qsort_mt # THREADS SIZE STEPS TIME (secs.) 1 50000000 1 39.5 2 50000000 1 26 4 50000000 1 20.5 8 50000000 1 16.6 16 50000000 1 16.1 Diomidis From owner-freebsd-arch@FreeBSD.ORG Thu Mar 15 17:07:45 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 6E1A916A40B for ; Thu, 15 Mar 2007 17:07:45 +0000 (UTC) (envelope-from julian@elischer.org) Received: from outB.internet-mail-service.net (outB.internet-mail-service.net [216.240.47.225]) by mx1.freebsd.org (Postfix) with ESMTP id AC57F13C4BD for ; Thu, 15 Mar 2007 17:07:44 +0000 (UTC) (envelope-from julian@elischer.org) Received: from mx0.idiom.com (HELO idiom.com) (216.240.32.160) by out.internet-mail-service.net (qpsmtpd/0.32) with ESMTP; Thu, 15 Mar 2007 09:40:29 -0700 Received: from [192.168.2.5] (home.elischer.org [216.240.48.38]) by idiom.com (Postfix) with ESMTP id 43CE5125B31; Thu, 15 Mar 2007 10:07:43 -0700 (PDT) Message-ID: <45F97D5F.2010709@elischer.org> Date: Thu, 15 Mar 2007 10:07:43 -0700 From: Julian Elischer User-Agent: Thunderbird 1.5.0.10 (Macintosh/20070221) MIME-Version: 1.0 To: Yar Tikhiy References: <39968.1173776514@critter.freebsd.dk> <40441.1173778685@critter.freebsd.dk> <20070315134300.GE28354@comp.chem.msu.su> In-Reply-To: <20070315134300.GE28354@comp.chem.msu.su> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: arch@freebsd.org, Poul-Henning Kamp Subject: Re: bikeshed proposal X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2007 17:07:45 -0000 Yar Tikhiy wrote: > On Tue, Mar 13, 2007 at 09:38:05AM +0000, Poul-Henning Kamp wrote: >> In message <39968.1173776514@critter.freebsd.dk>, Poul-Henning Kamp writes: >>> It has always bothered me that some of the TAILQ macros need to >>> know the struct name of the header type. > > Yeah, can present a challenge in understanding an > implementation of basic data structures and related algos. :-) > You thought that tqe_prev points to the whole entry structure when > making the patch, didn't you? > > Personally, I cannot explain to myself why in the double-linked > structs the prev member points to the next member in the previous > list element and not to the previous list element itself. Could > anybody with CS education explain merits of the current approach? > I can only see that now we have to go to the element before the > previous one for a pointer to the latter. I'm not going to dispute > the current way of things, just curious. kirk can tell you that I believe.. > From owner-freebsd-arch@FreeBSD.ORG Thu Mar 15 17:27:55 2007 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 0CD6516A401 for ; Thu, 15 Mar 2007 17:27:55 +0000 (UTC) (envelope-from max@love2party.net) Received: from moutng.kundenserver.de (moutng.kundenserver.de [212.227.126.188]) by mx1.freebsd.org (Postfix) with ESMTP id 7559E13C458 for ; Thu, 15 Mar 2007 17:27:54 +0000 (UTC) (envelope-from max@love2party.net) Received: from [88.64.186.120] (helo=amd64.laiers.local) by mrelayeu.kundenserver.de (node=mrelayeu8) with ESMTP (Nemesis), id 0ML31I-1HRtjw0TZu-0000eb; Thu, 15 Mar 2007 18:27:51 +0100 From: Max Laier Organization: FreeBSD To: freebsd-arch@freebsd.org Date: Thu, 15 Mar 2007 18:27:32 +0100 User-Agent: KMail/1.9.5 References: <45F906ED.8070100@aueb.gr> In-Reply-To: <45F906ED.8070100@aueb.gr> X-Face: ,,8R(x[kmU]tKN@>gtH1yQE4aslGdu+2]; R]*pL,U>^H?)gW@49@wdJ`H<=?utf-8?q?=25=7D*=5FBD=0A=09U=5For=3D=5CmOZf764=26nYj=3DJYbR1PW0ud?=>|!~,,CPC.1-D$FG@0h3#'5"k{V]a~.<=?utf-8?q?mZ=7D44=23Se=7Em=0A=09Fe=7E=5C=5DX5B=5D=5Fxj?=(ykz9QKMw_l0C2AQ]}Ym8)fU MIME-Version: 1.0 Content-Type: multipart/signed; boundary="nextPart5645397.FyxE2rQ9mp"; protocol="application/pgp-signature"; micalg=pgp-sha1 Content-Transfer-Encoding: 7bit Message-Id: <200703151827.39963.max@love2party.net> X-Provags-ID: V01U2FsdGVkX1+lropN8exSTmzfKWjeHkRJZMf83DcCdPzLYTe C5UWS1kvJsPZmiRJ/UrEMzCAz8Tte7QCzs6hX3oWbCdJmYL0EL VC2uibCBz+9YbsWR0jDqQ== Cc: Diomidis Spinellis , freebsd-threads@freebsd.org Subject: Re: Multithreaded qsort(3) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2007 17:27:55 -0000 --nextPart5645397.FyxE2rQ9mp Content-Type: text/plain; charset="iso-8859-7" Content-Transfer-Encoding: quoted-printable Content-Disposition: inline On Thursday 15 March 2007 09:42, Diomidis Spinellis wrote: > Greetings, > > I've added multhread support to our qsort implementation. You can see > the code at and some > performance figures at the end of this email. I propose to add it to > our libc. I need your opinions on the interface, and also any comments > you may have on the code. I see the following design dimensions: > > 1. Naming and availability > -------------------------- > - Option 1.1: Replace qsort with this implementation > * Pros: Programs gain performance without any source code change; > abstraction (number of processors/cores should be invisible, just as > the CPU architecture or disk drive interface) > * Cons: May confuse multithreaded programs; programs require linking > with pthreads library; programs whose compare function is not thread > safe will break > > - Option 1.2: Name it qsort_mt > * Pros: POLA; programs can tune the call according to their need > * Cons: Programs require source code changes; we violate abstraction; > namespace polution > > My proposal: option 1.2: name it qsort_mt and make it available in a > separate library (libc_mt). I'd suggest to name it qsort() and put it in a separate library (not=20 necessarily named libc_mt, as I don't believe there are that many=20 functions in libc, that can actually gain from multithreading). This approach, would allow LD_PRELOAD'ing the _mt version without the need= =20 to recompile. That said, I'm not sure this really belongs in the base system. As=20 qsort_mt it's way too obscure and unportable for any real application to=20 use it. As a replacement for qsort it won't work (as you pointed out=20 yourself). I do think that we need efforts like this, but they should be=20 made outside of FreeBSD, otherwise they won't be much useful in general. > 2. Interface > ------------ > - Option 2.1: qsort compatible > * Pros: Drop in replacement; doesn't need programmer understanding; > compatible with option 1.1 > * Cons: Can't be tuned for a specific program or workload > > - Option 2.2: Add nthreads parameter, specifying the maximum number of > threads > * Pros: Multithreaded programs can tune load balancing; all programs > can optimise nthreads according to the workload characteristics. > * Cons: Libc routines are generaly expected to Do he Right Thing, > without handholding; must agree on mechanism for determining the number > of threads; the benefits of tuning are unlikely to be dramatic. > > - Option 2.3: Add forkelem, specifying minimum number of elements for a > new thread (below forkelem, I use single-threaded qsort). > * Pros: programs can optimise nthreads according to the workload > characteristics. > * Cons: Libc routines are generaly expected to Do he Right Thing, > without handholding; the benefits of tuning are unlikely to be > dramatic. > > - Option 2.4: Have the library tune itsself for a given system or > individual programs by measuring different values > * Pros: higher performance when the tuned values are reused > * Cons: Lack of prior art; performance drop when determining tuning > values; security considerations if values are cached; complexity > > (Options 2.2 and 2.3 are orthogonal; options 2.1 and 2.4 are > orthogonal) > > My proposal: go for 2.1, based on the way C library routines work. > > 3. Determining the number of threads > ------------------------------------ > - Option 3.1: NPROCS environment variable > * Pros: prior art: CrayOS, BSD/OS; users can tune it. > * Cons: May be too blunt; requires setting by user or administrator > > - Option 3.[234]: hw.ncpu or kern.smp.cpus sysctl, or pmc_ncpu() > * Pros: automatic, right for the underlying hardware > * Cons: even blunter > > - Option 3.5: Add library interface for the above as int nthreads_mt() > * Pros: we abstract policy; programs can replace it; reusability; > extensibility. > * Cons: namespace polution; it is not clear that a single number is > correct for all uses of a program. > > (All the above options are orthogonal) > > My proposal: Add library interface using NPROCS and falling back to a > sysctl variable. This interface can be extended in the future. > > Performance figures > ------------------- > Reported times are elapsed, user, and system seconds. > > $ uname -a > FreeBSD sledge.freebsd.org 7.0-CURRENT FreeBSD 7.0-CURRENT #924: Wed > Mar 14 14:25:07 UTC 2007 > root@sledge.freebsd.org:/h/src/sys/amd64/compile/SLEDGE amd64 > > $ qsort_mt -? > qsort_mt: option requires an argument -- h > usage: qsort_mt [-stv] [-f forkelements] [-h threads] [-n elements] > -l Run the libc version of qsort > -s Test with 20-byte strings, instead of integers > -t Print timing results > -v Verify the integer results > Defaults are 1e7 elements, 2 threads, 100 fork elements > > > $ gcc -DTEST -O qsort_mt.c -o qsort_mt -lthr -lpmc > > $ qsort_mt -t -h 2 -n 100000000 -l # Integers; qsort(3) > 46.5 46.2 0.314 > $ ./qsort_mt -t -h 2 -n 100000000 # Integers; qsort_mt > 27.5 46.3 0.301 > $ qsort_mt -t -h 2 -n 10000000 -s -l # Strings; qsort(3) > 41.3 40.1 1.2 > $ ./qsort_mt -t -h 2 -n 10000000 -s # Strings; qsort_mt > 26.7 40.1 1.16 > > $ gcc -DTEST -O qsort_mt.c -o qsort_mt -lpthread -lpmc > $ qsort_mt -t -h 2 -n 100000000 # Integers; qsort_mt > 28 47.1 0.368 > $ qsort_mt -t -h 2 -n 10000000 -s # Strings; qsort_mt > 27.1 40.6 1.06 > > > Thanks, > > Diomidis - dds@ > http://www.spinellis.gr > _______________________________________________ > freebsd-arch@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-arch > To unsubscribe, send any mail to "freebsd-arch-unsubscribe@freebsd.org" =2D-=20 /"\ Best regards, | mlaier@freebsd.org \ / Max Laier | ICQ #67774661 X http://pf4freebsd.love2party.net/ | mlaier@EFnet / \ ASCII Ribbon Campaign | Against HTML Mail and News --nextPart5645397.FyxE2rQ9mp Content-Type: application/pgp-signature -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (FreeBSD) iD8DBQBF+YILXyyEoT62BG0RAju3AJ4vEppT5JGbBZYcyhTi/oaeTvkv2QCfV8PG zAnh7BCCQi8z2Ag7I2OAeJI= =CehS -----END PGP SIGNATURE----- --nextPart5645397.FyxE2rQ9mp-- From owner-freebsd-arch@FreeBSD.ORG Thu Mar 15 18:21:37 2007 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 2A55C16A402; Thu, 15 Mar 2007 18:21:37 +0000 (UTC) (envelope-from gergely.czuczy@harmless.hu) Received: from marvin.harmless.hu (marvin.harmless.hu [195.56.55.204]) by mx1.freebsd.org (Postfix) with ESMTP id 8E9A413C4B9; Thu, 15 Mar 2007 18:21:36 +0000 (UTC) (envelope-from gergely.czuczy@harmless.hu) Received: from localhost (marvin-mail [192.168.0.2]) by marvin.harmless.hu (Postfix) with ESMTP id 70D117BFF20; Thu, 15 Mar 2007 18:49:48 +0100 (CET) X-Virus-Scanned: by amavisd-new-2.4.2 (20060627) (Debian) at harmless.hu Received: from marvin.harmless.hu ([192.168.0.2]) by localhost (marvin.harmless.hu [192.168.0.2]) (amavisd-new, port 10024) with ESMTP id 9WLi6L+l4vwJ; Thu, 15 Mar 2007 18:49:48 +0100 (CET) Received: from marvin.harmless.hu (localhost [127.0.0.1]) by marvin.harmless.hu (Postfix) with ESMTP id DB5A47BFD14; Thu, 15 Mar 2007 18:49:47 +0100 (CET) Date: Thu, 15 Mar 2007 18:49:47 +0100 From: Gergely CZUCZY To: Max Laier Message-ID: <20070315174947.GA4674@harmless.hu> References: <45F906ED.8070100@aueb.gr> <200703151827.39963.max@love2party.net> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=x-unknown; protocol="application/pgp-signature"; boundary="h31gzZEtNLTqOjlF" Content-Disposition: inline In-Reply-To: <200703151827.39963.max@love2party.net> User-Agent: mutt-ng/devel-r804 (FreeBSD) Cc: freebsd-threads@freebsd.org, freebsd-arch@freebsd.org Subject: Re: Multithreaded qsort(3) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2007 18:21:37 -0000 --h31gzZEtNLTqOjlF Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Thu, Mar 15, 2007 at 06:27:32PM +0100, Max Laier wrote: > On Thursday 15 March 2007 09:42, Diomidis Spinellis wrote: > > Greetings, > > > > I've added multhread support to our qsort implementation. You can see > > the code at and some > > performance figures at the end of this email. I propose to add it to > > our libc. I need your opinions on the interface, and also any comments > > you may have on the code. I see the following design dimensions: > > > > 1. Naming and availability > > -------------------------- > > - Option 1.1: Replace qsort with this implementation > > * Pros: Programs gain performance without any source code change; > > abstraction (number of processors/cores should be invisible, just as > > the CPU architecture or disk drive interface) > > * Cons: May confuse multithreaded programs; programs require linking > > with pthreads library; programs whose compare function is not thread > > safe will break > > > > - Option 1.2: Name it qsort_mt > > * Pros: POLA; programs can tune the call according to their need > > * Cons: Programs require source code changes; we violate abstraction; > > namespace polution > > > > My proposal: option 1.2: name it qsort_mt and make it available in a > > separate library (libc_mt). >=20 > I'd suggest to name it qsort() and put it in a separate library (not=20 > necessarily named libc_mt, as I don't believe there are that many=20 > functions in libc, that can actually gain from multithreading). >=20 > This approach, would allow LD_PRELOAD'ing the _mt version without the nee= d=20 > to recompile. >=20 > That said, I'm not sure this really belongs in the base system. As=20 > qsort_mt it's way too obscure and unportable for any real application to= =20 > use it. As a replacement for qsort it won't work (as you pointed out=20 > yourself). I do think that we need efforts like this, but they should be= =20 > made outside of FreeBSD, otherwise they won't be much useful in general. I'm just a FreeBSD user, and developing some applications under it. I'd like to share my point of view, a point of view of an application developer, of this placing. I see two useful ways placing such a function into the OS. One is naming it qsort() and putting it into a library that even can be libmaped or preloaded into some applications, thus gaining some performance on some applications that were not designed to be work in a multithreaded way (and gain performa= nce on the several CPUs). The other one is putting it into a different library, that a program can be linked against along with libc, and the old qsort and this new one can be mixed as the developers would like it to. By this way, it would be very useful to have the ability to assign control functions to the multihreaded qsort, like telling it what threads to use, how to lunch/reuse threads, how many, and so on. To be honest, i find both ways quite useful. I think maybe the best way would be to do both at the same time. Bye, Gergely Czuczy mailto: gergely.czuczy@harmless.hu --=20 Weenies test. Geniuses solve problems that arise. --h31gzZEtNLTqOjlF Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (FreeBSD) owF1Vz1sHEUUNoloVlCYinIkCtvJ3fl8cexwxobEDlGEE1uJEUIUYXZ39m7i3Zn1 zOydNxIICQkoKBANSBQgqJEogJKehoKGlhaElIKKju/N7J3vYnBh383u+/u+7703 /uTZiwsXFn/57vu3Ln/86RdPffvMr/GlonJODdoFNyOp2mvd7lp7fWNzvb3R7vXW 1wXnm+tJ91p3fe3azX8+3NrVygnl2kd1KfrMiVO3WuZcqi2WDLmxwm1XLmtfiybv 7Ulbaiud1KrPpMqlEtNnR4YrmwnTvqkSnUo16LOTSjuRtksjleNxLqLoQLGjYdVi d7hha1dbrNftbjLuWHej39vsX+kd3mGXu8ia3jhl+1wKw8YGXvrRDgvGxqa8hjH5 SIbBQ/fF/nqvxfakLmQqLbtfIrM8x6ep8Q67ZYRwyMu26Ks/ur00EoynqUhZUeVu aARPma3KUhvHnGa6MuzE0hdZlLkoUCen4juMvakrlnDFrBDelRsKhroFVfPS0Lmy v7paCg2rTobAsU072gxW301Tu+pdPihcJ9lhXCGiLoKTUphMm4KrRLBMDiojLPkj 3wLv6QwfUZQouMyRw21WGg1CBOWKMpikrL0nyjyXceLfUgIF1nSkAQzyt0wr7xXE ICRPRMsnwnMLR6pGJQUVa70vWLICmA850GoMqVTvG/X7g0znuR4DXpYKKweKpRIe LAXrT/Fe67C7vKCXfLQRyuCxzKWr/eP2//6Ex+ygJPThZq3P7glIFTgFfsbSDQM4 80x5w0vs0Gjbp98DwwvLBtD4HNhkrivna7cAKmnIRBuogdjyXnhsneGJz2BZVUUM bYIRUJAIiyTsaqKJMAtPecpiQnckrYTwW+xhZeHdTqWye/g6I/1KJxIHnpk2QMwe s9TI0QwvK00Bu4QjJE/UqKwC5SRYGSQLdsumtK3pJ2bESSXhGV16DMi9Iw9TGaws CcRwU8/YjIekJrBfclhmlQrlAlalSYdk5x1ZnhFqec5inB1PGZ7hqNcnsgWJciL4 eTYO9q/PhKZmcpUKako4PPMEgNIkIXnjVBqv5DlEDp+s9jx7wGQs2EjqnDsxy2Kg VSFHW5KSSp1XE834R3fqpsF43kfrnNWlnqjLy7ngx/6wkXVOLDIe0BLAk6I3iLNl 6k0YrnSine1eN6JRRJNngHT95JmLsLziA5QQKI7I6394BEHBkxKkR25kXns3KWuC occtOjbVaslBnrkUI482UCO23RCjBu1QBzcT8i0FJA+t8AbxBPwqMFSHRsqMLmbl CMbO6jqinuQlcOTJsMXGvjc4zQq2v/fg8N7N/YPre0ueZPBOWI6EobEx7Uk6B+/b URo8Ah0jSKISs/UsDFKzXKYtIFl4tdrKFyVJGz5ZlKwx/qke8hlzSN3W1okCk+y6 DY6mlEq3hH5AwzmNRRDbhNwRDZWi9eAJxvzwI4MCUJG5TPzUgc12FPxRq0rnA4A2 E4YWjSdv3CwXzC9PylibY7YMlmjklpqmAKZ+1RBLE9yKPFvxgzelnkBrB1bGIkx5 kcGto9Y+DsW3WBwwrM8G0yS3AqOD3FtJfzP2KhbVjft7LaZJFWNpRTAcN4oBydi4 qCirckJxIJQwPO9EBHmYcBMf9JYJSyWFzHJaPAO/6GZxsgAzxRglgNAAUchaI1MS ZFEHDCi1kRRjuJs/oL+kxhngm2CIHU22JSGO4J0oarbVWE9qAL3T59ALiuMzQ0+F scMO7ndw9xB+CobtJV30RGO6cByM+LQrPTdISfm2iX27FrwkVg1mC1Kl6R15q3Pg UMNVYVlNwZtdWsjxnE3UqAHwUQ+EXYxw8I/oXl9+fsxvD5L5MlUytxi3I0SJmoVv UQW4psVl0d3REc68TJCGh+Y8CKnMcB8kqTdwNBOET4Z+A0pEGwpZcIpOIqI2Dasq zB3KjHLQUG/omHBChJAKlJh4KuQpObL+9akWbDN2vLz8HanDbtTBAUpvhQZsljbK rKNGHyjDX3rIW3NR8Vct62842MTO6HxmUjaC8eA22AahtJqGpEtpwGg85JOF6u0Q scWGmIr4nMPhcNWIyjegf8U/i2g6t5orI8omGjyvQ0BgEUTi1oiHMYgJ2sZOxI4I 1aDFmomBy1wciopp3eDNaFo+wmOyeA/NzdPSKnK4zSHcjVq0ouiWMAOBgbr7qEoe 1RHdRp3uYxz4407ij19BDxc5NlFnWEVRu00j5w0hlMQVySFsh93CF6SGG5POATJE gZla2EYkBuOnE3308sWnF+hflcm/OYsXHn298NVnOx/c/funhz/+cPWFK+/vvP3e 4vOf/77w5Wt/Re/8+dv65Y3dk+ce//x48Y/TC9/8Cw== =nogL -----END PGP SIGNATURE----- --h31gzZEtNLTqOjlF-- From owner-freebsd-arch@FreeBSD.ORG Thu Mar 15 18:22:43 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 9543416A401 for ; Thu, 15 Mar 2007 18:22:43 +0000 (UTC) (envelope-from bakul@bitblocks.com) Received: from mail.bitblocks.com (bitblocks.com [64.142.15.60]) by mx1.freebsd.org (Postfix) with ESMTP id 7C9EB13C458 for ; Thu, 15 Mar 2007 18:22:43 +0000 (UTC) (envelope-from bakul@bitblocks.com) Received: from bitblocks.com (localhost.bitblocks.com [127.0.0.1]) by mail.bitblocks.com (Postfix) with ESMTP id 68E265B49; Thu, 15 Mar 2007 10:59:24 -0700 (PDT) To: Yar Tikhiy In-reply-to: Your message of "Thu, 15 Mar 2007 16:43:00 +0300." <20070315134300.GE28354@comp.chem.msu.su> Date: Thu, 15 Mar 2007 10:59:24 -0700 From: Bakul Shah Message-Id: <20070315175924.68E265B49@mail.bitblocks.com> Cc: arch@freebsd.org, Poul-Henning Kamp Subject: Re: bikeshed proposal X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2007 18:22:43 -0000 > On Tue, Mar 13, 2007 at 09:38:05AM +0000, Poul-Henning Kamp wrote: > > In message <39968.1173776514@critter.freebsd.dk>, Poul-Henning Kamp writes: > > > > > >It has always bothered me that some of the TAILQ macros need to > > >know the struct name of the header type. > > Yeah, can present a challenge in understanding an > implementation of basic data structures and related algos. :-) > You thought that tqe_prev points to the whole entry structure when > making the patch, didn't you? > > Personally, I cannot explain to myself why in the double-linked > structs the prev member points to the next member in the previous > list element and not to the previous list element itself. Could > anybody with CS education explain merits of the current approach? > I can only see that now we have to go to the element before the > previous one for a pointer to the latter. I'm not going to dispute > the current way of things, just curious. IIRC, the main reason is so that you don't have to know the actual type of other things on the list. In particular the list head can be just a next/prev ptr pair and not the entire object. This is handy, for example, when you have a list hanging off another object -- you can remove any list element without detaching the entire list from the parent object. The queue.h trick is quite clever and useful. To appreciate it, try figuring out a solution using Lisp style lists! IIRC, your typical CS education doesn't spend much time on pragmatics. From owner-freebsd-arch@FreeBSD.ORG Thu Mar 15 18:29:27 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id B9AE916A401 for ; Thu, 15 Mar 2007 18:29:27 +0000 (UTC) (envelope-from yar@comp.chem.msu.su) Received: from comp.chem.msu.su (comp.chem.msu.su [158.250.32.97]) by mx1.freebsd.org (Postfix) with ESMTP id AE0B213C448 for ; Thu, 15 Mar 2007 18:29:25 +0000 (UTC) (envelope-from yar@comp.chem.msu.su) Received: from comp.chem.msu.su (localhost [127.0.0.1]) by comp.chem.msu.su (8.13.4/8.13.4) with ESMTP id l2FITHH7035163; Thu, 15 Mar 2007 21:29:17 +0300 (MSK) (envelope-from yar@comp.chem.msu.su) Received: (from yar@localhost) by comp.chem.msu.su (8.13.4/8.13.4/Submit) id l2FITHTG035161; Thu, 15 Mar 2007 21:29:17 +0300 (MSK) (envelope-from yar) Date: Thu, 15 Mar 2007 21:29:16 +0300 From: Yar Tikhiy To: Julian Elischer Message-ID: <20070315182916.GG28354@comp.chem.msu.su> References: <39968.1173776514@critter.freebsd.dk> <40441.1173778685@critter.freebsd.dk> <20070315134300.GE28354@comp.chem.msu.su> <45F97D5F.2010709@elischer.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <45F97D5F.2010709@elischer.org> User-Agent: Mutt/1.5.9i Cc: arch@freebsd.org Subject: Re: bikeshed proposal X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2007 18:29:27 -0000 On Thu, Mar 15, 2007 at 10:07:43AM -0700, Julian Elischer wrote: > Yar Tikhiy wrote: > >On Tue, Mar 13, 2007 at 09:38:05AM +0000, Poul-Henning Kamp wrote: > >>In message <39968.1173776514@critter.freebsd.dk>, Poul-Henning Kamp > >>writes: > >>>It has always bothered me that some of the TAILQ macros need to > >>>know the struct name of the header type. > > > >Yeah, can present a challenge in understanding an > >implementation of basic data structures and related algos. :-) > >You thought that tqe_prev points to the whole entry structure when > >making the patch, didn't you? > > > >Personally, I cannot explain to myself why in the double-linked > >structs the prev member points to the next member in the previous > >list element and not to the previous list element itself. Could > >anybody with CS education explain merits of the current approach? > >I can only see that now we have to go to the element before the > >previous one for a pointer to the latter. I'm not going to dispute > >the current way of things, just curious. > > kirk can tell you that I believe.. I'm afraid he can but has no time to. In addition, Kirk would be upset by having to deal with a bunch of losers who don't have enough collective knowledge to figure out a basic thing! We seem to be able to prove we aren't such losers. :-) -- Yar From owner-freebsd-arch@FreeBSD.ORG Thu Mar 15 19:05:05 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 91DF516A400 for ; Thu, 15 Mar 2007 19:05:05 +0000 (UTC) (envelope-from yar@comp.chem.msu.su) Received: from comp.chem.msu.su (comp.chem.msu.su [158.250.32.97]) by mx1.freebsd.org (Postfix) with ESMTP id 8414C13C44B for ; Thu, 15 Mar 2007 19:05:03 +0000 (UTC) (envelope-from yar@comp.chem.msu.su) Received: from comp.chem.msu.su (localhost [127.0.0.1]) by comp.chem.msu.su (8.13.4/8.13.4) with ESMTP id l2FJ4soP035499; Thu, 15 Mar 2007 22:04:54 +0300 (MSK) (envelope-from yar@comp.chem.msu.su) Received: (from yar@localhost) by comp.chem.msu.su (8.13.4/8.13.4/Submit) id l2FJ4r1r035498; Thu, 15 Mar 2007 22:04:53 +0300 (MSK) (envelope-from yar) Date: Thu, 15 Mar 2007 22:04:53 +0300 From: Yar Tikhiy To: Bakul Shah Message-ID: <20070315190453.GH28354@comp.chem.msu.su> References: <20070315134300.GE28354@comp.chem.msu.su> <20070315175924.68E265B49@mail.bitblocks.com> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20070315175924.68E265B49@mail.bitblocks.com> User-Agent: Mutt/1.5.9i Cc: arch@freebsd.org Subject: Re: bikeshed proposal X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2007 19:05:05 -0000 On Thu, Mar 15, 2007 at 10:59:24AM -0700, Bakul Shah wrote: > > On Tue, Mar 13, 2007 at 09:38:05AM +0000, Poul-Henning Kamp wrote: > > > In message <39968.1173776514@critter.freebsd.dk>, Poul-Henning Kamp writes: > > > > > > > >It has always bothered me that some of the TAILQ macros need to > > > >know the struct name of the header type. > > > > Yeah, can present a challenge in understanding an > > implementation of basic data structures and related algos. :-) > > You thought that tqe_prev points to the whole entry structure when > > making the patch, didn't you? > > > > Personally, I cannot explain to myself why in the double-linked > > structs the prev member points to the next member in the previous > > list element and not to the previous list element itself. Could > > anybody with CS education explain merits of the current approach? > > I can only see that now we have to go to the element before the > > previous one for a pointer to the latter. I'm not going to dispute > > the current way of things, just curious. > > IIRC, the main reason is so that you don't have to know the > actual type of other things on the list. In particular the > list head can be just a next/prev ptr pair and not the entire > object. This is handy, for example, when you have a list > hanging off another object -- you can remove any list element > without detaching the entire list from the parent object. > The queue.h trick is quite clever and useful. To appreciate > it, try figuring out a solution using Lisp style lists! > IIRC, your typical CS education doesn't spend much time on > pragmatics. Oh, now I seem to understand. Thank you! Here's how I see it: Let's assume that the prev member points to the whole previous element of the list or queue. E.g.: struct queue_head { type *first; type *last; }; struct queue_entry { type *next; type *prev; } In this case we can indicate the first element of the queue by prev being NULL. But then we have no way to get from the first element to the list head. This means that we have to special case such operations as removal of the first element and insertion before it. OTOH, the current approach doesn't suffer from this shortcoming. BTW, the Linux folks took a different way: They keep pointers to the list entry structure itself (they call it list_head): struct list_head { struct list_head *next, *prev; }; Consequently, they need to do pointer math to get from an embedded list_head to the containing structure itself: #define list_entry(ptr, type, member) \ ((type *)((char *)(ptr)-(unsigned long)(&((type *)0)->member))) Linked lists are fun! :-) Thank you once more for pointing me in the right direction! -- Yar From owner-freebsd-arch@FreeBSD.ORG Thu Mar 15 22:16:50 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id B5B0316A400 for ; Thu, 15 Mar 2007 22:16:50 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.freebsd.org (Postfix) with ESMTP id 7BD2813C45A for ; Thu, 15 Mar 2007 22:16:50 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (critter.freebsd.dk [192.168.48.2]) by phk.freebsd.dk (Postfix) with ESMTP id B7FB3170DC; Thu, 15 Mar 2007 22:16:48 +0000 (UTC) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.13.8/8.13.8) with ESMTP id l2FMGl9J001302; Thu, 15 Mar 2007 22:16:48 GMT (envelope-from phk@critter.freebsd.dk) To: Yar Tikhiy From: "Poul-Henning Kamp" In-Reply-To: Your message of "Thu, 15 Mar 2007 16:43:00 +0300." <20070315134300.GE28354@comp.chem.msu.su> Date: Thu, 15 Mar 2007 22:16:47 +0000 Message-ID: <1301.1173997007@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: arch@freebsd.org Subject: Re: bikeshed proposal X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 15 Mar 2007 22:16:50 -0000 In message <20070315134300.GE28354@comp.chem.msu.su>, Yar Tikhiy writes: >Yeah, can present a challenge in understanding an >implementation of basic data structures and related algos. :-) >You thought that tqe_prev points to the whole entry structure when >making the patch, didn't you? Nah, I was just generally sleepy I think. >Personally, I cannot explain to myself why in the double-linked >structs the prev member points to the next member in the previous >list element and not to the previous list element itself. Could >anybody with CS education explain merits of the current approach? It allows us to isolate casts to those few backwards functions, all the forwards functions are type correct. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From owner-freebsd-arch@FreeBSD.ORG Fri Mar 16 00:37:28 2007 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 64B7216A402; Fri, 16 Mar 2007 00:37:28 +0000 (UTC) (envelope-from keramida@freebsd.org) Received: from igloo.linux.gr (igloo.linux.gr [62.1.205.36]) by mx1.freebsd.org (Postfix) with ESMTP id C08FF13C43E; Fri, 16 Mar 2007 00:37:27 +0000 (UTC) (envelope-from keramida@freebsd.org) Received: from kobe.laptop (host5.bedc.ondsl.gr [62.103.39.229]) (authenticated bits=128) by igloo.linux.gr (8.13.8/8.13.8/Debian-3) with ESMTP id l2G0PZgf016667 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT); Fri, 16 Mar 2007 02:25:41 +0200 Received: from kobe.laptop (kobe.laptop [127.0.0.1]) by kobe.laptop (8.13.8/8.13.8) with ESMTP id l2G0PHrS027895; Fri, 16 Mar 2007 02:25:28 +0200 (EET) (envelope-from keramida@freebsd.org) Received: (from keramida@localhost) by kobe.laptop (8.13.8/8.13.8/Submit) id l2G0PG6C027798; Fri, 16 Mar 2007 02:25:16 +0200 (EET) (envelope-from keramida@freebsd.org) Date: Fri, 16 Mar 2007 02:25:16 +0200 From: Giorgos Keramidas To: Max Laier Message-ID: <20070316002515.GA66354@kobe.laptop> References: <45F906ED.8070100@aueb.gr> <200703151827.39963.max@love2party.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <200703151827.39963.max@love2party.net> X-Hellug-MailScanner: Found to be clean X-Hellug-MailScanner-SpamCheck: not spam, SpamAssassin (not cached, score=-3.765, required 5, autolearn=not spam, ALL_TRUSTED -1.80, AWL 0.63, BAYES_00 -2.60) X-Hellug-MailScanner-From: keramida@freebsd.org X-Spam-Status: No Cc: freebsd-threads@freebsd.org, freebsd-arch@freebsd.org Subject: Re: Multithreaded qsort(3) X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 16 Mar 2007 00:37:28 -0000 On 2007-03-15 18:27, Max Laier wrote: > On Thursday 15 March 2007 09:42, Diomidis Spinellis wrote: > > Greetings, > > > > I've added multhread support to our qsort implementation. You can see > > the code at and some > > performance figures at the end of this email. I propose to add it to > > our libc. I need your opinions on the interface, and also any comments > > you may have on the code. I see the following design dimensions: > > > > 1. Naming and availability > > -------------------------- > > - Option 1.1: Replace qsort with this implementation > > * Pros: Programs gain performance without any source code change; > > abstraction (number of processors/cores should be invisible, just as > > the CPU architecture or disk drive interface) > > * Cons: May confuse multithreaded programs; programs require linking > > with pthreads library; programs whose compare function is not thread > > safe will break > > > > - Option 1.2: Name it qsort_mt > > * Pros: POLA; programs can tune the call according to their need > > * Cons: Programs require source code changes; we violate abstraction; > > namespace polution > > > > My proposal: option 1.2: name it qsort_mt and make it available in a > > separate library (libc_mt). > > I'd suggest to name it qsort() and put it in a separate library (not > necessarily named libc_mt, as I don't believe there are that many > functions in libc, that can actually gain from multithreading). > > This approach, would allow LD_PRELOAD'ing the _mt version without the > need to recompile. > > That said, I'm not sure this really belongs in the base system. As > qsort_mt it's way too obscure and unportable for any real application > to use it. As a replacement for qsort it won't work (as you pointed > out yourself). I do think that we need efforts like this, but they > should be made outside of FreeBSD, otherwise they won't be much useful > in general. Does it make sense to put all the thread/performance-related functions of this sort in libmt.so and allow either explicit linking with -lmt or preloading the library whenever needed? This way we can keep generally useful MT-safe functions in libmt.so and let people choose when they use them :) From owner-freebsd-arch@FreeBSD.ORG Fri Mar 16 11:42:27 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 33A8116A402 for ; Fri, 16 Mar 2007 11:42:27 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from cyrus.watson.org (cyrus.watson.org [209.31.154.42]) by mx1.freebsd.org (Postfix) with ESMTP id 0A91D13C43E for ; Fri, 16 Mar 2007 11:42:26 +0000 (UTC) (envelope-from rwatson@FreeBSD.org) Received: from fledge.watson.org (fledge.watson.org [209.31.154.41]) by cyrus.watson.org (Postfix) with ESMTP id BB74B47454; Fri, 16 Mar 2007 06:15:28 -0500 (EST) Date: Fri, 16 Mar 2007 12:15:28 +0100 (BST) From: Robert Watson X-X-Sender: robert@fledge.watson.org To: Bakul Shah In-Reply-To: <20070315175924.68E265B49@mail.bitblocks.com> Message-ID: <20070316121449.N7579@fledge.watson.org> References: <20070315175924.68E265B49@mail.bitblocks.com> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Yar Tikhiy , Poul-Henning Kamp , arch@freebsd.org Subject: Re: bikeshed proposal X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 16 Mar 2007 11:42:27 -0000 On Thu, 15 Mar 2007, Bakul Shah wrote: >> On Tue, Mar 13, 2007 at 09:38:05AM +0000, Poul-Henning Kamp wrote: >>> In message <39968.1173776514@critter.freebsd.dk>, Poul-Henning Kamp writes: >>>> >>>> It has always bothered me that some of the TAILQ macros need to >>>> know the struct name of the header type. >> >> Yeah, can present a challenge in understanding an >> implementation of basic data structures and related algos. :-) You thought >> that tqe_prev points to the whole entry structure when making the patch, >> didn't you? >> >> Personally, I cannot explain to myself why in the double-linked structs the >> prev member points to the next member in the previous list element and not >> to the previous list element itself. Could anybody with CS education >> explain merits of the current approach? I can only see that now we have to >> go to the element before the previous one for a pointer to the latter. >> I'm not going to dispute the current way of things, just curious. > > IIRC, the main reason is so that you don't have to know the actual type of > other things on the list. In particular the list head can be just a > next/prev ptr pair and not the entire object. This is handy, for example, > when you have a list hanging off another object -- you can remove any list > element without detaching the entire list from the parent object. The > queue.h trick is quite clever and useful. To appreciate it, try figuring > out a solution using Lisp style lists! IIRC, your typical CS education > doesn't spend much time on pragmatics. Right now, the queue(9) man page talks a bit about the relative performance of the different list types. It would be interesting to reevaluate their relative performance now that instructions are relatively cheaper, and memory access is relatively more expensive. Robert N M Watson Computer Laboratory University of Cambridge From owner-freebsd-arch@FreeBSD.ORG Sat Mar 17 06:52:53 2007 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.freebsd.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 5BB3C16A405 for ; Sat, 17 Mar 2007 06:52:53 +0000 (UTC) (envelope-from nate@root.org) Received: from root.org (root.org [67.118.192.226]) by mx1.freebsd.org (Postfix) with ESMTP id 42B7113C480 for ; Sat, 17 Mar 2007 06:52:53 +0000 (UTC) (envelope-from nate@root.org) Received: (qmail 9979 invoked from network); 17 Mar 2007 06:52:55 -0000 Received: from pm9-195.corp.redshift.com (HELO ?192.168.1.50?) (nate-mail@216.228.25.195) by root.org with ESMTPA; 17 Mar 2007 06:52:55 -0000 Message-ID: <45FB908A.6070006@root.org> Date: Fri, 16 Mar 2007 23:54:02 -0700 From: Nate Lawson User-Agent: Thunderbird 1.5.0.9 (X11/20070214) MIME-Version: 1.0 To: arch@freebsd.org X-Enigmail-Version: 0.94.2.0 Content-Type: multipart/mixed; boundary="------------010201050001060005050802" Cc: Subject: PATCH: cpu freq notifiers and eventhandler register from SYSINIT X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 17 Mar 2007 06:52:53 -0000 This is a multi-part message in MIME format. --------------010201050001060005050802 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Attached is an updated patch for handling cpu freq changes. It updates tsc_freq and all direct consumers of it. I also would like to add something to eventhandler.h. As I was doing this, I really wanted a way to declare an eventhandler tag and call eventhandler_register() from a SYSINIT. I ended up doing that manually in a lot of cases where I couldn't register up front because it was too early in boot (i.e. identcpu). Comments on adding this macro, similar to TASQUEUE_DEFINE in taskqueue.h? Maybe EVENTHANDLER_REGISTER_INIT()? I wasn't sure what to do for guprof since it is using TSC at times. I decided to just have it print a warning if the TSC freq changed while profiling was active. Let me know if I should do something else here. For altq, it should be correct but please check my change. I'm not sure what to do for packets that are being processed if machclk_freq changes. It might be ok. Thanks, -- Nate --------------010201050001060005050802 Content-Type: text/plain; name="cpu_event.diff" Content-Transfer-Encoding: 7bit Content-Disposition: inline; filename="cpu_event.diff" Index: sys/i386/i386/tsc.c =================================================================== RCS file: /home/ncvs/src/sys/i386/i386/tsc.c,v retrieving revision 1.206 diff -u -r1.206 tsc.c --- sys/i386/i386/tsc.c 4 Aug 2006 07:56:35 -0000 1.206 +++ sys/i386/i386/tsc.c 17 Mar 2007 01:25:14 -0000 @@ -30,6 +30,8 @@ #include "opt_clock.h" #include +#include +#include #include #include #include @@ -52,14 +54,20 @@ TUNABLE_INT("kern.timecounter.smp_tsc", &smp_tsc); #endif +static eventhandler_tag evh_pre_tag, evh_post_tag; + static unsigned tsc_get_timecount(struct timecounter *tc); +static void tsc_freq_changing(void *arg, const struct cf_level *level, + int *status); +static void tsc_freq_changed(void *arg, const struct cf_level *level, + int status); static struct timecounter tsc_timecounter = { tsc_get_timecount, /* get_timecount */ 0, /* no poll_pps */ - ~0u, /* counter_mask */ + ~0u, /* counter_mask */ 0, /* frequency */ - "TSC", /* name */ + "TSC", /* name */ 800, /* quality (adjusted in code) */ }; @@ -87,8 +95,13 @@ if (bootverbose) printf("TSC clock: %ju Hz\n", (intmax_t)tsc_freq); set_cputicker(rdtsc, tsc_freq, 1); -} + /* Register to find out about changes in CPU frequency. */ + evh_pre_tag = EVENTHANDLER_REGISTER(cpufreq_pre_change, + tsc_freq_changing, NULL, EVENTHANDLER_PRI_FIRST); + evh_post_tag = EVENTHANDLER_REGISTER(cpufreq_post_change, + tsc_freq_changed, NULL, EVENTHANDLER_PRI_FIRST); +} void init_TSC_tc(void) @@ -128,6 +141,38 @@ } } +/* + * If the TSC timecounter is in use, veto the pending change. It may be + * possible in the future to handle a dynamically-changing timecounter rate. + */ +static void +tsc_freq_changing(void *arg, const struct cf_level *level, int *status) +{ + static int once; + + if (*status == 0 && timecounter == &tsc_timecounter) { + if (!once) { + printf("timecounter TSC must not be in use when " + "changing frequencies; change denied\n"); + once = 1; + } + *status = EBUSY; + } +} + +/* Update TSC freq with the value indicated by the caller. */ +static void +tsc_freq_changed(void *arg, const struct cf_level *level, int status) +{ + /* If there was an error during the transition, don't do anything. */ + if (status != 0) + return; + + /* Total setting for this level gives the new frequency in MHz. */ + tsc_freq = level->total_set.freq * 1000000; + tsc_timecounter.tc_frequency = tsc_freq; +} + static int sysctl_machdep_tsc_freq(SYSCTL_HANDLER_ARGS) { Index: sys/i386/i386/identcpu.c =================================================================== RCS file: /home/ncvs/src/sys/i386/i386/identcpu.c,v retrieving revision 1.172 diff -u -r1.172 identcpu.c --- sys/i386/i386/identcpu.c 12 Mar 2007 20:27:21 -0000 1.172 +++ sys/i386/i386/identcpu.c 17 Mar 2007 06:30:34 -0000 @@ -45,6 +45,8 @@ #include #include +#include +#include #include #include #include @@ -101,6 +103,9 @@ &hw_clockrate, 0, "CPU instruction clock rate"); static char cpu_brand[48]; +static eventhandler_tag evh_post_tag; +static void tsc_freq_changed(void *arg, const struct cf_level *level, + int status); #define MAX_BRAND_INDEX 8 @@ -1077,6 +1082,29 @@ write_eflags(eflags); } +/* XXX Register with sysinit should be a macro in eventhandler.h */ +static void +ident_tsc_check(void *arg) +{ + + evh_post_tag = EVENTHANDLER_REGISTER(cpufreq_post_change, + tsc_freq_changed, NULL, EVENTHANDLER_PRI_ANY); +} +SYSINIT(ident_tsc_check, SI_SUB_CONFIGURE, SI_ORDER_ANY, ident_tsc_check, + NULL); + +/* Update TSC freq with the value indicated by the caller. */ +static void +tsc_freq_changed(void *arg, const struct cf_level *level, int status) +{ + /* If there was an error during the transition, don't do anything. */ + if (status != 0) + return; + + /* Total setting for this level gives the new frequency in MHz. */ + hw_clockrate = level->total_set.freq; +} + /* * Final stage of CPU identification. -- Should I check TI? */ Index: sys/kern/kern_cpu.c =================================================================== RCS file: /home/ncvs/src/sys/kern/kern_cpu.c,v retrieving revision 1.23 diff -u -r1.23 kern_cpu.c --- sys/kern/kern_cpu.c 3 Mar 2006 02:06:04 -0000 1.23 +++ sys/kern/kern_cpu.c 17 Mar 2007 01:23:40 -0000 @@ -1,5 +1,5 @@ /*- - * Copyright (c) 2004-2005 Nate Lawson (SDG) + * Copyright (c) 2004-2007 Nate Lawson (SDG) * All rights reserved. * * Redistribution and use in source and binary forms, with or without @@ -95,7 +95,6 @@ static int cpufreq_attach(device_t dev); static int cpufreq_detach(device_t dev); -static void cpufreq_evaluate(void *arg); static int cf_set_method(device_t dev, const struct cf_level *level, int priority); static int cf_get_method(device_t dev, struct cf_level *level); @@ -127,8 +126,6 @@ static devclass_t cpufreq_dc; DRIVER_MODULE(cpufreq, cpu, cpufreq_driver, cpufreq_dc, 0, 0); -static eventhandler_tag cf_ev_tag; - static int cf_lowest_freq; static int cf_verbose; TUNABLE_INT("debug.cpufreq.lowest", &cf_lowest_freq); @@ -176,8 +173,6 @@ SYSCTL_CHILDREN(device_get_sysctl_tree(parent)), OID_AUTO, "freq_levels", CTLTYPE_STRING | CTLFLAG_RD, sc, 0, cpufreq_levels_sysctl, "A", "CPU frequency levels"); - cf_ev_tag = EVENTHANDLER_REGISTER(cpufreq_changed, cpufreq_evaluate, - NULL, EVENTHANDLER_PRI_ANY); return (0); } @@ -202,18 +197,11 @@ numdevs = devclass_get_count(cpufreq_dc); if (numdevs == 1) { CF_DEBUG("final shutdown for %s\n", device_get_nameunit(dev)); - EVENTHANDLER_DEREGISTER(cpufreq_changed, cf_ev_tag); } return (0); } -static void -cpufreq_evaluate(void *arg) -{ - /* TODO: Re-evaluate when notified of changes to drivers. */ -} - static int cf_set_method(device_t dev, const struct cf_level *level, int priority) { @@ -222,26 +210,16 @@ struct cf_saved_freq *saved_freq, *curr_freq; struct pcpu *pc; int cpu_id, error, i; - static int once; sc = device_get_softc(dev); error = 0; set = NULL; saved_freq = NULL; - /* - * Check that the TSC isn't being used as a timecounter. - * If it is, then return EBUSY and refuse to change the - * clock speed. - */ - if (strcmp(timecounter->tc_name, "TSC") == 0) { - if (!once) { - printf("cpufreq: frequency change with timecounter" - " TSC not allowed, see cpufreq(4)\n"); - once = 1; - } - return (EBUSY); - } + /* We are going to change levels so notify the pre-change handler. */ + EVENTHANDLER_INVOKE(cpufreq_pre_change, level, &error); + if (error != 0) + return (error); CF_MTX_LOCK(&sc->lock); @@ -378,8 +356,15 @@ out: CF_MTX_UNLOCK(&sc->lock); + + /* + * We changed levels (or attempted to) so notify the post-change + * handler of new frequency or error. + */ + EVENTHANDLER_INVOKE(cpufreq_post_change, level, error); if (error && set) device_printf(set->dev, "set freq failed, err %d\n", error); + return (error); } Index: sys/sys/cpu.h =================================================================== RCS file: /home/ncvs/src/sys/sys/cpu.h,v retrieving revision 1.3 diff -u -r1.3 cpu.h --- sys/sys/cpu.h 19 Feb 2005 06:13:25 -0000 1.3 +++ sys/sys/cpu.h 17 Mar 2007 01:44:18 -0000 @@ -1,5 +1,5 @@ /*- - * Copyright (c) 2005 Nate Lawson (SDG) + * Copyright (c) 2005-2007 Nate Lawson (SDG) * All rights reserved. * * Redistribution and use in source and binary forms, with or without @@ -29,6 +29,8 @@ #ifndef _SYS_CPU_H_ #define _SYS_CPU_H_ +#include + /* * CPU device support. */ @@ -118,6 +120,12 @@ int cpufreq_register(device_t dev); int cpufreq_unregister(device_t dev); +/* Eventhandlers that are called before and after a change in frequency */ +typedef void (*cpufreq_pre_notify_fn)(void *, const struct cf_level *, int *); +typedef void (*cpufreq_post_notify_fn)(void *, const struct cf_level *, int); +EVENTHANDLER_DECLARE(cpufreq_pre_change, cpufreq_pre_notify_fn); +EVENTHANDLER_DECLARE(cpufreq_post_change, cpufreq_post_notify_fn); + /* Allow values to be +/- a bit since sometimes we have to estimate. */ #define CPUFREQ_CMP(x, y) (abs((x) - (y)) < 25) Index: sys/contrib/altq/altq/altq_subr.c =================================================================== RCS file: /home/ncvs/src/sys/contrib/altq/altq/altq_subr.c,v retrieving revision 1.8 diff -u -r1.8 altq_subr.c --- sys/contrib/altq/altq/altq_subr.c 2 Mar 2006 00:51:39 -0000 1.8 +++ sys/contrib/altq/altq/altq_subr.c 17 Mar 2007 02:10:01 -0000 @@ -74,6 +74,9 @@ #if __FreeBSD__ < 3 #include "opt_cpu.h" /* for FreeBSD-2.2.8 to get i586_ctr_freq */ #endif +#include +#include +#include #include #endif #if defined(__i386__) @@ -898,6 +901,22 @@ extern u_int64_t cpu_tsc_freq; #endif /* __alpha__ */ +#if defined(__FreeBSD__) && (__FreeBSD_version > 700030) +static eventhandler_tag evh_post_tag; + +/* Update TSC freq with the value indicated by the caller. */ +static void +tsc_freq_changed(void *arg, const struct cf_level *level, int status) +{ + /* If there was an error during the transition, don't do anything. */ + if (status != 0) + return; + + /* Total setting for this level gives the new frequency in MHz. */ + machclk_freq = level->total_set.freq * 1000000; +} +#endif /* __FreeBSD_version > 700030 */ + void init_machclk(void) { @@ -941,6 +960,11 @@ #ifdef __FreeBSD__ #if (__FreeBSD_version > 300000) machclk_freq = tsc_freq; +#if (__FreeBSD_version > 700030) + /* Register to see any changes in TSC freq. */ + evh_post_tag = EVENTHANDLER_REGISTER(cpufreq_post_change, + tsc_freq_changed, NULL, EVENTHANDLER_PRI_ANY); +#endif #else machclk_freq = i586_ctr_freq; #endif Index: sys/i386/isa/prof_machdep.c =================================================================== RCS file: /home/ncvs/src/sys/i386/isa/prof_machdep.c,v retrieving revision 1.29 diff -u -r1.29 prof_machdep.c --- sys/i386/isa/prof_machdep.c 29 Oct 2006 09:48:44 -0000 1.29 +++ sys/i386/isa/prof_machdep.c 17 Mar 2007 06:06:17 -0000 @@ -33,6 +33,9 @@ #include #include +#include +#include +#include #include #include #include @@ -50,6 +53,7 @@ int cputime_bias = 1; /* initialize for locality of reference */ +static int prof_active; static int cputime_clock = CPUTIME_CLOCK_UNINITIALIZED; #if defined(PERFMON) && defined(I586_PMC_GUPROF) static u_int cputime_clock_pmc_conf = I586_PMC_GUPROF; @@ -58,6 +62,10 @@ #endif #endif /* GUPROF */ +static eventhandler_tag evh_post_tag; +static void tsc_freq_changed(void *arg, const struct cf_level *level, + int status); + #ifdef __GNUCLIKE_ASM __asm(" \n\ GM_STATE = 0 \n\ @@ -287,7 +295,7 @@ if (cputime_clock == CPUTIME_CLOCK_UNINITIALIZED) { cputime_clock = CPUTIME_CLOCK_I8254; #if defined(I586_CPU) || defined(I686_CPU) - if (tsc_freq != 0 && !tsc_is_broken && mp_ncpus < 2) + if (tsc_freq != 0 && !tsc_is_broken && mp_ncpus == 1) cputime_clock = CPUTIME_CLOCK_TSC; #endif } @@ -322,6 +330,7 @@ } #endif /* PERFMON && I586_PMC_GUPROF */ #endif /* I586_CPU || I686_CPU */ + prof_active = 1; cputime_bias = 0; cputime(); } @@ -337,5 +346,32 @@ cputime_clock_pmc_init = FALSE; } #endif + prof_active = 0; +} + +/* XXX Register with sysinit should be a macro in eventhandler.h */ +static void +guprof_tsc_check(void *arg) +{ + + evh_post_tag = EVENTHANDLER_REGISTER(cpufreq_post_change, + tsc_freq_changed, NULL, EVENTHANDLER_PRI_ANY); +} +SYSINIT(guprof_tsc_check, SI_SUB_CONFIGURE, SI_ORDER_ANY, guprof_tsc_check, + NULL); + +/* If the cpu frequency changed while profiling, report a warning. */ +static void +tsc_freq_changed(void *arg, const struct cf_level *level, int status) +{ + static int once; + + /* If there was an error during the transition, don't do anything. */ + if (status != 0) + return; + if (prof_active && cputime_clock == CPUTIME_CLOCK_TSC && !once) { + printf("warning: cpu freq changed while profiling active\n"); + once = 1; + } } #endif /* GUPROF */ --------------010201050001060005050802--