From owner-freebsd-arch@FreeBSD.ORG Mon Nov 13 17:20:56 2006
Return-Path:
X-Original-To: arch@freebsd.org
Delivered-To: freebsd-arch@FreeBSD.ORG
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
        by hub.freebsd.org (Postfix) with ESMTP id 8D67116A403;
        Mon, 13 Nov 2006 17:20:56 +0000 (UTC)
        (envelope-from rodrigc@crodrigues.org)
Received: from rwcrmhc13.comcast.net (rwcrmhc13.comcast.net [204.127.192.83])
        by mx1.FreeBSD.org (Postfix) with ESMTP id 89C7843E87;
        Mon, 13 Nov 2006 17:15:33 +0000 (GMT)
        (envelope-from rodrigc@crodrigues.org)
Received: from dibbler.crodrigues.org
        (c-66-31-35-94.hsd1.ma.comcast.net[66.31.35.94])
        by comcast.net (rwcrmhc13) with ESMTP
        id <20061113171522m13004mm4ge>; Mon, 13 Nov 2006 17:15:22 +0000
Received: from dibbler.crodrigues.org (localhost.crodrigues.org [127.0.0.1])
        by dibbler.crodrigues.org (8.13.8/8.13.8) with ESMTP id kADHFXJj095392;
        Mon, 13 Nov 2006 12:15:33 -0500 (EST)
        (envelope-from rodrigc@c-66-31-35-94.hsd1.ma.comcast.net)
Received: (from rodrigc@localhost)
        by dibbler.crodrigues.org (8.13.8/8.13.8/Submit) id kADHFXcZ095391;
        Mon, 13 Nov 2006 12:15:33 -0500 (EST)
        (envelope-from rodrigc)
Date: Mon, 13 Nov 2006 12:15:32 -0500
From: Craig Rodrigues
To: Tom Rhodes
Message-ID: <20061113171532.GA95344@crodrigues.org>
References: <20061107091128.063d0ae5.trhodes@FreeBSD.org>
        <20061109220429.14b933dd.trhodes@FreeBSD.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20061109220429.14b933dd.trhodes@FreeBSD.org>
User-Agent: Mutt/1.4.2.1i
Cc: arch@freebsd.org, standards@freebsd.org
Subject: Re: New Patch [was: Re: cvs rm sys/posix4 && enable sem]
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussion related to FreeBSD architecture
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 13 Nov 2006 17:20:56 -0000

On Thu, Nov 09, 2006 at 10:04:29PM -0500, Tom Rhodes wrote:
> >
> > 1:
Repocopy posix4/* files to sys/sys and sys/kern;

There is a task on the C99 and POSIX Conformance project to do this,
but no one took this task on until you did:

http://www.freebsd.org/projects/c99/

Since you've done this, you might want to update the status of
this task.  The page is in CVS:
http://www.freebsd.org/cgi/cvsweb.cgi/www/en/projects/c99/

-- 
Craig Rodrigues
rodrigc@crodrigues.org

From owner-freebsd-arch@FreeBSD.ORG Mon Nov 13 20:55:22 2006
To: arch@freebsd.org
From: Poul-Henning Kamp
Date: Mon, 13 Nov 2006 20:53:41 +0000
Message-ID: <7105.1163451221@critter.freebsd.dk>
Subject: a proposed callout API

A number of problems have been identified with our current callout
code, and I have been thinking about them and have discussed various
aspects with people during the EuroBSDcon 2006 conference.
A lot of people are interested in this, so here is a quick sketch
of what I'm thinking about:

The Problems
------------

1. We need better resolution than a periodic "hz" clock can give us.
   Highspeed networking, gaming servers and other real-time apps want
   this.

2. We "pollute" our call-wheel with tons of callouts that we know are
   unlikely to happen.

3. We have many operations on the callout wheel because certain
   callouts get rearmed for later in the future (TCP keepalives).

4. We execute all callouts on one CPU only.

5. Most of the specified timeouts are bogus, because of the imprecision
   inherent in the current 1/hz method of scheduling them.

and a number of other issues.


The proposed API
----------------

tick_t XXX_ns_tick(unsigned nsec, unsigned *low, unsigned *high);
        Calculate the tick value for a given timeout.
        Optionally return theoretical lower and upper limits to
        actual value.

tick_t XXX_s_tick(unsigned seconds)
        Calculate the tick value for a given timeout.

The point behind these two functions is that we do not want to
incur a scaling operation at every arming of a callout.  Very
few callouts use varying timeouts (and for those, no avoidance
is possible), but for the rest, precalculating the correct
(opaque) number is a good optimization.

XXX_arm(struct xxx*, tick_t, func *, arg *, int flag, struct mtx *);
        Arm timer.
        Struct xxx must be zeroed before first call.

        If mtx pointer is non-NULL, acquire the mutex before calling
        the function.

        flags:
                XXX_REPEAT
                XXX_UNLIKELY

        Arm a callout with a number of optional behaviours specified.

XXX_rearm(struct xxx*, tick_t)
        Rearm timer.

XXX_disarm(struct xxx*)
        Unarm the timer.

XXX_drain(struct xxx*)
        Drain the timer.


The functions above will actually be wrappers for a more generic
set of the same family, which also takes a pointer to a callout-group.

This is so that we can have different groups of callouts, for
instance one group for the/each netstack and one for the disk-I/O
stuff etc.
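To make the precalculation idea concrete, here is a minimal sketch of
what the two conversion functions could look like on hardware with a
plain periodic clock.  Everything in it is an assumption made for
illustration: the sketch_ names, the 1000 Hz rate, tick_t being a bare
32 bit tick count, the extra tick added so that a callout never fires
early, and the meaning given to low/high (the earliest and latest delay,
in nanoseconds, at which the callout can fire).  The proposal itself
leaves tick_t opaque and does not define any of this.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: the proposal keeps tick_t opaque; here it is a tick count. */
typedef uint32_t tick_t;

/* Hypothetical periodic clock rate, not taken from the proposal. */
#define SKETCH_HZ       1000u
#define NSEC_PER_TICK   (1000000000u / SKETCH_HZ)

/* Saturate a 64 bit nanosecond value into an unsigned. */
static unsigned
sketch_clamp(uint64_t v)
{
        return (v > UINT_MAX ? UINT_MAX : (unsigned)v);
}

/*
 * Scale a nanosecond timeout to ticks once, ahead of time.  Round up and
 * add one tick, because the current tick is already partly used up when
 * the callout is armed.  With that choice the callout fires between
 * (ticks - 1) and ticks periods after arming, which is the window that
 * is reported through low/high (both optional).
 */
static tick_t
sketch_ns_tick(unsigned nsec, unsigned *low, unsigned *high)
{
        uint64_t ticks = ((uint64_t)nsec + NSEC_PER_TICK - 1) / NSEC_PER_TICK + 1;

        if (low != NULL)
                *low = sketch_clamp((ticks - 1) * NSEC_PER_TICK);
        if (high != NULL)
                *high = sketch_clamp(ticks * NSEC_PER_TICK);
        return ((tick_t)ticks);
}

/* Same idea for whole seconds; very large values would overflow tick_t. */
static tick_t
sketch_s_tick(unsigned seconds)
{
        return ((tick_t)seconds * SKETCH_HZ + 1);
}

/* Hypothetical callout record holding only an absolute strike time. */
struct sketch_callout {
        tick_t expires;
};

/* Arming is a single addition: no scaling happens on this path. */
static void
sketch_arm(struct sketch_callout *c, tick_t now, tick_t timeout)
{
        c->expires = now + timeout;
}
```

A caller with a fixed timeout would call sketch_ns_tick() once at
initialization, keep the result, and pass it to every later
sketch_arm(), so the division is paid once rather than on each arming.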
Implementation
--------------

Behind the scenes, we will have to support hardware that only
has a HZ style periodic interrupt but also hardware that
can do deadline interrupts (like HPET).

Short callouts, less than 2 seconds, will be stored in a binary
heap (a tree where a node is numerically lower than its children).

The depth of the heap is Log2(nodes) and there are very efficient
ways to store and access the heap.

Locking will be with one mutex for the heap.

The top element, which is always the next one to be executed,
will be left in place during execution, so that any rescheduling
(automatic or explicit) will only have to do as little work as
necessary to trickle it down to the right place in the heap.

Rescheduling a callout is a matter of trickling it up- or downwards
in the tree to a correct position.

For the long callouts, and for callouts unlikely to happen, a group
of lists is used for storage.

Imagine the number of lists is four; we then label them with Tnow+2s,
Tnow+8s, Tnow+32s and "the rest".

Armed callouts go into the first list that is labeled with a higher
strike time.

If a callout is rescheduled to later, its timeout is updated, but
it is not moved in the list.

If a callout is canceled, it is removed from the list.

Twice per second, the first list is scanned, and any due callouts
are called and any callouts later than the list's strike time are
moved into the right list.

When "Tnow+2s" rolls around, the lists are rotated to the left:
the Tnow+8s becomes "Tnow+2s" etc.

The idea behind this (untried) scheme for the long callouts is to
distribute the callouts somewhat evenly in the lists, while maintaining
only the relevant entries in the first list, the point being that
most of them (TCP keepalive, CAM etc) will never happen, so spending
time sorting them more than necessary is pointless.  Obviously,
this algorithm needs to be tested in practice and tuned/changed/discarded
depending on the results.

All numbers given are subject to tuning.
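The heap part of the description above can be sketched roughly as
follows.  This is only an illustration under assumptions of my own: the
struct and function names, the fixed capacity and the array-of-pointers
layout are hypothetical, tick wraparound is ignored, and locking (the
one mutex per heap) is left out.  The point it tries to show is that a
callout which stays in its slot can be rescheduled by trickling it up
or down from where it already is, rather than by a remove plus insert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEAP_MAX 64                     /* hypothetical fixed capacity */

struct co {                             /* hypothetical callout record */
        uint32_t when;                  /* strike time in ticks */
        size_t   idx;                   /* current slot in the heap array */
};

struct co_heap {
        struct co *slot[HEAP_MAX];      /* slot[0] is the next to strike */
        size_t     n;
};

static void
heap_place(struct co_heap *h, size_t i, struct co *c)
{
        h->slot[i] = c;
        c->idx = i;
}

/* Trickle the entry at i up while it strikes earlier than its parent. */
static void
heap_up(struct co_heap *h, size_t i)
{
        struct co *c = h->slot[i];

        while (i > 0) {
                size_t p = (i - 1) / 2;
                if (h->slot[p]->when <= c->when)
                        break;
                heap_place(h, i, h->slot[p]);
                i = p;
        }
        heap_place(h, i, c);
}

/* Trickle the entry at i down while one of its children strikes earlier. */
static void
heap_down(struct co_heap *h, size_t i)
{
        struct co *c = h->slot[i];

        for (;;) {
                size_t k = 2 * i + 1;
                if (k >= h->n)
                        break;
                if (k + 1 < h->n && h->slot[k + 1]->when < h->slot[k]->when)
                        k++;
                if (c->when <= h->slot[k]->when)
                        break;
                heap_place(h, i, h->slot[k]);
                i = k;
        }
        heap_place(h, i, c);
}

static int
heap_insert(struct co_heap *h, struct co *c)
{
        if (h->n == HEAP_MAX)
                return (-1);
        heap_place(h, h->n, c);
        h->n++;
        heap_up(h, c->idx);
        return (0);
}

/* Reschedule in place: only the path from the current slot is touched. */
static void
heap_resched(struct co_heap *h, struct co *c, uint32_t when)
{
        uint32_t old = c->when;

        c->when = when;
        if (when < old)
                heap_up(h, c->idx);
        else
                heap_down(h, c->idx);
}

/* Remove an arbitrary entry, for instance when a callout is disarmed. */
static void
heap_remove(struct co_heap *h, struct co *c)
{
        size_t i = c->idx;
        struct co *last = h->slot[--h->n];

        if (last != c) {
                heap_place(h, i, last);
                heap_up(h, i);
                heap_down(h, last->idx);
        }
}
```

In this sketch a repeating callout that has just run from slot[0] would
be handled with heap_resched() on the top element, which is the "leave
it in place and trickle it down" behaviour described above.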
-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From owner-freebsd-arch@FreeBSD.ORG Mon Nov 13 21:40:01 2006
Date: Mon, 13 Nov 2006 13:32:37 -0800
From: Luigi Rizzo
To: Poul-Henning Kamp
Message-ID: <20061113133236.A28926@xorpc.icir.org>
References: <7105.1163451221@critter.freebsd.dk>
In-Reply-To: <7105.1163451221@critter.freebsd.dk>;
        from phk@phk.freebsd.dk on Mon, Nov 13, 2006 at 08:53:41PM +0000
Cc: arch@freebsd.org
Subject: Re: a proposed callout API

On Mon, Nov 13, 2006 at 08:53:41PM +0000, Poul-Henning Kamp wrote:
>
> A number of problems have been identified with our current callout
> code and I have been thinking about and
discussed various aspects
> with people during the EuroBSDcon 2006 conference.
>
> A lot of people are interested in this, so here is a quick sketch
> of what I'm thinking about:
...

i am a bit curious about why you want to split the callouts among
multiple data structures.
Basically the heap that you suggest is a very nice and efficient
one (it is used in dummynet among other places) and has
O(log N) cost per insertion/deletion.
Say you have 64k callouts in total: even if you manage to trim
the number of events that you put in the heap to 128-256, you only
halve the (small) cost of updates to the heap, but the periodic
work for moving events from the extra queues towards the heap
can be large, and in particular you need to be careful and scatter it,
a bit at each tick, to avoid having to scan 10k+ entries at once.

        cheers
        luigi

> The Problems
> ------------
>
> 1. We need better resolution than a periodic "hz" clock can give us.
>    Highspeed networking, gaming servers and other real-time apps want
>    this.
>
> 2. We "pollute" our call-wheel with tons of callouts that we know are
>    unlikely to happen.
>
> 3. We have many operations on the callout wheel because certain
>    callouts get rearmed for later in the future (TCP keepalives).
>
> 4. We execute all callouts on one CPU only.
>
> 5. Most of the specified timeouts are bogus, because of the imprecision
>    inherent in the current 1/hz method of scheduling them.
>
> and a number of other issues.
>
>
> The proposed API
> ----------------
>
> tick_t XXX_ns_tick(unsigned nsec, unsigned *low, unsigned *high);
>         Calculate the tick value for a given timeout.
>         Optionally return theoretical lower and upper limits to
>         actual value.
>
> tick_t XXX_s_tick(unsigned seconds)
>         Calculate the tick value for a given timeout.
>
> The point behind these two functions is that we do not want to
> incur a scaling operation at every arming of a callout.  Very
> few callouts use varying timeouts (and for those, no avoidance
> is possible), but for the rest, precalculating the correct
> (opaque) number is a good optimization.
>
> XXX_arm(struct xxx*, tick_t, func *, arg *, int flag, struct mtx *);
>         Arm timer.
>         Struct xxx must be zeroed before first call.
>
>         If mtx pointer is non-NULL, acquire the mutex before calling
>         the function.
>
>         flags:
>                 XXX_REPEAT
>                 XXX_UNLIKELY
>
>         Arm a callout with a number of optional behaviours specified.
>
> XXX_rearm(struct xxx*, tick_t)
>         Rearm timer.
>
> XXX_disarm(struct xxx*)
>         Unarm the timer.
>
> XXX_drain(struct xxx*)
>         Drain the timer.
>
>
> The functions above will actually be wrappers for a more generic
> set of the same family, which also takes a pointer to a callout-group.
>
> This is so that we can have different groups of callouts, for
> instance one group for the/each netstack and one for the disk-I/O
> stuff etc.
>
>
> Implementation
> --------------
>
> Behind the scenes, we will have to support hardware that only
> has a HZ style periodic interrupt but also hardware that
> can do deadline interrupts (like HPET).
>
> Short callouts, less than 2 seconds, will be stored in a binary
> heap (a tree where a node is numerically lower than its children).
>
> The depth of the heap is Log2(nodes) and there are very efficient
> ways to store and access the heap.
>
> Locking will be with one mutex for the heap.
>
> The top element, which is always the next one to be executed,
> will be left in place during execution, so that any rescheduling
> (automatic or explicit) will only have to do as little work as
> necessary to trickle it down to the right place in the heap.
>
> Rescheduling a callout is a matter of trickling it up- or downwards
> in the tree to a correct position.
>
> For the long callouts, and for callouts unlikely to happen, a group
> of lists is used for storage.
>
> Imagine the number of lists is four; we then label them with Tnow+2s,
> Tnow+8s, Tnow+32s and "the rest".
>
> Armed callouts go into the first list that is labeled with a higher
> strike time.
>
> If a callout is rescheduled to later, its timeout is updated, but
> it is not moved in the list.
>
> If a callout is canceled, it is removed from the list.
>
> Twice per second, the first list is scanned, and any due callouts
> are called and any callouts later than the list's strike time are
> moved into the right list.
>
> When "Tnow+2s" rolls around, the lists are rotated to the left:
> the Tnow+8s becomes "Tnow+2s" etc.
>
> The idea behind this (untried) scheme for the long callouts is to
> distribute the callouts somewhat evenly in the lists, while maintaining
> only the relevant entries in the first list, the point being that
> most of them (TCP keepalive, CAM etc) will never happen, so spending
> time sorting them more than necessary is pointless.  Obviously,
> this algorithm needs to be tested in practice and tuned/changed/discarded
> depending on the results.
>
> All numbers given are subject to tuning.
>
> --
> Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
> phk@FreeBSD.ORG         | TCP/IP since RFC 956
> FreeBSD committer       | BSD since 4.3-tahoe
> Never attribute to malice what can adequately be explained by incompetence.

From owner-freebsd-arch@FreeBSD.ORG Mon Nov 13 21:43:35 2006
Date: Mon, 13 Nov 2006 13:37:55 -0800
From: Colin Percival
In-reply-to: <7105.1163451221@critter.freebsd.dk>
To: Poul-Henning Kamp
Message-id: <4558E5B3.1000003@freebsd.org>
References: <7105.1163451221@critter.freebsd.dk>
Cc: arch@freebsd.org
Subject: Re: a proposed callout API

Poul-Henning Kamp wrote:
> XXX_arm(struct xxx*, tick_t, func *, arg *, int flag, struct mtx *);
>         Arm timer.

If we (meaning you) are going to redesign the callout code, I think it
would be great if the API provided some mechanism for specifying the
required callback accuracy; for example "I'd like to be called back no
later than 3 seconds from now, but any time after 2 seconds would be
fine".
This would allow more callbacks to be performed during each wakeup
of the softclock thread, thereby amortizing the context switch overhead
and increasing the average time for which an otherwise idle CPU can sleep.

Colin Percival

From owner-freebsd-arch@FreeBSD.ORG Mon Nov 13 21:44:27 2006
To: Luigi Rizzo
From: "Poul-Henning Kamp"
In-Reply-To: Your message of "Mon, 13 Nov 2006 13:32:37 PST."
        <20061113133236.A28926@xorpc.icir.org>
Date: Mon, 13 Nov 2006 21:38:21 +0000
Message-ID: <7327.1163453901@critter.freebsd.dk>
Cc: arch@freebsd.org
Subject: Re: a proposed callout API

In message <20061113133236.A28926@xorpc.icir.org>, Luigi Rizzo writes:
>On Mon, Nov 13, 2006 at 08:53:41PM +0000, Poul-Henning Kamp wrote:
>>
>i am a bit curious on why you want to split the callouts among
>multiple data structures.

A binary heap is optimal for the timeouts that will happen, but
filling it up with timeouts that are unlikely to happen, and that in
most cases won't happen for a very long time, will soak up CPU
time pointlessly ordering the heap.

Also, many of the "non-happening" timeouts are repeatedly rescheduled
(the TCP keepalives, for instance); having a data structure where
this is free of cost is a big advantage.

The other thing is that covering the entire range from hour long
callouts to nanosecond callouts would require a 64 bit value
or a tricky pseudo-FP encoding.  By splitting them in two classes,
I can use two different 31 bit encodings separated by the top bit.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From owner-freebsd-arch@FreeBSD.ORG Mon Nov 13 21:47:35 2006
To: Colin Percival
From: "Poul-Henning Kamp"
In-Reply-To: Your message of "Mon, 13 Nov 2006 13:37:55 PST."
        <4558E5B3.1000003@freebsd.org>
Date: Mon, 13 Nov 2006 21:45:34 +0000
Message-ID: <7378.1163454334@critter.freebsd.dk>
Cc: arch@freebsd.org
Subject: Re: a proposed callout API

In message <4558E5B3.1000003@freebsd.org>, Colin Percival writes:
>Poul-Henning Kamp wrote:
>> XXX_arm(struct xxx*, tick_t, func *, arg *, int flag, struct mtx *);
>>         Arm timer.
>
>If we (meaning you) are going to redesign the callout code, I think it
>would be great if the API provided some mechanism for specifying the
>required callback accuracy; for example "I'd like to be called back no
>later than 3 seconds from now, but any time after 2 seconds would be
>fine".

I thought about something like that, but I fear that the extra
math will soak up any advantage there might be for short callouts.

For long timeouts, there could be some merit to the argument, if
we didn't have tons of short timeouts happening all the time to
begin with.

But do notice that I plan to run the "slow" callouts only twice per
second (to minimize rounding errors).  That could actually be a
sysctl, so that one could increase the granularity of their execution
if so desired.

I have added a facility that points the other way: You can get a
window in which you should expect your (short) callout to happen.

-- 
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.

From owner-freebsd-arch@FreeBSD.ORG Mon Nov 13 22:49:09 2006
Date: Mon, 13 Nov 2006 17:48:15 -0500
From: Tom Rhodes
To: Craig Rodrigues
Message-Id: <20061113174815.5f00464a.trhodes@FreeBSD.org>
In-Reply-To: <20061113171532.GA95344@crodrigues.org>
References: <20061107091128.063d0ae5.trhodes@FreeBSD.org>
        <20061109220429.14b933dd.trhodes@FreeBSD.org>
        <20061113171532.GA95344@crodrigues.org>
Cc: trhodes@FreeBSD.org, standards@FreeBSD.org, arch@FreeBSD.org
Subject: Re: New Patch [was: Re: cvs rm sys/posix4 && enable sem]

On Mon, 13 Nov 2006 12:15:32 -0500
Craig Rodrigues wrote:

> On Thu, Nov 09, 2006 at 10:04:29PM -0500, Tom Rhodes wrote:
> > >
> > > 1: Repocopy posix4/* files to sys/sys and sys/kern;
>
> There is a task on the C99 and POSIX Conformance
project
> to do this, but no one took this task on until you did:
>
> http://www.freebsd.org/projects/c99/
>
> Since you've done this, you might want to update the status of
> this task.  The page is in CVS:
> http://www.freebsd.org/cgi/cvsweb.cgi/www/en/projects/c99/

Yep, done, thanks!

-- 
Tom Rhodes

From owner-freebsd-arch@FreeBSD.ORG Mon Nov 13 23:07:20 2006
Date: Mon, 13 Nov 2006 15:04:55 -0800
From: John-Mark Gurney
To: Poul-Henning Kamp
Message-ID: <20061113230455.GP9291@funkthat.com>
References: <7105.1163451221@critter.freebsd.dk>
In-Reply-To: <7105.1163451221@critter.freebsd.dk>
Cc: arch@freebsd.org
Subject: Re: a proposed callout API

Poul-Henning Kamp wrote this message on Mon, Nov 13, 2006 at 20:53 +0000:
> Imagine the number of lists is four; we then label them with Tnow+2s,
> Tnow+8s, Tnow+32s and "the rest".
>
> Armed callouts go into the first list that is labeled with a higher
> strike time.
>
> If a callout is rescheduled to later, its timeout is updated, but
> it is not moved in the list.
>
> If a callout is canceled, it is removed from the list.
>
> Twice per second, the first list is scanned, and any due callouts
> are called and any callouts later than the list's strike time are
> moved into the right list.
>
> When "Tnow+2s" rolls around, the lists are rotated to the left:
> the Tnow+8s becomes "Tnow+2s" etc.
>
> The idea behind this (untried) scheme for the long callouts is to
> distribute the callouts somewhat evenly in the lists, while maintaining
> only the relevant entries in the first list, the point being that
> most of them (TCP keepalive, CAM etc) will never happen, so spending
> time sorting them more than necessary is pointless.  Obviously,
> this algorithm needs to be tested in practice and tuned/changed/discarded
> depending on the results.

The other option is to use a Fibonacci heap for these lists...  It
brings features that we want to this problem...  check out:
http://resnet.uoregon.edu/~gurney_j/jmpc/fib.html

Hmmm... even though my page says extract out of order is O(lg n), that
is after an extract min does the rebalancing, and happens after many
extracts have happened... (you can extract the first one out of order
in O(1), but a future extract will require lg n work)...

Though w/ a large list the first extract min is pretty expensive (as
it pays for all the O(1) inserts that haven't been extracted yet)...

-- 
  John-Mark Gurney                              Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."

From owner-freebsd-arch@FreeBSD.ORG Tue Nov 14 07:43:06 2006
Date: Mon, 13 Nov 2006 23:43:05 -0800
From: Luigi Rizzo
To: Poul-Henning Kamp
Message-ID: <20061113234305.A34147@xorpc.icir.org>
References: <20061113133236.A28926@xorpc.icir.org>
        <7327.1163453901@critter.freebsd.dk>
In-Reply-To: <7327.1163453901@critter.freebsd.dk>;
        from phk@phk.freebsd.dk on Mon, Nov 13, 2006 at 09:38:21PM +0000
Cc: arch@freebsd.org
Subject: Re: a proposed callout API

On Mon, Nov 13, 2006 at 09:38:21PM +0000, Poul-Henning Kamp wrote:
> In message <20061113133236.A28926@xorpc.icir.org>, Luigi Rizzo writes:
> >On Mon, Nov 13, 2006 at 08:53:41PM +0000, Poul-Henning Kamp wrote:
> >>
>
> >i am a bit curious on why
you want to split the callouts among
> >multiple data structures.
>
> A binary heap is optimal for the timeouts that will happen, but
> filling it up with timeouts that are unlikely to, and in most
> cases won't happen for a very long time will soak up CPU
> time used for pointlessly ordering the heap.

that's only one side - you are still paying, on each entry, the
cost of the periodic scanning of the list, and the cpu burstiness
of these scanning operations is harmful for systems with
pseudo-real-time requirements (and the workarounds complicate
operations).

To make a proper evaluation i would need some idea of the number
and distribution of scheduled events on a busy box, and of the
frequency of insertions/removals, which i don't know.

But just to make an example, say you have a total of 1000
insertions/deletions per second, 64k total events.

Using a single heap, that's 16k operations per second
(log2(64k)=16 is the cost of each insert/remove).

Using a small 1k heap (log2(1k)=10 is the cost of each operation),
scanning the list once per second gives you 64k operations
per second, plus the heap manipulation (but maybe that's in
the noise if, say, only 10% of the inserts/removes go there).

Sure, if you make sure the short-term list has very few entries
the scanning costs go down, but how much really depends
on your stats.

Note, i am not opposed to separating the heaps, i fully buy your
arguments on reducing the rescheduling costs and the representation
of the intervals on a smaller number of bits.

However, i would try to use a more efficient data structure for
the long-term entries, eg. another heap with coarser (say 1s)
granularity.
You can insert fake events for the first 100..1000 seconds and
have a small array of pointers to those entries, so
inserting/rescheduling an event within the next 100..1000 second
window is O(1), and when the next event fires (basically in the next
second) you just have to move a sublist to the main heap, without
having to touch all the other entries.

cheers
luigi

From owner-freebsd-arch@FreeBSD.ORG Tue Nov 14 07:46:01 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3F72616A40F for ; Tue, 14 Nov 2006 07:46:01 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9B76043D5C for ; Tue, 14 Nov 2006 07:46:00 +0000 (GMT) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (critter.freebsd.dk [192.168.48.2]) by phk.freebsd.dk (Postfix) with ESMTP id 34418170C5; Tue, 14 Nov 2006 07:45:59 +0000 (UTC) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.13.8/8.13.8) with ESMTP id kAE7jwrH009485; Tue, 14 Nov 2006 07:45:58 GMT (envelope-from phk@critter.freebsd.dk) To: John-Mark Gurney From: "Poul-Henning Kamp" In-Reply-To: Your message of "Mon, 13 Nov 2006 15:04:55 PST." 
<20061113230455.GP9291@funkthat.com> Date: Tue, 14 Nov 2006 07:45:58 +0000 Message-ID: <9484.1163490358@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: arch@freebsd.org Subject: Re: a proposed callout API X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Nov 2006 07:46:01 -0000 In message <20061113230455.GP9291@funkthat.com>, John-Mark Gurney writes: >The other option is to use a fibonacci heap for these lists... Yes, there are other candidate structures, but I really like that the binary heap can be implemented with only a pointer and an integer per node. I have no doubt that the future will bring changes to the implementation of the stuff, that's why I'm currently focusing mostly on the API so that we don't have to change that every five years. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. 
From owner-freebsd-arch@FreeBSD.ORG Tue Nov 14 08:40:40 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 236A616A407 for ; Tue, 14 Nov 2006 08:40:40 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.FreeBSD.org (Postfix) with ESMTP id B48E943D46 for ; Tue, 14 Nov 2006 08:40:39 +0000 (GMT) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (critter.freebsd.dk [192.168.48.2]) by phk.freebsd.dk (Postfix) with ESMTP id 556D9170C8; Tue, 14 Nov 2006 08:40:38 +0000 (UTC) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.13.8/8.13.8) with ESMTP id kAE8ebYT009675; Tue, 14 Nov 2006 08:40:38 GMT (envelope-from phk@critter.freebsd.dk) To: Luigi Rizzo From: "Poul-Henning Kamp" In-Reply-To: Your message of "Mon, 13 Nov 2006 23:43:05 PST." <20061113234305.A34147@xorpc.icir.org> Date: Tue, 14 Nov 2006 08:40:37 +0000 Message-ID: <9674.1163493637@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: arch@freebsd.org Subject: Re: a proposed callout API X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Nov 2006 08:40:40 -0000 In message <20061113234305.A34147@xorpc.icir.org>, Luigi Rizzo writes: >To make a proper evaluation i would need some idea of the number >and distribution of scheduled events on a busy box [...] So do I. What is important right now however, is the API. The implementation behind it we can change every week if we want, but the API affects far too many kernel files to get it wrong. 
-- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From owner-freebsd-arch@FreeBSD.ORG Tue Nov 14 11:56:48 2006 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2247416A415 for ; Tue, 14 Nov 2006 11:56:48 +0000 (UTC) (envelope-from freebsd-arch@m.gmane.org) Received: from ciao.gmane.org (main.gmane.org [80.91.229.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id 879C543D46 for ; Tue, 14 Nov 2006 11:56:47 +0000 (GMT) (envelope-from freebsd-arch@m.gmane.org) Received: from list by ciao.gmane.org with local (Exim 4.43) id 1Gjwtu-0002h0-5T for freebsd-arch@freebsd.org; Tue, 14 Nov 2006 12:56:26 +0100 Received: from lara.cc.fer.hr ([161.53.72.113]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Tue, 14 Nov 2006 12:56:26 +0100 Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Tue, 14 Nov 2006 12:56:26 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-arch@freebsd.org From: Ivan Voras Date: Tue, 14 Nov 2006 12:55:57 +0100 Lines: 16 Message-ID: References: <7105.1163451221@critter.freebsd.dk> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@sea.gmane.org X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr User-Agent: Thunderbird 1.5.0.4 (X11/20060625) In-Reply-To: <7105.1163451221@critter.freebsd.dk> Sender: news Subject: Re: a proposed callout API X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Nov 2006 11:56:48 -0000 Poul-Henning Kamp 
wrote: ... > 4. We execute all callouts on one CPU only. ... > Short callouts, less than 2 seconds, will be stored in a binary > heap (A tree where a node is numerically lower than its parents.) > > The depth of the heap is Log2(nodes) and there are very efficient > ways to store and access the heap. > > Locking will be with one mutex for the heap. Won't that retain the 1-cpu-only "feature"? Maybe make NCPU or NCPU/2 groups and fill them round-robin? From owner-freebsd-arch@FreeBSD.ORG Tue Nov 14 12:03:47 2006 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 0503416A412 for ; Tue, 14 Nov 2006 12:03:47 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9551743D55 for ; Tue, 14 Nov 2006 12:03:46 +0000 (GMT) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (critter.freebsd.dk [192.168.48.2]) by phk.freebsd.dk (Postfix) with ESMTP id 1E03D170C0; Tue, 14 Nov 2006 12:03:45 +0000 (UTC) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.13.8/8.13.8) with ESMTP id kAEC3i7R010667; Tue, 14 Nov 2006 12:03:44 GMT (envelope-from phk@critter.freebsd.dk) To: Ivan Voras From: "Poul-Henning Kamp" In-Reply-To: Your message of "Tue, 14 Nov 2006 12:55:57 +0100." Date: Tue, 14 Nov 2006 12:03:44 +0000 Message-ID: <10666.1163505824@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: freebsd-arch@freebsd.org Subject: Re: a proposed callout API X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Nov 2006 12:03:47 -0000 In message , Ivan Voras writes: >Poul-Henning Kamp wrote: > >... >> 4. 
We execute all callouts on one CPU only. >... >> Short callouts, less than 2 seconds, will be stored in a binary >> heap (A tree where a node is numerically lower than its parents.) >> >> The depth of the heap is Log2(nodes) and there are very efficient >> ways to store and access the heap. >> >> Locking will be with one mutex for the heap. > >Won't that retain the 1-cpu-only "feature"? No, that will be one lock per callout-group, however we decide to use those. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From owner-freebsd-arch@FreeBSD.ORG Tue Nov 14 12:13:21 2006 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 371C616A47B for ; Tue, 14 Nov 2006 12:13:21 +0000 (UTC) (envelope-from freebsd-arch@m.gmane.org) Received: from ciao.gmane.org (main.gmane.org [80.91.229.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id BAE1943D49 for ; Tue, 14 Nov 2006 12:13:20 +0000 (GMT) (envelope-from freebsd-arch@m.gmane.org) Received: from list by ciao.gmane.org with local (Exim 4.43) id 1GjxAE-0006NT-5z for freebsd-arch@freebsd.org; Tue, 14 Nov 2006 13:13:18 +0100 Received: from lara.cc.fer.hr ([161.53.72.113]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Tue, 14 Nov 2006 13:13:18 +0100 Received: from ivoras by lara.cc.fer.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Tue, 14 Nov 2006 13:13:18 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-arch@freebsd.org From: Ivan Voras Date: Tue, 14 Nov 2006 13:13:09 +0100 Lines: 6 Message-ID: References: <10666.1163505824@critter.freebsd.dk> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Complaints-To: 
usenet@sea.gmane.org X-Gmane-NNTP-Posting-Host: lara.cc.fer.hr User-Agent: Thunderbird 1.5.0.4 (X11/20060625) In-Reply-To: <10666.1163505824@critter.freebsd.dk> Sender: news Subject: Re: a proposed callout API X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Nov 2006 12:13:21 -0000 Poul-Henning Kamp wrote: > No, that will be one lock per callout-group, however we decide > to use those. Ok, so there will be multiple "short-time" heaps? From owner-freebsd-arch@FreeBSD.ORG Tue Nov 14 15:08:26 2006 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id B0B6E16A407 for ; Tue, 14 Nov 2006 15:08:26 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.FreeBSD.org (Postfix) with ESMTP id 3ED6643D6E for ; Tue, 14 Nov 2006 15:08:14 +0000 (GMT) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (critter.freebsd.dk [192.168.48.2]) by phk.freebsd.dk (Postfix) with ESMTP id 8998B170C0; Tue, 14 Nov 2006 15:08:13 +0000 (UTC) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.13.8/8.13.8) with ESMTP id kAEF8D91011611; Tue, 14 Nov 2006 15:08:13 GMT (envelope-from phk@critter.freebsd.dk) To: Ivan Voras From: "Poul-Henning Kamp" In-Reply-To: Your message of "Tue, 14 Nov 2006 13:13:09 +0100." 
Date: Tue, 14 Nov 2006 15:08:13 +0000 Message-ID: <11610.1163516893@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: freebsd-arch@freebsd.org Subject: Re: a proposed callout API X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Nov 2006 15:08:26 -0000 In message , Ivan Voras writes: >Poul-Henning Kamp wrote: > >> No, that will be one lock per callout-group, however we decide >> to use those. > >Ok, so there will be multiple "short-time" heaps? There _may_ be multiple groups, each consisting of a short-time heap and long-time lists. (Or however the data structures end up) -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From owner-freebsd-arch@FreeBSD.ORG Tue Nov 14 15:12:13 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 94B2C16A40F for ; Tue, 14 Nov 2006 15:12:13 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id BF37443DAE for ; Tue, 14 Nov 2006 15:11:21 +0000 (GMT) (envelope-from andre@freebsd.org) Received: (qmail 5512 invoked from network); 14 Nov 2006 15:04:01 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 14 Nov 2006 15:04:01 -0000 Message-ID: <4559DC98.8030103@freebsd.org> Date: Tue, 14 Nov 2006 16:11:20 +0100 From: Andre Oppermann User-Agent: Thunderbird 1.5.0.8 (Windows/20061025) MIME-Version: 1.0 To: Poul-Henning Kamp , arch@freebsd.org References: 
<7105.1163451221@critter.freebsd.dk> <20061113230455.GP9291@funkthat.com> In-Reply-To: <20061113230455.GP9291@funkthat.com> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: Subject: Re: a proposed callout API X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Nov 2006 15:12:13 -0000

John-Mark Gurney wrote:
> Poul-Henning Kamp wrote this message on Mon, Nov 13, 2006 at 20:53 +0000:
>> Imagine the number of lists is four, we then label them with Tnow+2s,
>> Tnow+8s, Tnow+32s and "the rest".
>>
>> Armed callouts go into the first list that is labeled with a higher
>> strike time.
>>
>> If a callout is rescheduled to later, its timeout is updated, but
>> it is not moved in the list.
>>
>> If a callout is canceled, it is removed from the list.
>>
>> Twice per second, the first list is scanned, and any due callouts
>> are called and any callouts later than the list's strike time are
>> moved into the right list.
>>
>> When "Tnow+2s" rolls around, the lists are rotated to the left:
>> the Tnow+8s becomes "Tnow+2s" etc.
>>
>> The idea behind this (untried) scheme for the long callouts, is to
>> distribute the callouts somewhat evenly in the lists, while maintaining
>> only the relevant entries in the first list, the point being that
>> most of them (TCP keepalive, CAM etc) will never happen, so spending
>> time sorting them more than necessary is pointless. Obviously,
>> this algorithm needs to be tested in practice and tuned/changed/discarded
>> depending on the results.
>
> The other option is to use a Fibonacci heap for these lists... It
> brings features that we want to this problem.. check out:
> http://resnet.uoregon.edu/~gurney_j/jmpc/fib.html
>
> Hmmm...
even though my page says extract out of order is O(lgn), that
> is after an extract min does the rebalancing, and happens after many
> extracts have happened... (you can OO extract the first one in O(1),
> but a future extract will require lgn work)...
>
> Though w/ a large list the first extract min is pretty expensive (as
> it pays for all the O(1) inserts that haven't been extracted yet)...

It's important to know that random memory accesses on modern
CPUs are really expensive because of cache misses. That's why
Judy tries beat RB trees by an order of magnitude these days.

--
Andre

From owner-freebsd-arch@FreeBSD.ORG Tue Nov 14 15:27:01 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3173B16A4ED for ; Tue, 14 Nov 2006 15:27:01 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id 6159F43E0E for ; Tue, 14 Nov 2006 15:25:58 +0000 (GMT) (envelope-from andre@freebsd.org) Received: (qmail 5669 invoked from network); 14 Nov 2006 15:18:38 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 14 Nov 2006 15:18:38 -0000 Message-ID: <4559E004.6050204@freebsd.org> Date: Tue, 14 Nov 2006 16:25:56 +0100 From: Andre Oppermann User-Agent: Thunderbird 1.5.0.8 (Windows/20061025) MIME-Version: 1.0 To: Poul-Henning Kamp References: <7105.1163451221@critter.freebsd.dk> In-Reply-To: <7105.1163451221@critter.freebsd.dk> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: arch@freebsd.org Subject: Re: a proposed callout API X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: 
List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Nov 2006 15:27:01 -0000

Poul-Henning Kamp wrote:
> The idea behind this (untried) scheme for the long callouts, is to
> distribute the callouts somewhat evenly in the lists, while maintaining
> only the relevant entries in the first list, the point being that
> most of them (TCP keepalive, CAM etc) will never happen, so spending
> time sorting them more than necessary is pointless. Obviously,
> this algorithm needs to be tested in practice and tuned/changed/discarded
> depending on the results.

TCP maintains a number of timers per connection of which most never
ever fire. I have working code that manages the timers within TCP
and registers only the next upcoming one with the callout mechanism.
This avoids a lot of pointless churn in the callout wheels. Nonetheless
one callout is disabled/rearmed per packet sent/received. A majority
of those again will never fire and instead rearm on the next packet.
A non-profiled guess makes TCP the #1 churner in the whole callout
mechanism.

Firing TCP callouts may do significant work which could easily be run
in parallel for any number of TCP sessions. For TCP we may want to
insert an abstraction where the callout mechanism kicks a taskqueue
instead of running the callout itself. There we may have a number
(=ncpu) of worker threads servicing this taskqueue.

In TCP we also may hit a number of cases where the callout fires at
the time a packet for that connection comes in. If the callout finds
the tcpcb already locked it simply may discard this callout right away
instead of (busy) waiting as all timer stuff and rearming will already
happen while it is locked.

However, keep in mind that these are all educated guesses, not hard
facts from profiling on real machines.
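The idea of keeping the per-connection timers inside TCP and
registering only the earliest one with the callout mechanism might
look roughly like the sketch below. It is not the working code
mentioned above; every name in it (struct conn_timers, the T_*
constants, callout_register) is hypothetical, and callout_register
merely records the request where a real implementation would call
into the callout layer.

```c
#include <stdint.h>

/* Hypothetical per-connection timer slots; names are illustrative only. */
enum { T_REXMT, T_PERSIST, T_KEEP, T_2MSL, T_NTIMERS };

#define T_IDLE UINT64_MAX       /* marks a timer that is not running */

struct conn_timers {
    uint64_t deadline[T_NTIMERS]; /* absolute expiry per timer, or T_IDLE */
    uint64_t armed;               /* deadline registered with the callout layer */
    unsigned rearms;              /* how often the callout layer was touched */
};

/* Earliest running deadline of the connection, or T_IDLE if none. */
static uint64_t next_deadline(const struct conn_timers *ct)
{
    uint64_t best = T_IDLE;
    for (int i = 0; i < T_NTIMERS; i++)
        if (ct->deadline[i] < best)
            best = ct->deadline[i];
    return best;
}

/* Stand-in (assumption) for arming or disarming the single real callout. */
static void callout_register(struct conn_timers *ct, uint64_t when)
{
    ct->armed = when;
    ct->rearms++;
}

void conn_timers_init(struct conn_timers *ct)
{
    for (int i = 0; i < T_NTIMERS; i++)
        ct->deadline[i] = T_IDLE;
    ct->armed = T_IDLE;
    ct->rearms = 0;
}

/* Start, restart or (with when == T_IDLE) stop one timer. The callout
 * layer is only touched if the earliest deadline actually changes. */
void conn_timer_set(struct conn_timers *ct, int which, uint64_t when)
{
    ct->deadline[which] = when;
    uint64_t next = next_deadline(ct);
    if (next != ct->armed)
        callout_register(ct, next);
}

/* Called when the single callout fires: returns a bitmask of the timers
 * that are due at 'now', stops them, and registers the next deadline. */
unsigned conn_timers_fire(struct conn_timers *ct, uint64_t now)
{
    unsigned due = 0;
    for (int i = 0; i < T_NTIMERS; i++) {
        if (ct->deadline[i] != T_IDLE && ct->deadline[i] <= now) {
            due |= 1u << i;
            ct->deadline[i] = T_IDLE;
        }
    }
    callout_register(ct, next_deadline(ct));
    return due;
}
```

Setting a timer that expires later than the one already registered
only updates the per-connection array, so the callout layer sees no
churn from it.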
-- Andre From owner-freebsd-arch@FreeBSD.ORG Tue Nov 14 15:27:42 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 2A4DF16A403; Tue, 14 Nov 2006 15:27:42 +0000 (UTC) (envelope-from rizzo@icir.org) Received: from xorpc.icir.org (xorpc.icir.org [192.150.187.68]) by mx1.FreeBSD.org (Postfix) with ESMTP id 02FA943D58; Tue, 14 Nov 2006 15:27:11 +0000 (GMT) (envelope-from rizzo@icir.org) Received: from xorpc.icir.org (localhost [127.0.0.1]) by xorpc.icir.org (8.12.11/8.13.6) with ESMTP id kAEFR3fq041022; Tue, 14 Nov 2006 07:27:03 -0800 (PST) (envelope-from rizzo@xorpc.icir.org) Received: (from rizzo@localhost) by xorpc.icir.org (8.12.11/8.12.3/Submit) id kAEFR334041021; Tue, 14 Nov 2006 07:27:03 -0800 (PST) (envelope-from rizzo) Date: Tue, 14 Nov 2006 07:27:03 -0800 From: Luigi Rizzo To: Andre Oppermann Message-ID: <20061114072703.A40467@xorpc.icir.org> References: <7105.1163451221@critter.freebsd.dk> <20061113230455.GP9291@funkthat.com> <4559DC98.8030103@freebsd.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline User-Agent: Mutt/1.2.5.1i In-Reply-To: <4559DC98.8030103@freebsd.org>; from andre@freebsd.org on Tue, Nov 14, 2006 at 04:11:20PM +0100 Cc: arch@freebsd.org, Poul-Henning Kamp Subject: Re: a proposed callout API X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Nov 2006 15:27:42 -0000 On Tue, Nov 14, 2006 at 04:11:20PM +0100, Andre Oppermann wrote: ... > It's important to know that any random memory accesses on modern > CPUs are really expensive because of cache misses. That's why > Judy tries beat RB tries by an order of a magnitude these days. you mean this stuff ? 
http://docs.hp.com/en/B6841-90001/ch02s01.html http://judy.sourceforge.net/ cheers luigi From owner-freebsd-arch@FreeBSD.ORG Tue Nov 14 15:38:51 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 7B82A16A47B for ; Tue, 14 Nov 2006 15:38:51 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id 32EA643D9C for ; Tue, 14 Nov 2006 15:38:42 +0000 (GMT) (envelope-from andre@freebsd.org) Received: (qmail 5822 invoked from network); 14 Nov 2006 15:31:22 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 14 Nov 2006 15:31:22 -0000 Message-ID: <4559E301.2030607@freebsd.org> Date: Tue, 14 Nov 2006 16:38:41 +0100 From: Andre Oppermann User-Agent: Thunderbird 1.5.0.8 (Windows/20061025) MIME-Version: 1.0 To: Luigi Rizzo References: <7105.1163451221@critter.freebsd.dk> <20061113230455.GP9291@funkthat.com> <4559DC98.8030103@freebsd.org> <20061114072703.A40467@xorpc.icir.org> In-Reply-To: <20061114072703.A40467@xorpc.icir.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: arch@freebsd.org, Poul-Henning Kamp Subject: Re: a proposed callout API X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Nov 2006 15:38:51 -0000 Luigi Rizzo wrote: > On Tue, Nov 14, 2006 at 04:11:20PM +0100, Andre Oppermann wrote: > ... >> It's important to know that any random memory accesses on modern >> CPUs are really expensive because of cache misses. That's why >> Judy tries beat RB tries by an order of a magnitude these days. 
> > you mean this stuff ? > > http://docs.hp.com/en/B6841-90001/ch02s01.html > http://judy.sourceforge.net/ We've used it a number of other projects and it beats everything else hands down in speed and memory consumption. -- Andre From owner-freebsd-arch@FreeBSD.ORG Tue Nov 14 15:46:19 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 974A616A5C0; Tue, 14 Nov 2006 15:46:19 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.FreeBSD.org (Postfix) with ESMTP id 303FD43D55; Tue, 14 Nov 2006 15:46:14 +0000 (GMT) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (critter.freebsd.dk [192.168.48.2]) by phk.freebsd.dk (Postfix) with ESMTP id B3B61170C0; Tue, 14 Nov 2006 15:46:12 +0000 (UTC) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.13.8/8.13.8) with ESMTP id kAEFkCc7011833; Tue, 14 Nov 2006 15:46:12 GMT (envelope-from phk@critter.freebsd.dk) To: Andre Oppermann From: "Poul-Henning Kamp" In-Reply-To: Your message of "Tue, 14 Nov 2006 16:38:41 +0100." <4559E301.2030607@freebsd.org> Date: Tue, 14 Nov 2006 15:46:11 +0000 Message-ID: <11832.1163519171@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: arch@freebsd.org Subject: Re: a proposed callout API X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Nov 2006 15:46:19 -0000 In message <4559E301.2030607@freebsd.org>, Andre Oppermann writes: >Luigi Rizzo wrote: >> On Tue, Nov 14, 2006 at 04:11:20PM +0100, Andre Oppermann wrote: >> ... >>> It's important to know that any random memory accesses on modern >>> CPUs are really expensive because of cache misses. 
That's why
>>> Judy tries beat RB trees by an order of magnitude these days.
>>
>> you mean this stuff ?
>>
>> http://docs.hp.com/en/B6841-90001/ch02s01.html
>> http://judy.sourceforge.net/
>
>We've used it in a number of other projects and it beats everything
>else hands down in speed and memory consumption.

I would like to thank you all for your enthusiasm in promoting
various data structures, but I kindly remind you that the only
sorting requirement we have for the short/likely callouts is
to know which one is next and that we may have duplicate keys.

We are never going to search for a callout that expires in 1402
microseconds or anything of the sort. The basic operations are:

	add node to tree
	delete node from tree
	move node in tree (rescheduling)
	find earliest callout

All of these are well-defined and efficient in binary heaps, which
additionally have the advantage of being very space efficient.

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
phk@FreeBSD.ORG         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.
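The four operations listed above can be sketched with an array-backed
binary min-heap. This is only an illustration under stated
assumptions, not the proposed kernel code: the names (struct node,
heap_add, heap_del, heap_move, heap_min) and the fixed HEAP_MAX
capacity are hypothetical, and the heap is ordered so that the
earliest expiry sits at the root. Each node carries one integer (its
slot index) and the heap array holds one pointer per node, so delete
and reschedule never need to search.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEAP_MAX 1024            /* fixed capacity keeps the sketch short */

struct node {
    uint64_t when;               /* expiry time; duplicate keys are fine */
    size_t   idx;                /* current slot of this node in the heap */
};

struct heap {
    struct node *slot[HEAP_MAX]; /* one pointer per node */
    size_t       n;              /* number of nodes in the heap */
};

static void heap_place(struct heap *h, size_t i, struct node *c)
{
    h->slot[i] = c;
    c->idx = i;
}

/* Move the node in slot i towards the root while it is earlier than
 * its parent. */
static void sift_up(struct heap *h, size_t i)
{
    struct node *c = h->slot[i];
    while (i > 0) {
        size_t p = (i - 1) / 2;
        if (h->slot[p]->when <= c->when)
            break;
        heap_place(h, i, h->slot[p]);
        i = p;
    }
    heap_place(h, i, c);
}

/* Move the node in slot i towards the leaves while a child is earlier. */
static void sift_down(struct heap *h, size_t i)
{
    struct node *c = h->slot[i];
    for (;;) {
        size_t k = 2 * i + 1;
        if (k >= h->n)
            break;
        if (k + 1 < h->n && h->slot[k + 1]->when < h->slot[k]->when)
            k++;
        if (c->when <= h->slot[k]->when)
            break;
        heap_place(h, i, h->slot[k]);
        i = k;
    }
    heap_place(h, i, c);
}

/* add node to tree: O(log n); returns -1 if the heap is full */
int heap_add(struct heap *h, struct node *c)
{
    if (h->n == HEAP_MAX)
        return -1;
    heap_place(h, h->n, c);
    h->n++;
    sift_up(h, c->idx);
    return 0;
}

/* delete node from tree: O(log n), no search thanks to c->idx */
void heap_del(struct heap *h, struct node *c)
{
    assert(c->idx < h->n && h->slot[c->idx] == c);
    size_t i = c->idx;
    struct node *last = h->slot[--h->n];
    if (last == c)
        return;
    heap_place(h, i, last);
    sift_up(h, i);
    sift_down(h, last->idx);
}

/* move node in tree (rescheduling): O(log n) */
void heap_move(struct heap *h, struct node *c, uint64_t when)
{
    c->when = when;
    sift_up(h, c->idx);
    sift_down(h, c->idx);
}

/* find earliest callout: O(1); NULL if the heap is empty */
struct node *heap_min(const struct heap *h)
{
    return h->n ? h->slot[0] : NULL;
}
```

A caller would embed struct node in its own timer structure, arm it
with heap_add, and peek at heap_min to learn when the next wakeup is
due.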
From owner-freebsd-arch@FreeBSD.ORG Tue Nov 14 16:08:32 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4F48316A4C9 for ; Tue, 14 Nov 2006 16:08:32 +0000 (UTC) (envelope-from andre@freebsd.org) Received: from c00l3r.networx.ch (c00l3r.networx.ch [62.48.2.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1CB4043F63 for ; Tue, 14 Nov 2006 16:00:03 +0000 (GMT) (envelope-from andre@freebsd.org) Received: (qmail 6046 invoked from network); 14 Nov 2006 15:52:28 -0000 Received: from c00l3r.networx.ch (HELO [127.0.0.1]) ([62.48.2.2]) (envelope-sender ) by c00l3r.networx.ch (qmail-ldap-1.03) with SMTP for ; 14 Nov 2006 15:52:28 -0000 Message-ID: <4559E7F4.4000603@freebsd.org> Date: Tue, 14 Nov 2006 16:59:48 +0100 From: Andre Oppermann User-Agent: Thunderbird 1.5.0.8 (Windows/20061025) MIME-Version: 1.0 To: Poul-Henning Kamp References: <11832.1163519171@critter.freebsd.dk> In-Reply-To: <11832.1163519171@critter.freebsd.dk> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: arch@freebsd.org Subject: Re: a proposed callout API X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Nov 2006 16:08:32 -0000 Poul-Henning Kamp wrote: > In message <4559E301.2030607@freebsd.org>, Andre Oppermann writes: >> Luigi Rizzo wrote: >>> On Tue, Nov 14, 2006 at 04:11:20PM +0100, Andre Oppermann wrote: >>> ... >>>> It's important to know that any random memory accesses on modern >>>> CPUs are really expensive because of cache misses. That's why >>>> Judy tries beat RB tries by an order of a magnitude these days. >>> you mean this stuff ? 
>>> >>> http://docs.hp.com/en/B6841-90001/ch02s01.html >>> http://judy.sourceforge.net/ >> We've used it a number of other projects and it beats everything >> else hands down in speed and memory consumption. > > I would like to thank you all for your enthusiasm in promoting > various data structures, but I kindly remind you that the only > sorting requirement we have for the short/likely callouts is > to know which one is next and that we may have duplicate keys. Heh. I never meant to propose any particular data structure for the callout stuff. Judy and RB were purely meant to illustrate the (non-)cache busting effect. I certainly wouldn't want to include Judy in the kernel. -- Andre From owner-freebsd-arch@FreeBSD.ORG Tue Nov 14 17:23:23 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 3E64016A40F; Tue, 14 Nov 2006 17:23:23 +0000 (UTC) (envelope-from jdp@polstra.com) Received: from blake.polstra.com (blake.polstra.com [64.81.189.66]) by mx1.FreeBSD.org (Postfix) with ESMTP id 1E7C443D49; Tue, 14 Nov 2006 17:23:20 +0000 (GMT) (envelope-from jdp@polstra.com) Received: from strings.polstra.com (strings.polstra.com [64.81.189.67]) by blake.polstra.com (8.13.6/8.13.6) with ESMTP id kAEHN8C8071836; Tue, 14 Nov 2006 09:23:08 -0800 (PST) (envelope-from jdp@polstra.com) Message-ID: X-Mailer: XFMail 1.5.5 on FreeBSD X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: <11832.1163519171@critter.freebsd.dk> Date: Tue, 14 Nov 2006 09:23:08 -0800 (PST) From: John Polstra To: Andre Oppermann Cc: arch@freebsd.org Subject: Re: a proposed callout API X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
X-List-Received-Date: Tue, 14 Nov 2006 17:23:23 -0000 In message <4559E301.2030607@freebsd.org>, Andre Oppermann writes: >Luigi Rizzo wrote: >> On Tue, Nov 14, 2006 at 04:11:20PM +0100, Andre Oppermann wrote: >> ... >>> It's important to know that any random memory accesses on modern >>> CPUs are really expensive because of cache misses. That's why >>> Judy tries beat RB tries by an order of a magnitude these days. >> >> you mean this stuff ? >> >> http://docs.hp.com/en/B6841-90001/ch02s01.html >> http://judy.sourceforge.net/ > >We've used it a number of other projects and it beats everything >else hands down in speed and memory consumption. Very cool. Unfortunately, Appendix A of the H-P document says, "Hewlett-Packard has patents pending on the Judy Technology," so the data structure might not be useful to us in practice. John From owner-freebsd-arch@FreeBSD.ORG Tue Nov 14 17:26:50 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4949316A415 for ; Tue, 14 Nov 2006 17:26:50 +0000 (UTC) (envelope-from jdp@polstra.com) Received: from blake.polstra.com (blake.polstra.com [64.81.189.66]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2A2F243D79 for ; Tue, 14 Nov 2006 17:26:46 +0000 (GMT) (envelope-from jdp@polstra.com) Received: from strings.polstra.com (strings.polstra.com [64.81.189.67]) by blake.polstra.com (8.13.6/8.13.6) with ESMTP id kAEHQj8i071917; Tue, 14 Nov 2006 09:26:45 -0800 (PST) (envelope-from jdp@polstra.com) Message-ID: X-Mailer: XFMail 1.5.5 on FreeBSD X-Priority: 3 (Normal) Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 8bit MIME-Version: 1.0 In-Reply-To: <7105.1163451221@critter.freebsd.dk> Date: Tue, 14 Nov 2006 09:26:45 -0800 (PST) From: John Polstra To: Poul-Henning Kamp Cc: arch@freebsd.org Subject: RE: a proposed callout API X-BeenThere: freebsd-arch@freebsd.org 
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussion related to FreeBSD architecture
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 14 Nov 2006 17:26:50 -0000

On 13-Nov-2006 Poul-Henning Kamp wrote:
>
> A number of problems have been identified with our current callout
> code and I have been thinking about and discussed various aspects
> with people during the EuroBSDcon2006 conference.
>
> A lot of people are interested in this, so here is a quick sketch
> of what I'm thinking about:
>
>
> The Problems
> ------------
>
> 1. We need better resolution than a periodic "hz" clock can give us.
>    High-speed networking, gaming servers and other real-time apps want
>    this.
>
> 2. We "pollute" our call-wheel with tons of callouts that we know are
>    unlikely to happen.
>
> 3. We have many operations on the callout wheel because certain
>    callouts get rearmed for later in the future. (TCP keepalives).
>
> 4. We execute all callouts on one CPU only.
>
> 5. Most of the specified timeouts are bogus, because of the imprecision
>    inherent in the current 1/hz method of scheduling them.
>
> and a number of other issues.
>
>
> The proposed API
> ----------------

I like the proposed API. FWIW, the problems you listed that are most
important to me are #4 and #1. Inexpensive automatic rearming for
periodic timeouts is also important.
John From owner-freebsd-arch@FreeBSD.ORG Tue Nov 14 18:29:21 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 14FF516A403 for ; Tue, 14 Nov 2006 18:29:21 +0000 (UTC) (envelope-from rnsanchez@wait4.org) Received: from spunkymail-a15.dreamhost.com (sd-green-bigip-74.dreamhost.com [208.97.132.74]) by mx1.FreeBSD.org (Postfix) with ESMTP id 2691043D7B for ; Tue, 14 Nov 2006 18:29:11 +0000 (GMT) (envelope-from rnsanchez@wait4.org) Received: from sauron.lan.box (unknown [200.180.164.105]) by spunkymail-a15.dreamhost.com (Postfix) with ESMTP id 356C77F068; Tue, 14 Nov 2006 10:28:13 -0800 (PST) Date: Tue, 14 Nov 2006 16:26:58 -0200 From: Ricardo Nabinger Sanchez To: Poul-Henning Kamp Message-Id: <20061114162658.ae168dcf.rnsanchez@wait4.org> In-Reply-To: <7105.1163451221@critter.freebsd.dk> References: <7105.1163451221@critter.freebsd.dk> Organization: SYS_WAIT4 X-Mailer: Sylpheed version 2.3.0beta2 (GTK+ 2.10.6; i386-portbld-freebsd6.1) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Cc: arch@freebsd.org Subject: Re: a proposed callout API X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Nov 2006 18:29:21 -0000 On Mon, 13 Nov 2006 20:53:41 +0000 Poul-Henning Kamp wrote: > XXX_disarm(struct xxx*) > Unarm the timer. > > XXX_drain(struct xxx*) > Drain the timer. One of these (or both) removes a callout from the tree? I don't quite understand their differences. -- Ricardo Nabinger Sanchez Powered by FreeBSD "Left to themselves, things tend to go from bad to worse." 
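[Editorial sketch, not part of the original message. The reply that follows in the thread explains the split: disarm stops the callout from being started or rescheduled, while drain waits for a handler that is already running. The user-space C model below illustrates that two-phase teardown under stated assumptions: the lowercase xxx_* names, the struct layout, xxx_init(), xxx_arm(), xxx_fire() and count_handler() are all hypothetical, and pthreads stands in for kernel locking. It is not the proposed kernel implementation.]

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for the proposed "struct xxx". */
struct xxx {
	pthread_mutex_t lock;      /* protects armed and running */
	pthread_cond_t  idle;      /* signalled when the handler finishes */
	bool            armed;     /* handler may still be started */
	bool            running;   /* handler is executing right now */
	void          (*func)(void *);
	void           *arg;
};

/* Example handler (hypothetical): counts how often it was called. */
void
count_handler(void *arg)
{
	++*(int *)arg;
}

void
xxx_init(struct xxx *t, void (*func)(void *), void *arg)
{
	pthread_mutex_init(&t->lock, NULL);
	pthread_cond_init(&t->idle, NULL);
	t->armed = false;
	t->running = false;
	t->func = func;
	t->arg = arg;
}

void
xxx_arm(struct xxx *t)
{
	pthread_mutex_lock(&t->lock);
	t->armed = true;
	pthread_mutex_unlock(&t->lock);
}

/* Phase 1: never waits for the handler, so the caller may hold its own mutex. */
void
xxx_disarm(struct xxx *t)
{
	pthread_mutex_lock(&t->lock);
	t->armed = false;
	pthread_mutex_unlock(&t->lock);
}

/* Phase 2: block until a handler that is already running has returned. */
void
xxx_drain(struct xxx *t)
{
	pthread_mutex_lock(&t->lock);
	while (t->running)
		pthread_cond_wait(&t->idle, &t->lock);
	pthread_mutex_unlock(&t->lock);
}

/* What a timer thread would do at expiry; returns true if the handler ran. */
bool
xxx_fire(struct xxx *t)
{
	pthread_mutex_lock(&t->lock);
	if (!t->armed) {
		pthread_mutex_unlock(&t->lock);
		return (false);
	}
	t->armed = false;          /* one-shot in this sketch */
	t->running = true;
	pthread_mutex_unlock(&t->lock);

	t->func(t->arg);           /* handler runs without the timer lock */

	pthread_mutex_lock(&t->lock);
	t->running = false;
	pthread_cond_broadcast(&t->idle);
	pthread_mutex_unlock(&t->lock);
	return (true);
}
```

In this model a caller that holds its own mutex would call xxx_disarm() with the mutex held, drop the mutex, and only then call xxx_drain(), so that a running handler which needs the same mutex can finish.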
From owner-freebsd-arch@FreeBSD.ORG Tue Nov 14 18:31:52 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8736516A47E for ; Tue, 14 Nov 2006 18:31:52 +0000 (UTC) (envelope-from phk@critter.freebsd.dk) Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222]) by mx1.FreeBSD.org (Postfix) with ESMTP id 9E3E743D5E for ; Tue, 14 Nov 2006 18:31:47 +0000 (GMT) (envelope-from phk@critter.freebsd.dk) Received: from critter.freebsd.dk (critter.freebsd.dk [192.168.48.2]) by phk.freebsd.dk (Postfix) with ESMTP id 22638170C0; Tue, 14 Nov 2006 18:31:46 +0000 (UTC) Received: from critter.freebsd.dk (localhost [127.0.0.1]) by critter.freebsd.dk (8.13.8/8.13.8) with ESMTP id kAEIVjum012466; Tue, 14 Nov 2006 18:31:45 GMT (envelope-from phk@critter.freebsd.dk) To: Ricardo Nabinger Sanchez From: "Poul-Henning Kamp" In-Reply-To: Your message of "Tue, 14 Nov 2006 16:26:58 -0200." <20061114162658.ae168dcf.rnsanchez@wait4.org> Date: Tue, 14 Nov 2006 18:31:45 +0000 Message-ID: <12465.1163529105@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: arch@freebsd.org Subject: Re: a proposed callout API X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 14 Nov 2006 18:31:52 -0000 In message <20061114162658.ae168dcf.rnsanchez@wait4.org>, Ricardo Nabinger Sanc hez writes: >On Mon, 13 Nov 2006 20:53:41 +0000 >Poul-Henning Kamp wrote: > >> XXX_disarm(struct xxx*) >> Unarm the timer. >> >> XXX_drain(struct xxx*) >> Drain the timer. > >One of these (or both) removes a callout from the tree? I don't quite >understand their differences. 
disarm disables the callout, it will not be called again after the disarm call returns, and it will not be rescheduled if it is currently running. drain does not return until the callout has completed running, if it is currently active. The two-phase teardown is necessary where a mutex is held during the callout execution. -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. From owner-freebsd-arch@FreeBSD.ORG Wed Nov 15 08:40:30 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 8F86916A403 for ; Wed, 15 Nov 2006 08:40:30 +0000 (UTC) (envelope-from peterjeremy@optushome.com.au) Received: from mail28.syd.optusnet.com.au (mail28.syd.optusnet.com.au [211.29.133.169]) by mx1.FreeBSD.org (Postfix) with ESMTP id E216943D45 for ; Wed, 15 Nov 2006 08:40:29 +0000 (GMT) (envelope-from peterjeremy@optushome.com.au) Received: from turion.vk2pj.dyndns.org (c58-107-94-118.belrs4.nsw.optusnet.com.au [58.107.94.118]) by mail28.syd.optusnet.com.au (8.13.1/8.13.1) with ESMTP id kAF8eQjW027520 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 15 Nov 2006 19:40:27 +1100 Received: from turion.vk2pj.dyndns.org (localhost.vk2pj.dyndns.org [127.0.0.1]) by turion.vk2pj.dyndns.org (8.13.8/8.13.8) with ESMTP id kAF8eQFM001415; Wed, 15 Nov 2006 19:40:26 +1100 (EST) (envelope-from peter@turion.vk2pj.dyndns.org) Received: (from peter@localhost) by turion.vk2pj.dyndns.org (8.13.8/8.13.8/Submit) id kAF8ePGX001414; Wed, 15 Nov 2006 19:40:25 +1100 (EST) (envelope-from peter) Date: Wed, 15 Nov 2006 19:40:25 +1100 From: Peter Jeremy To: Poul-Henning Kamp Message-ID: <20061115084025.GA914@turion.vk2pj.dyndns.org> References: <20061113234305.A34147@xorpc.icir.org> 
<9674.1163493637@critter.freebsd.dk> MIME-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="opJtzjQTFsWo+cga" Content-Disposition: inline In-Reply-To: <9674.1163493637@critter.freebsd.dk> X-PGP-Key: http://members.optusnet.com.au/peterjeremy/pubkey.asc User-Agent: Mutt/1.5.13 (2006-08-11) Cc: arch@freebsd.org Subject: Re: a proposed callout API X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 15 Nov 2006 08:40:30 -0000 --opJtzjQTFsWo+cga Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Mon, 2006-Nov-13 21:38:21 +0000, Poul-Henning Kamp wrote: >The other thing is that covering the entire range from hour long >callouts to nanosecond callouts would require a 64 bit value or >a tricky pseudo-FP encoding. By splitting them in two classes, >I can use two different 31 bit encodings separated by the top bit. This sounds like a pseudo-FP encoding with a very small exponent and a relatively large mantissa :-) On Tue, 2006-Nov-14 08:40:37 +0000, Poul-Henning Kamp wrote: >In message <20061113234305.A34147@xorpc.icir.org>, Luigi Rizzo writes: > >>To make a proper evaluation i would need some idea of the number >>and distribution of scheduled events on a busy box [...] > >So do I. > >What is important right now however, is the API. The implementation >behind it we can change every week if we want, but the API affects >far too many kernel files to get it wrong. I don't see anything obviously wrong with the API as proposed. The use of an opaque tick_t rather than requiring scaling on each call seems an obvious improvement. 
That said, is it worthwhile to instrument the existing callout code to
get some statistics on what actions are frequently used? That might
suggest operations that need to be cheap. (Though identifying
high-level actions like repeat/arm/disarm might be difficult.)

-- 
Peter Jeremy

--opJtzjQTFsWo+cga
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFFWtJ5/opHv/APuIcRAou2AJ45R8IvuFjjGBkwwgnQFoo08Pj7gACfdm8O
kbaJkiB5tC1lpJzn15USdkc=
=+XtB
-----END PGP SIGNATURE-----

--opJtzjQTFsWo+cga--

From owner-freebsd-arch@FreeBSD.ORG Wed Nov 15 09:04:19 2006
Return-Path:
X-Original-To: arch@freebsd.org
Delivered-To: freebsd-arch@FreeBSD.ORG
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 31B7E16A416
    for ; Wed, 15 Nov 2006 09:04:19 +0000 (UTC)
    (envelope-from phk@critter.freebsd.dk)
Received: from phk.freebsd.dk (phk.freebsd.dk [130.225.244.222])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 44B0643D4C
    for ; Wed, 15 Nov 2006 09:04:16 +0000 (GMT)
    (envelope-from phk@critter.freebsd.dk)
Received: from critter.freebsd.dk (critter.freebsd.dk [192.168.48.2])
    by phk.freebsd.dk (Postfix) with ESMTP id 75078170C1;
    Wed, 15 Nov 2006 09:04:14 +0000 (UTC)
Received: from critter.freebsd.dk (localhost [127.0.0.1])
    by critter.freebsd.dk (8.13.8/8.13.8) with ESMTP id kAF94CWC015432;
    Wed, 15 Nov 2006 09:04:13 GMT
    (envelope-from phk@critter.freebsd.dk)
To: Peter Jeremy
From: "Poul-Henning Kamp"
In-Reply-To: Your message of "Wed, 15 Nov 2006 19:40:25 +1100."
<20061115084025.GA914@turion.vk2pj.dyndns.org> Date: Wed, 15 Nov 2006 09:04:12 +0000 Message-ID: <15431.1163581452@critter.freebsd.dk> Sender: phk@critter.freebsd.dk Cc: arch@freebsd.org Subject: Re: a proposed callout API X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 15 Nov 2006 09:04:19 -0000 In message <20061115084025.GA914@turion.vk2pj.dyndns.org>, Peter Jeremy writes: >On Mon, 2006-Nov-13 21:38:21 +0000, Poul-Henning Kamp wrote: >>The other thing is that covering the entire range from hour long >>callouts to nanosecond callouts would require a 64 bit value or >>a tricky pseudo-FP encoding. By splitting them in two classes, >>I can use two different 31 bit encodings separated by the top bit. > >This sounds like a pseudo-FP encoding with a very small exponent and a >relatively large mantissa :-) Well, yes... :-) -- Poul-Henning Kamp | UNIX since Zilog Zeus 3.20 phk@FreeBSD.ORG | TCP/IP since RFC 956 FreeBSD committer | BSD since 4.3-tahoe Never attribute to malice what can adequately be explained by incompetence. 
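[Editorial sketch, not part of the original message. The two-class idea discussed above can be pictured as a 32-bit value whose top bit picks one of two 31-bit encodings. The layout below is purely an assumption for illustration: the names tick_encode_ns() and tick_decode_ns(), the nanosecond unit for the fine class and the millisecond unit for the coarse class do not come from the thread.]

```c
#include <stdint.h>

/* Hypothetical layout, not taken from the thread: bit 31 selects the class. */
#define TICK_COARSE 0x80000000u   /* top bit set: coarse class */
#define TICK_MANT   0x7fffffffu   /* low 31 bits: the count */
#define NS_PER_MS   1000000ull    /* assumed unit of the coarse class: 1 ms */

/*
 * Encode a duration given in nanoseconds.
 * Durations up to 2^31 - 1 ns (about 2.1 s) are stored exactly.
 * Longer ones are rounded down to whole milliseconds and saturate at
 * 2^31 - 1 ms (about 24.8 days).
 */
uint32_t
tick_encode_ns(uint64_t ns)
{
	uint64_t ms;

	if (ns <= TICK_MANT)
		return ((uint32_t)ns);
	ms = ns / NS_PER_MS;
	if (ms > TICK_MANT)
		ms = TICK_MANT;
	return (TICK_COARSE | (uint32_t)ms);
}

/* Decode back to nanoseconds; coarse values come back in 1 ms steps. */
uint64_t
tick_decode_ns(uint32_t t)
{
	uint64_t count = t & TICK_MANT;

	return ((t & TICK_COARSE) ? count * NS_PER_MS : count);
}
```

Read this way, the top bit acts as a one-bit exponent and the remaining 31 bits as the mantissa, which is the "very small exponent, relatively large mantissa" reading made in the quoted exchange.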
From owner-freebsd-arch@FreeBSD.ORG Wed Nov 15 13:21:56 2006
Return-Path:
X-Original-To: arch@freebsd.org
Delivered-To: freebsd-arch@FreeBSD.ORG
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 8007D16A40F
    for ; Wed, 15 Nov 2006 13:21:56 +0000 (UTC)
    (envelope-from gnn@neville-neil.com)
Received: from mrout1.yahoo.com (mrout1.yahoo.com [216.145.54.171])
    by mx1.FreeBSD.org (Postfix) with ESMTP id F31EE43D64
    for ; Wed, 15 Nov 2006 13:21:55 +0000 (GMT)
    (envelope-from gnn@neville-neil.com)
Received: from minion.local.neville-neil.com (proxy8.corp.yahoo.com [216.145.48.13])
    by mrout1.yahoo.com (8.13.6/8.13.6/y.out) with ESMTP id kAFDLVdf098757;
    Wed, 15 Nov 2006 05:21:32 -0800 (PST)
Date: Wed, 15 Nov 2006 22:21:28 +0900
Message-ID:
From: gnn@freebsd.org
To: Poul-Henning Kamp
In-Reply-To: <7105.1163451221@critter.freebsd.dk>
References: <7105.1163451221@critter.freebsd.dk>
User-Agent: Wanderlust/2.14.0 (Africa) SEMI/1.14.6 (Maruoka)
    FLIM/1.14.8 (=?ISO-8859-4?Q?Shij=F2?=) APEL/10.6 Emacs/22.0.90
    (i386-apple-darwin8.8.1) MULE/5.0 (SAKAKI)
MIME-Version: 1.0 (generated by SEMI 1.14.6 - "Maruoka")
Content-Type: text/plain; charset=US-ASCII
Cc: arch@freebsd.org
Subject: Re: a proposed callout API
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussion related to FreeBSD architecture
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 15 Nov 2006 13:21:56 -0000

At Mon, 13 Nov 2006 20:53:41 +0000, Poul-Henning Kamp wrote:
>
>
> A number of problems have been identified with our current callout
> code and I have been thinking about and discussed various aspects
> with people during the EuroBSDcon2006 conference.
>
> A lot of people are interested in this, so here is a quick sketch
> of what I'm thinking about:
>

I think this makes sense and I don't have a problem with the proposed
APIs.
Best, George From owner-freebsd-arch@FreeBSD.ORG Fri Nov 17 00:48:58 2006 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id EEC0A16A40F for ; Fri, 17 Nov 2006 00:48:58 +0000 (UTC) (envelope-from jessicah@juniper.net) Received: from colo-dns-ext1.juniper.net (colo-dns-ext1.juniper.net [207.17.137.57]) by mx1.FreeBSD.org (Postfix) with ESMTP id 89F5F43D67 for ; Fri, 17 Nov 2006 00:48:58 +0000 (GMT) (envelope-from jessicah@juniper.net) Received: from magenta.juniper.net (magenta.juniper.net [172.17.28.122]) by colo-dns-ext1.juniper.net (8.11.3/8.9.3) with ESMTP id kAH0lOX66071; Thu, 16 Nov 2006 16:47:24 -0800 (PST) (envelope-from jessicah@juniper.net) Received: from [172.17.13.28] (jessicah-lnx.juniper.net [172.17.13.28]) by magenta.juniper.net (8.11.3/8.11.3) with ESMTP id kAH0l7E26891; Thu, 16 Nov 2006 16:47:07 -0800 (PST) (envelope-from jessicah@juniper.net) Message-ID: <455D068A.2090503@juniper.net> Date: Thu, 16 Nov 2006 16:47:06 -0800 From: Jessica Han User-Agent: Thunderbird 1.5.0.8 (X11/20061025) MIME-Version: 1.0 To: freebsd-arch@freebsd.org, marcel@xcllnt.net Content-Type: multipart/mixed; boundary="------------010609060404070502070607" Cc: Subject: A Patch for kgdb X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 17 Nov 2006 00:48:59 -0000 This is a multi-part message in MIME format. --------------010609060404070502070607 Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit I got an infinite loop while trying to debug a kernel core on FreeBSD 6.1. 
# uname -sr
FreeBSD 6.1-RELEASE
#kgdb kernel vmcore.0
kgdb: kvm_read: invalid address (0x50012)
kgdb: kvm_read: invalid address (0x7)
kgdb: kvm_read: invalid address (0xf5c)
kgdb: kvm_read: invalid address (0xf5c)
kgdb: kvm_read: invalid address (0xf5c)

The attached patch fixed it, can somebody review it for me and commit it
if it is okay? Thanks,

Jessica
jessicah@juniper.net

--------------010609060404070502070607
Content-Type: text/x-patch; name="kthr.c.patch"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline; filename="kthr.c.patch"

--- /6.1-vanila/src/gnu/usr.bin/gdb/kgdb/kthr.c Wed Sep 14 22:32:10 2005
+++ gnu/usr.bin/gdb/kgdb/kthr.c Thu Nov 16 12:37:41 2006
@@ -92,12 +92,16 @@
 		dumptid = -1;
 
 	while (paddr != 0) {
-		if (kvm_read(kvm, paddr, &p, sizeof(p)) != sizeof(p))
+		if (kvm_read(kvm, paddr, &p, sizeof(p)) != sizeof(p)) {
 			warnx("kvm_read: %s", kvm_geterr(kvm));
+			break;
+		}
 		addr = (uintptr_t)TAILQ_FIRST(&p.p_threads);
 		while (addr != 0) {
-			if (kvm_read(kvm, addr, &td, sizeof(td)) != sizeof(td))
+			if (kvm_read(kvm, addr, &td, sizeof(td)) != sizeof(td)) {
 				warnx("kvm_read: %s", kvm_geterr(kvm));
+				break;
+			}
 			kt = malloc(sizeof(*kt));
 			kt->next = first;
 			kt->kaddr = addr;

--------------010609060404070502070607--

From owner-freebsd-arch@FreeBSD.ORG Fri Nov 17 00:50:39 2006
Return-Path:
X-Original-To: freebsd-arch@freebsd.org
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125])
    by hub.freebsd.org (Postfix) with ESMTP id 607A816A49E
    for ; Fri, 17 Nov 2006 00:50:39 +0000 (UTC)
    (envelope-from kris@obsecurity.org)
Received: from elvis.mu.org (elvis.mu.org [192.203.228.196])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 3573D43D5C
    for ; Fri, 17 Nov 2006 00:50:38 +0000 (GMT)
    (envelope-from kris@obsecurity.org)
Received: from obsecurity.dyndns.org (elvis.mu.org [192.203.228.196])
    by elvis.mu.org (Postfix) with ESMTP id 2003A1A3C1E;
    Thu, 16 Nov 2006 16:50:38 -0800 (PST)
Received: by obsecurity.dyndns.org (Postfix,
    from userid 1000) id EB07E51316; Thu, 16 Nov 2006 19:50:25 -0500 (EST)
Date: Thu, 16 Nov 2006 19:50:25 -0500
From: Kris Kennaway
To: Jessica Han
Message-ID: <20061117005025.GA72346@xor.obsecurity.org>
References: <455D068A.2090503@juniper.net>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
    protocol="application/pgp-signature"; boundary="0F1p//8PRICkK4MW"
Content-Disposition: inline
In-Reply-To: <455D068A.2090503@juniper.net>
User-Agent: Mutt/1.4.2.2i
Cc: marcel@xcllnt.net, freebsd-arch@freebsd.org
Subject: Re: A Patch for kgdb
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussion related to FreeBSD architecture
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 17 Nov 2006 00:50:39 -0000

--0F1p//8PRICkK4MW
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Thu, Nov 16, 2006 at 04:47:06PM -0800, Jessica Han wrote:
> I got an infinite loop while trying to debug a kernel core on FreeBSD 6.1.
> # uname -sr
> FreeBSD 6.1-RELEASE
> #kgdb kernel vmcore.0
> kgdb: kvm_read: invalid address (0x50012)
> kgdb: kvm_read: invalid address (0x7)
> kgdb: kvm_read: invalid address (0xf5c)
> kgdb: kvm_read: invalid address (0xf5c)
> kgdb: kvm_read: invalid address (0xf5c)
>
> The attached patch fixed it, can somebody review it for me and commit it
> if it is okay? Thanks,

This means that the kernel you're running against didn't match the
core - were you actually able to get a valid trace out of it?
Kris --0F1p//8PRICkK4MW Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.5 (FreeBSD) iD8DBQFFXQdRWry0BWjoQKURAh6tAJ4lpUvwKkqPpI6x6XgDJRTZaswSNQCfba7t WdUlQq82knWN2thWnNnh4uM= =uuQO -----END PGP SIGNATURE----- --0F1p//8PRICkK4MW-- From owner-freebsd-arch@FreeBSD.ORG Fri Nov 17 01:02:56 2006 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 7F35B16A492 for ; Fri, 17 Nov 2006 01:02:56 +0000 (UTC) (envelope-from jessicah@juniper.net) Received: from colo-dns-ext2.juniper.net (colo-dns-ext2.juniper.net [207.17.137.64]) by mx1.FreeBSD.org (Postfix) with ESMTP id 03BDD43D5E for ; Fri, 17 Nov 2006 01:02:54 +0000 (GMT) (envelope-from jessicah@juniper.net) Received: from magenta.juniper.net (magenta.juniper.net [172.17.28.122]) by colo-dns-ext2.juniper.net (8.12.3/8.12.3) with ESMTP id kAH12s1Z048212; Thu, 16 Nov 2006 17:02:54 -0800 (PST) (envelope-from jessicah@juniper.net) Received: from [172.17.13.28] (jessicah-lnx.juniper.net [172.17.13.28]) by magenta.juniper.net (8.11.3/8.11.3) with ESMTP id kAH12sE27194; Thu, 16 Nov 2006 17:02:54 -0800 (PST) (envelope-from jessicah@juniper.net) Message-ID: <455D0A3E.3020501@juniper.net> Date: Thu, 16 Nov 2006 17:02:54 -0800 From: Jessica Han User-Agent: Thunderbird 1.5.0.8 (X11/20061025) MIME-Version: 1.0 To: Kris Kennaway References: <455D068A.2090503@juniper.net> <20061117005025.GA72346@xor.obsecurity.org> In-Reply-To: <20061117005025.GA72346@xor.obsecurity.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: marcel@xcllnt.net, freebsd-arch@freebsd.org Subject: Re: A Patch for kgdb X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , X-List-Received-Date: Fri, 17 Nov 2006 01:02:56 -0000 Kris, You are right, they didn't match, I was using a wrong kernel image when I got the error. I wasn't able to get the back trace info. Jessica Kris Kennaway wrote: > On Thu, Nov 16, 2006 at 04:47:06PM -0800, Jessica Han wrote: > >> I got an infinite loop while trying to debug a kernel core on FreeBSD 6.1. >> # uname -sr >> FreeBSD 6.1-RELEASE >> #kgdb kernel vmcore.0 >> kgdb: kvm_read: invalid address (0x50012) >> kgdb: kvm_read: invalid address (0x7) >> kgdb: kvm_read: invalid address (0xf5c) >> kgdb: kvm_read: invalid address (0xf5c) >> kgdb: kvm_read: invalid address (0xf5c) >> >> The attached patch fixed it, can somebody review it for me and commit it >> if it is okay? Thanks, >> > > This means that the kernel you're running against didn't match the > core - were you actually able to get a valid trace out of it? > > Kris > From owner-freebsd-arch@FreeBSD.ORG Fri Nov 17 01:04:51 2006 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [216.136.204.125]) by hub.freebsd.org (Postfix) with ESMTP id 4E72516A415 for ; Fri, 17 Nov 2006 01:04:51 +0000 (UTC) (envelope-from kris@obsecurity.org) Received: from elvis.mu.org (elvis.mu.org [192.203.228.196]) by mx1.FreeBSD.org (Postfix) with ESMTP id AB0D643D53 for ; Fri, 17 Nov 2006 01:04:50 +0000 (GMT) (envelope-from kris@obsecurity.org) Received: from obsecurity.dyndns.org (elvis.mu.org [192.203.228.196]) by elvis.mu.org (Postfix) with ESMTP id 9753E1A3C19; Thu, 16 Nov 2006 17:04:50 -0800 (PST) Received: by obsecurity.dyndns.org (Postfix, from userid 1000) id 7F93651361; Thu, 16 Nov 2006 20:04:38 -0500 (EST) Date: Thu, 16 Nov 2006 20:04:38 -0500 From: Kris Kennaway To: Jessica Han Message-ID: <20061117010438.GA72540@xor.obsecurity.org> References: <455D068A.2090503@juniper.net> <20061117005025.GA72346@xor.obsecurity.org> <455D0A3E.3020501@juniper.net> 
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
    protocol="application/pgp-signature"; boundary="pf9I7BMVVzbSWLtt"
Content-Disposition: inline
In-Reply-To: <455D0A3E.3020501@juniper.net>
User-Agent: Mutt/1.4.2.2i
Cc: freebsd-arch@freebsd.org, marcel@xcllnt.net, Kris Kennaway
Subject: Re: A Patch for kgdb
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussion related to FreeBSD architecture
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Fri, 17 Nov 2006 01:04:51 -0000

--pf9I7BMVVzbSWLtt
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Thu, Nov 16, 2006 at 05:02:54PM -0800, Jessica Han wrote:
> Kris,
> You are right, they didn't match, I was using a wrong kernel image when
> I got the error. I wasn't able to get the back trace info.

OK, fixing the infinite loop is still a good idea.

Kris

--pf9I7BMVVzbSWLtt
Content-Type: application/pgp-signature
Content-Disposition: inline

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFFXQqmWry0BWjoQKURAi0eAJ9dYclCouEzw16LJjjSY6oSEz0sSQCg20OD
8LLcWIVR4+K1GbVDp3Qig38=
=gz11
-----END PGP SIGNATURE-----

--pf9I7BMVVzbSWLtt--

From owner-freebsd-arch@FreeBSD.ORG Sat Nov 18 07:52:36 2006
Return-Path:
X-Original-To: freebsd-arch@freebsd.org
Delivered-To: freebsd-arch@freebsd.org
Received: from mx1.FreeBSD.org (mx1.freebsd.org [69.147.83.52])
    by hub.freebsd.org (Postfix) with ESMTP id 63E0916B2D1
    for ; Sat, 18 Nov 2006 07:52:36 +0000 (UTC)
    (envelope-from freebsd-arch@m.gmane.org)
Received: from ciao.gmane.org (main.gmane.org [80.91.229.2])
    by mx1.FreeBSD.org (Postfix) with ESMTP id 7A44543F3E
    for ; Sat, 18 Nov 2006 04:05:01 +0000 (GMT)
    (envelope-from freebsd-arch@m.gmane.org)
Received: from list by ciao.gmane.org with local (Exim 4.43)
    id 1GlChO-0002Id-O8
    for freebsd-arch@freebsd.org; Sat, 18 Nov 2006 00:00:43 +0100
Received: from 89-172-60-192.adsl.net.t-com.hr ([89.172.60.192])
    by main.gmane.org with esmtp (Gmexim 0.1 (Debian))
    id 1AlnuQ-0007hv-00
    for ; Sat, 18 Nov 2006 00:00:42 +0100
Received: from ivoras by 89-172-60-192.adsl.net.t-com.hr with local (Gmexim 0.1 (Debian))
    id 1AlnuQ-0007hv-00
    for ; Sat, 18 Nov 2006 00:00:42 +0100
X-Injected-Via-Gmane: http://gmane.org/
To: freebsd-arch@freebsd.org
From: Ivan Voras
Date: Sat, 18 Nov 2006 00:00:37 +0100
Lines: 111
Message-ID:
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
X-Complaints-To: usenet@sea.gmane.org
X-Gmane-NNTP-Posting-Host: 89-172-60-192.adsl.net.t-com.hr
User-Agent: Thunderbird 1.5.0.4 (X11/20060615)
Sender: news
Subject: [Fwd: Re: Lockless algorithms [was Re: splxxx replacements?]]
X-BeenThere: freebsd-arch@freebsd.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: Discussion related to FreeBSD architecture
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 18 Nov 2006 07:52:36 -0000

From DragonFly's mailing list, if anyone's interested...

-------- Original Message --------
Subject: Re: Lockless algorithms [was Re: splxxx replacements?]
Date: Fri, 17 Nov 2006 12:51:35 -0800
From: Bill Huey (hui)
Newsgroups: dragonfly.kernel

On Fri, Nov 17, 2006 at 09:04:53AM -0800, Matthew Dillon wrote:
> The route table work is rather significant, though it won't really
> shine until Giant is removed from the protocol threads. Basically
> the route table is now replicated across all cpus and thus can be
> accessed by any given cpu with only a critical section for
> protection.
>
> We still use tokens in several major areas, but we have been shifting
> the ones that were only being used to lock short codepaths over to
> spinlocks.

[lock categories]

>
> RCU is interesting but I personally prefer replication and
> cpu-localization instead of RCU. So far nothing has come up that really
> needs an RCU implementation. RCU suffers from a level of complexity
> that makes it somewhat difficult to understand the code using it, and
> I don't like that aspect of it.

RCU is a very specific and powerful algorithm that is used in place of
traditional reader/writer locks in the Linux kernel. Many of the
algorithms (hash tables and the like) are highly parallelized because
of it in Linux. The algorithm is dependent on memory barriers to
guarantee a certain kind of ordering with regard to a data structure's
pointers.
Read and write barriers guarantee that ordering down to the specific
cache line. Replication has different semantics from what I understand,
and RCU shouldn't be dismissed.

You should know that a paper was presented at OLS 2006 regarding a
lockless page cache from Nick Piggin that uses RCU and other lockless
techniques to get a highly parallelized page cache. They get a linear
increase in performance for the CPU set they are using. It's an
impressive feat. I don't see how something like (data?) replication can
handle something like that. RCU read sides are down to the nanoseconds
for an acquire, which is very fast, faster than an atomic operation by
far. They basically get clogged at about 3 CPUs, but with this they get
a nearly linear increase in performance as CPUs are added.

http://www.linuxsymposium.org/2006/linuxsymposium_procv2.pdf

Since RCU has different data guarantees than a traditionally locked data
structure, the user of the data via the API must have a means of dealing
with this, but it will definitely deliver some serious performance with
shared memory systems if you do.

Also, the Linux directory cache uses RCU as well, and apparently it's a
performance problem as well. There are many examples in Linux regarding
RCU and it would be a good thing to look at for ideas. There are patent
issues and the GPL license, but this is just too powerful an algorithm
to ignore. In many ways, this brings out the ultimate in what shared
memory systems can do.

> The use of tokens in the system is primarily being relegated to situations
> that actually need the above characteristics. A very good example of
> this is the traversal of system structures such as the system mountlist.
> An example of incorrect use would be in, say, kern_objcache.c, where we
> really ought to be using a spinlock there instead.

OK, so you're still using tokens for larger subsystems that have long
execution paths, if I understand you, right?
One of the claims about dfBSD that I found interesting was that your use of tokens to break up long kernel paths was an incremental way of MPing the kernel. This is in contrast to the complete removal of Giant in a lock path for a kernel syscall. The question I've been wondering about is whether tokens are living up to that claim or not. That's really the main question I have about using them as a general mechanism for MP work in a legacy kernel (I'm thinking about using them for another system that is already using a code path protection scheme).

bill

From owner-freebsd-arch@FreeBSD.ORG Sat Nov 18 18:17:31 2006 Return-Path: X-Original-To: arch@freebsd.org Delivered-To: freebsd-arch@FreeBSD.ORG Received: from mx1.FreeBSD.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 5793816A8B7 for ; Sat, 18 Nov 2006 18:17:31 +0000 (UTC) (envelope-from bde@zeta.org.au) Received: from mailout2.pacific.net.au (mailout2-3.pacific.net.au [61.8.2.226]) by mx1.FreeBSD.org (Postfix) with ESMTP id 7866D43F09 for ; Sat, 18 Nov 2006 18:16:11 +0000 (GMT) (envelope-from bde@zeta.org.au) Received: from mailproxy2.pacific.net.au (mailproxy2.pacific.net.au [61.8.2.163]) by mailout2.pacific.net.au (Postfix) with ESMTP id 9021E6E100 for ; Sun, 19 Nov 2006 05:16:13 +1100 (EST) Received: from katana.zip.com.au (katana.zip.com.au [61.8.7.246]) by mailproxy2.pacific.net.au (Postfix) with ESMTP id 5CD6127413 for ; Sun, 19 Nov 2006 05:16:13 +1100 (EST) Date: Sun, 19 Nov 2006 05:16:12 +1100 (EST) From: Bruce Evans X-X-Sender: bde@delplex.bde.org To: arch@freebsd.org Message-ID: <20061119041421.I16763@delplex.bde.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed Cc: Subject: What is the PREEMPTION option good for?
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 18 Nov 2006 18:17:31 -0000

PREEMPTION may be needed for correctness, including for very low latency switching to high priority tasks, but it is a pessimization for most things that I tried (mainly makeworld), and the details of the pessimizations indicate that it doesn't actually give correctness either.

Makeworld in a certain SMP configuration here takes 853+-5 seconds without PREEMPTION or IPI_PREEMPTION, and 863+-5 seconds with PREEMPTION and IPI_PREEMPTION (both without KSE; KSE gives another pessimization in the 5-10 second range). Most of the difference is caused by pgzero becoming too active with PREEMPTION. The behaviour with PREEMPTION under SMP is similar to that under both UP and SMP when the PREEMPTION ifdef was first added to vm_zeroidle.c a couple of years ago. pgzero is active for about 3 times as long (18 seconds instead of 6 for my current makeworld benchmark). This reduces the reported system time by even more than the extra time spent in pagezero, but for some reason (probably cache thrashing, though pgzero uses nontemporal writes on the benchmark machine) it increases the real time by about the same amount as the extra time spent in pgzero.

A couple of years ago, pgzero did this even for !SMP because it had a very broken priority so it rarely (never?) got preempted. Now its preemption is broken similarly under SMP with PREEMPTION but without IPI_PREEMPTION, since it takes an IPI to preempt it in many cases. Its code is:

% for (;;) {
%     if (vm_page_zero_check()) {
%         vm_page_zero_idle();
% #ifndef PREEMPTION
%         if (sched_runnable()) {
%             mtx_lock_spin(&sched_lock);
%             mi_switch(SW_VOL, NULL);
%             mtx_unlock_spin(&sched_lock);
%         }
% #endif

Without PREEMPTION, it yields voluntarily, and this works fine.
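
The voluntary-yield branch of that loop can be sketched in user space. This is only an illustration under stated assumptions, not the vm_zeroidle.c code: struct zero_sim and the sim_* helpers are hypothetical stand-ins for vm_page_zero_check(), vm_page_zero_idle(), sched_runnable() and the mi_switch() call, and "another thread became runnable" is modelled as a simple every-N-pages counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simulation state; none of these names exist in the kernel. */
struct zero_sim {
    int pages_to_zero;   /* pages still waiting to be zeroed */
    int runnable_every;  /* another thread becomes runnable every N pages; 0 = never */
    int pages_zeroed;    /* pages zeroed so far */
    int yields;          /* number of voluntary context switches taken */
};

/* Stand-in for vm_page_zero_check(): is there zeroing work to do? */
static bool sim_zero_check(const struct zero_sim *s)
{
    return s->pages_to_zero > 0;
}

/* Stand-in for vm_page_zero_idle(): zero exactly one page. */
static void sim_zero_idle(struct zero_sim *s)
{
    s->pages_to_zero--;
    s->pages_zeroed++;
}

/* Stand-in for sched_runnable(): does some other thread want the CPU? */
static bool sim_sched_runnable(const struct zero_sim *s)
{
    return s->runnable_every > 0 && s->pages_zeroed % s->runnable_every == 0;
}

/* Stand-in for the locked mi_switch(SW_VOL, NULL) call: just count it. */
static void sim_yield(struct zero_sim *s)
{
    s->yields++;
}

/* The !PREEMPTION shape: check for other runnable work after every page. */
static void zero_loop_voluntary(struct zero_sim *s)
{
    assert(s != NULL);
    while (sim_zero_check(s)) {
        sim_zero_idle(s);
        if (sim_sched_runnable(s))
            sim_yield(s);
    }
}
```

With 10 pages queued and another thread becoming runnable every 3 pages, this loop zeroes all 10 pages and gives up the CPU 3 times, always at a page boundary and without relying on anything outside the loop (such as an IPI) to interrupt it.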
With PREEMPTION and !SMP, it gets preempted, and this works not so fine (it has slightly higher overheads). With PREEMPTION and SMP and >1 CPU but no IPI_PREEMPTION, this cannot work, and even with IPI_PREEMPTION it doesn't work now in practice (IPI_PREEMPTION gives a small pessimization due to more context switches without significantly affecting the time spent in pgzero). If PREEMPTION should actually be best here, why doesn't the main idle thread depend on it?

Does anyone have an example where PREEMPTION makes a useful difference?

Bruce

From owner-freebsd-arch@FreeBSD.ORG Sat Nov 18 21:55:20 2006 Return-Path: X-Original-To: freebsd-arch@freebsd.org Delivered-To: freebsd-arch@freebsd.org Received: from mx1.FreeBSD.org (mx1.freebsd.org [69.147.83.52]) by hub.freebsd.org (Postfix) with ESMTP id 7A17F16A500 for ; Sat, 18 Nov 2006 21:55:20 +0000 (UTC) (envelope-from freebsd-arch@m.gmane.org) Received: from ciao.gmane.org (main.gmane.org [80.91.229.2]) by mx1.FreeBSD.org (Postfix) with ESMTP id C660243D68 for ; Sat, 18 Nov 2006 21:55:11 +0000 (GMT) (envelope-from freebsd-arch@m.gmane.org) Received: from list by ciao.gmane.org with local (Exim 4.43) id 1GlY9a-0002S5-4g for freebsd-arch@freebsd.org; Sat, 18 Nov 2006 22:55:14 +0100 Received: from 89-172-42-36.adsl.net.t-com.hr ([89.172.42.36]) by main.gmane.org with esmtp (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Sat, 18 Nov 2006 22:55:14 +0100 Received: from ivoras by 89-172-42-36.adsl.net.t-com.hr with local (Gmexim 0.1 (Debian)) id 1AlnuQ-0007hv-00 for ; Sat, 18 Nov 2006 22:55:14 +0100 X-Injected-Via-Gmane: http://gmane.org/ To: freebsd-arch@freebsd.org From: Ivan Voras Date: Sat, 18 Nov 2006 22:55:12 +0100 Lines: 9 Message-ID: References: <20061119041421.I16763@delplex.bde.org> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Complaints-To: usenet@sea.gmane.org X-Gmane-NNTP-Posting-Host: 89-172-42-36.adsl.net.t-com.hr User-Agent: Thunderbird 1.5.0.4 (X11/20060615) In-Reply-To: <20061119041421.I16763@delplex.bde.org> Sender: news Subject: Re: What is the PREEMPTION option good for?
X-BeenThere: freebsd-arch@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussion related to FreeBSD architecture List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 18 Nov 2006 21:55:20 -0000

Bruce Evans wrote:

> Most of the difference is caused by pgzero becoming too active with
> PREEMPTION.

Don't know about the other things, but I've noticed pagezero is suspiciously active on heavily loaded SMP web servers (even complained on @stable a long time ago). I'll try disabling PREEMPTION and see how it goes.