From: Andre Oppermann
Date: Sat, 07 Jan 2006 17:24:14 +0100
To: Matthew Dillon
Cc: Luigi Rizzo, Poul-Henning Kamp, current@freebsd.org
Subject: Re: FreeBSD handles leapsecond correctly
Message-ID: <43BFEB2E.4040303@freebsd.org>
In-Reply-To: <200601060636.k066aNYn079015@apollo.backplane.com>

Matthew Dillon wrote:
> :Luigi Rizzo wrote:
> :> On Sun, Jan 01, 2006 at 10:59:14AM +0100, Poul-Henning Kamp wrote:
> :>>http://phk.freebsd.dk/misc/leapsecond.txt
> :>>
> :>>Notice how CLOCK_REALTIME recycles the 1136073599 second.
> :>
> :> on a related topic, any comments on this one ?
> :> Is this code that we could use ?
> :>
> :>     http://www.dragonflybsd.org/docs/nanosleep/
> :
> :I ported the tvtohz change from Dragonfly back to 4.10 and 5-STABLE here:
> :
> :http://www.pkix.net/~chuck/timer/
> :
> :...so anyone who wants to experiment can try it out. :-)
> :
> :--
> :-Chuck
>
>     It isn't so much tvtohz that's the issue, but the fact that the
>     nanosleep() system call has really coarse hz-based resolution. That's
>     been fixed in DragonFly and I would recommend that it be fixed in
>     FreeBSD too. After all, there isn't much of a point having a system
>     call called 'nanosleep' whose time resolution is coarse-grained and
>     non-deterministic from computer to computer (based on how hz was
>     configured).
>
>     Since you seem to be depending on fine-resolution timers more and
>     more in recent kernels, you should consider porting our SYSTIMER API
>     to virtualize one-shot and periodic-timers. Look at kern/kern_systimer.c
>     in the DragonFly source. The code is fairly well abstracted, operates
>     on a per-cpu basis, and even though you don't have generic IPI messaging
>     I think you could port it without too much trouble.
>
>     If you port it and start using it you will quickly find that you can't
>     live without it. e.g. take a look at how we implement network POLLING for
>     an example of its use. The polling rate can be set to anything at
>     any time, regardless of 'hz'. Same goes for interrupt rate limiting,
>     various scheduler timers, and a number of other things. All the things
>     that should be divorced from 'hz' have been.
>
>     For people worried about edge conditions due to multiple unsynchronized
>     timers going off I will note that it's never been an issue for us, and
>     in any case it's fairly trivial to adjust the systimer code to synchronize
>     periodic time bases which run at integer multiples to timeout at the
>     same time.
>     Most periodic time bases tend to operate in this fashion
>     (the stat clock being the only notable exception) so full efficiency
>     can be retained. But, as I said, I've actually done that and not
>     noticed any significant improvement in performance so I just don't bother
>     right now.

Matt,

I've been testing network and routing performance over the past two weeks
with a calibrated Agilent N2X packet generator. My test box is a dual
Opteron 852 (2.6 GHz) with a Tyan S8228 mobo and an Intel dual-GigE in a
PCI-X-133 slot. Note that I've run all tests with UP kernels, em0->em1.

For stock FreeBSD-7-CURRENT from 28 Dec 2005 I got 580kpps with
fast-forward enabled. An em(4) patch from Scott Long implementing a
taskqueue raised this to 729kpps.

For stock DragonFlyBSD-1.4-RC1 I got 327kpps, and then it breaks down and
never passes a packet again until a down/up on the receiving interface.
net.inet.ip.intr_queue_maxlen has to be set to 200; otherwise it breaks
down at 252kpps already. Enabling polling did not make a difference, and
I tried various settings and combinations without any apparent effect on
performance (burst=1000, each_burst=50, user_frac=1, pollhz=5000).

What surprised me most, apart from the generally poor performance, is the
sharp dropoff after max pps and the wedging of the interface. I didn't see
this kind of behaviour on any other OS I've tested (FreeBSD and OpenBSD).

--
Andre
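
For scale, the forwarding rates quoted in this message can be turned into a
per-packet budget. The sketch below is an illustration only and not part of
the measurements: the function name budget() and the table labels are
hypothetical, and it assumes the whole 2.6 GHz clock of one CPU is spent on
forwarding, which a real UP kernel will not quite achieve.

```python
# Per-packet time and cycle budget implied by a forwarding rate.
# The kpps figures and the 2.6 GHz clock come from the message; treating
# the full clock as available to forwarding is an assumption.

CPU_HZ = 2.6e9  # Opteron 852 clock as stated in the message

# Rates in kpps as reported in the message (labels are descriptive only).
RATES_KPPS = {
    "FreeBSD 7-CURRENT, fast-forward": 580,
    "FreeBSD 7-CURRENT, em(4) taskqueue patch": 729,
    "DragonFlyBSD 1.4-RC1, intr_queue_maxlen=200": 327,
    "DragonFlyBSD 1.4-RC1, intr_queue_maxlen not raised": 252,
}


def budget(kpps, cpu_hz=CPU_HZ):
    """Return (nanoseconds per packet, CPU cycles per packet)."""
    pps = kpps * 1000
    ns_per_packet = 1e9 / pps
    cycles_per_packet = cpu_hz / pps
    return ns_per_packet, cycles_per_packet


if __name__ == "__main__":
    for label, kpps in RATES_KPPS.items():
        ns, cycles = budget(kpps)
        print(f"{label}: {kpps} kpps = {ns:.0f} ns/packet, "
              f"{cycles:.0f} cycles/packet")
```

Under that assumption, 580kpps leaves roughly 1.7 microseconds per packet
and 729kpps roughly 1.4 microseconds.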