From owner-freebsd-current@FreeBSD.ORG Wed Dec 8 02:54:07 2010 Return-Path: Delivered-To: freebsd-current@freebsd.org Received: from mx1.freebsd.org (mx1.freebsd.org [IPv6:2001:4f8:fff6::34]) by hub.freebsd.org (Postfix) with ESMTP id 8E2C0106566B; Wed, 8 Dec 2010 02:54:07 +0000 (UTC) (envelope-from davidxu@freebsd.org) Received: from freefall.freebsd.org (freefall.freebsd.org [IPv6:2001:4f8:fff6::28]) by mx1.freebsd.org (Postfix) with ESMTP id 604A58FC16; Wed, 8 Dec 2010 02:54:07 +0000 (UTC) Received: from xyf.my.dom (localhost [127.0.0.1]) by freefall.freebsd.org (8.14.4/8.14.4) with ESMTP id oB82s5nx082060; Wed, 8 Dec 2010 02:54:06 GMT (envelope-from davidxu@freebsd.org) Message-ID: <4CFEF34D.1030701@freebsd.org> Date: Wed, 08 Dec 2010 10:54:05 +0800 From: David Xu User-Agent: Thunderbird 2.0.0.24 (X11/20100630) MIME-Version: 1.0 To: John Baldwin References: <20101205231829.GA68156@troutmask.apl.washington.edu> <201012060944.03196.jhb@freebsd.org> <4CFD7BB0.20500@freebsd.org> <201012070913.17118.jhb@freebsd.org> In-Reply-To: <201012070913.17118.jhb@freebsd.org> Content-Type: text/plain; charset=ISO-8859-1; format=flowed Content-Transfer-Encoding: 7bit Cc: freebsd-current@freebsd.org, Steve Kargl Subject: Re: Process accounting/timing has broken recently X-BeenThere: freebsd-current@freebsd.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: Discussions about the use of FreeBSD-current List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 08 Dec 2010 02:54:07 -0000 John Baldwin wrote: > On Monday, December 06, 2010 7:11:28 pm David Xu wrote: >> John Baldwin wrote: >>> On Sunday, December 05, 2010 6:18:29 pm Steve Kargl wrote: >>> >>>> Sometime in the last 7-10 days, some one made a >>>> change that has broken process accounting/timing. >>>> >>>> laptop:kargl[42] foreach i ( 0 1 2 3 4 5 6 7 8 9 ) >>>> foreach? time ./testf >>>> foreach? end >>>> Max ULP: 0.501607 for x in [-18.000000:88.709999] with dx = 1.067100e-04 >>>> 69.55 real 38.39 user 30.94 sys >>>> Max ULP: 0.501607 for x in [-18.000000:88.709999] with dx = 1.067100e-04 >>>> 68.82 real 40.95 user 27.60 sys >>>> Max ULP: 0.501607 for x in [-18.000000:88.709999] with dx = 1.067100e-04 >>>> 69.14 real 38.90 user 30.02 sys >>>> Max ULP: 0.501607 for x in [-18.000000:88.709999] with dx = 1.067100e-04 >>>> 68.79 real 40.59 user 27.99 sys >>>> Max ULP: 0.501607 for x in [-18.000000:88.709999] with dx = 1.067100e-04 >>>> 68.93 real 39.76 user 28.96 sys >>>> Max ULP: 0.501607 for x in [-18.000000:88.709999] with dx = 1.067100e-04 >>>> 68.71 real 41.21 user 27.29 sys >>>> Max ULP: 0.501607 for x in [-18.000000:88.709999] with dx = 1.067100e-04 >>>> 69.05 real 39.68 user 29.15 sys >>>> Max ULP: 0.501607 for x in [-18.000000:88.709999] with dx = 1.067100e-04 >>>> 68.99 real 39.98 user 28.80 sys >>>> Max ULP: 0.501607 for x in [-18.000000:88.709999] with dx = 1.067100e-04 >>>> 69.02 real 39.64 user 29.16 sys >>>> Max ULP: 0.501607 for x in [-18.000000:88.709999] with dx = 1.067100e-04 >>>> 69.38 real 37.49 user 31.67 sys >>>> >>>> testf is a numerically intensive program that tests the >>>> accuracy of expf() in a tight loop. User time varies >>>> by ~3 seconds on my lightly loaded 2 GHz core2 duo processor. >>>> I'm fairly certain that the code does not suddenly grow/loose >>>> 6 GFLOP of operations. >>>> >>> The user/sys thing is a hack (and has been). We sample the PC at stathz (~128 >>> hz) to figure out a user vs sys split and use that to divide up the total >>> runtime (which actually is fairly accurate). All you need is for the clock >>> ticks to fire just a bit differently between runs to get a swing in user vs >>> system time. >>> >>> What I would like is to keep separate raw bintime's for user vs system time in >>> the raw data instead, but that would involve checking the CPU ticker more >>> often (e.g. twice for each syscall, interrupt, and trap in addition to the >>> current once per context switch). So far folks seem to be more worried about >>> the extra overhead rather than the loss of accuracy. >>> >>> >> Adding any instruction into global syscall path should be cautioned, it >> is worse then before, thinking about a threaded application, a userland >> thread may have locked a mutex and calls a system call, the overhead >> added to system call path can directly affect a threaded application's >> performance now, because the time window the mutex is held >> is longer than before, I have seen some people likes to fiddle with >> system call path, it should be cautioned. > > OTOH, the current getrusage(2) stats cannot be trusted. The only meaningful > thing you can do is to sum them since the total is known to be accurate at > least. > > If it wouldn't make things so messy I'd consider a new kernel option > 'ACCURATE_RUSAGE' or some such. > Our getrusage is already very slow, everytime, it needs to iterate the threads list with a process SLOCK held. I saw some mysql versions heavily use getrusage, and a horribly slow. I think a ACCURATE_RUSAGE will make it worse ?