Date: Thu, 09 Dec 2010 10:22:28 +0800 From: David Xu <davidxu@freebsd.org> To: John Baldwin <jhb@freebsd.org> Cc: freebsd-current@freebsd.org, Steve Kargl <sgk@troutmask.apl.washington.edu> Subject: Re: Process accounting/timing has broken recently Message-ID: <4D003D64.8010409@freebsd.org> In-Reply-To: <201012080929.22825.jhb@freebsd.org> References: <20101205231829.GA68156@troutmask.apl.washington.edu> <201012070913.17118.jhb@freebsd.org> <4CFEF34D.1030701@freebsd.org> <201012080929.22825.jhb@freebsd.org>
next in thread | previous in thread | raw e-mail | index | archive | help
John Baldwin wrote: > On Tuesday, December 07, 2010 9:54:05 pm David Xu wrote: >> John Baldwin wrote: >>> On Monday, December 06, 2010 7:11:28 pm David Xu wrote: >>>> John Baldwin wrote: >>>>> On Sunday, December 05, 2010 6:18:29 pm Steve Kargl wrote: >>>>> >>>>>> Sometime in the last 7-10 days, some one made a >>>>>> change that has broken process accounting/timing. >>>>>> >>>>>> laptop:kargl[42] foreach i ( 0 1 2 3 4 5 6 7 8 9 ) >>>>>> foreach? time ./testf >>>>>> foreach? end >>>>>> Max ULP: 0.501607 for x in [-18.000000:88.709999] with dx = 1.067100e-04 >>>>>> 69.55 real 38.39 user 30.94 sys >>>>>> Max ULP: 0.501607 for x in [-18.000000:88.709999] with dx = 1.067100e-04 >>>>>> 68.82 real 40.95 user 27.60 sys >>>>>> Max ULP: 0.501607 for x in [-18.000000:88.709999] with dx = 1.067100e-04 >>>>>> 69.14 real 38.90 user 30.02 sys >>>>>> Max ULP: 0.501607 for x in [-18.000000:88.709999] with dx = 1.067100e-04 >>>>>> 68.79 real 40.59 user 27.99 sys >>>>>> Max ULP: 0.501607 for x in [-18.000000:88.709999] with dx = 1.067100e-04 >>>>>> 68.93 real 39.76 user 28.96 sys >>>>>> Max ULP: 0.501607 for x in [-18.000000:88.709999] with dx = 1.067100e-04 >>>>>> 68.71 real 41.21 user 27.29 sys >>>>>> Max ULP: 0.501607 for x in [-18.000000:88.709999] with dx = 1.067100e-04 >>>>>> 69.05 real 39.68 user 29.15 sys >>>>>> Max ULP: 0.501607 for x in [-18.000000:88.709999] with dx = 1.067100e-04 >>>>>> 68.99 real 39.98 user 28.80 sys >>>>>> Max ULP: 0.501607 for x in [-18.000000:88.709999] with dx = 1.067100e-04 >>>>>> 69.02 real 39.64 user 29.16 sys >>>>>> Max ULP: 0.501607 for x in [-18.000000:88.709999] with dx = 1.067100e-04 >>>>>> 69.38 real 37.49 user 31.67 sys >>>>>> >>>>>> testf is a numerically intensive program that tests the >>>>>> accuracy of expf() in a tight loop. User time varies >>>>>> by ~3 seconds on my lightly loaded 2 GHz core2 duo processor. >>>>>> I'm fairly certain that the code does not suddenly grow/loose >>>>>> 6 GFLOP of operations. >>>>>> >>>>> The user/sys thing is a hack (and has been). We sample the PC at stathz (~128 >>>>> hz) to figure out a user vs sys split and use that to divide up the total >>>>> runtime (which actually is fairly accurate). All you need is for the clock >>>>> ticks to fire just a bit differently between runs to get a swing in user vs >>>>> system time. >>>>> >>>>> What I would like is to keep separate raw bintime's for user vs system time in >>>>> the raw data instead, but that would involve checking the CPU ticker more >>>>> often (e.g. twice for each syscall, interrupt, and trap in addition to the >>>>> current once per context switch). So far folks seem to be more worried about >>>>> the extra overhead rather than the loss of accuracy. >>>>> >>>>> >>>> Adding any instruction into global syscall path should be cautioned, it >>>> is worse then before, thinking about a threaded application, a userland >>>> thread may have locked a mutex and calls a system call, the overhead >>>> added to system call path can directly affect a threaded application's >>>> performance now, because the time window the mutex is held >>>> is longer than before, I have seen some people likes to fiddle with >>>> system call path, it should be cautioned. >>> OTOH, the current getrusage(2) stats cannot be trusted. The only meaningful >>> thing you can do is to sum them since the total is known to be accurate at >>> least. >>> >>> If it wouldn't make things so messy I'd consider a new kernel option >>> 'ACCURATE_RUSAGE' or some such. >>> >> Our getrusage is already very slow, everytime, it needs to >> iterate the threads list with a process SLOCK held. I saw some mysql >> versions heavily use getrusage, and a horribly slow. I think a >> ACCURATE_RUSAGE will make it worse ? > > Using 'time foo' only retrieves the usage once at the end via wait(). > > However, FWIW, I think ACCURATE_RUSAGE would simplify calcru1() quite a bit > since you would just sum up the thread's usage (as we do know), but then do a > direct conversion from bintime to timeval for user and system without ever > having to worry about time going backwards, etc. All the hacks to enforce > monotonicity that are currently in place would go away since we would have > the real measurements and not be inferring them from statclock ticks. > I don't worry getrusage, though it is not necessary that it is used with wait(), I saw mysql used it to measure its internal code's performance, it did not use it with wait(), it is questionable if this usage is correct. However, I worry anyone who adds overhead to system call path.
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?4D003D64.8010409>