Date:      Wed, 6 Jun 2007 23:24:04 -0700
From:      "Kip Macy" <kip.macy@gmail.com>
To:        "Bruce Evans" <brde@optusnet.com.au>
Cc:        src-committers@freebsd.org, John Baldwin <jhb@freebsd.org>, cvs-src@freebsd.org, cvs-all@freebsd.org, Attilio Rao <attilio@freebsd.org>, Kostik Belousov <kostikbel@gmail.com>, Jeff Roberson <jroberson@chesapeake.net>
Subject:   Re: cvs commit: src/sys/kern kern_mutex.c
Message-ID:  <b1fa29170706062324p793ac8e2ga8dc5bf8ba151a60@mail.gmail.com>
In-Reply-To: <20070607133524.S7002@besplex.bde.org>
References:  <200706051420.l55EKEih018925@repoman.freebsd.org> <3bbf2fe10706050829o2d756a4cu22f98cf11c01f5e4@mail.gmail.com> <3bbf2fe10706050843x5aaafaafy284e339791bcfe42@mail.gmail.com> <200706051230.21242.jhb@freebsd.org> <20070606094354.E51708@delplex.bde.org> <20070605195839.I606@10.0.0.1> <20070606154548.F3105@besplex.bde.org> <20070607133524.S7002@besplex.bde.org>

Bruce -
  Can you also say how many runs you do and how much variance there
is between runs?
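(For reference, this is the kind of summary I mean; a minimal sketch, where the
run times are placeholders rather than measured values:)

```python
import statistics

# Hypothetical "real" times (seconds) from repeated makeworld runs;
# the numbers below are placeholders, not measured values.
runs = [847.70, 843.34, 845.10, 846.02]

mean = statistics.mean(runs)          # average real time across runs
stdev = statistics.stdev(runs)        # sample standard deviation
spread = max(runs) - min(runs)        # worst-case run-to-run difference

print(f"n={len(runs)} mean={mean:.2f}s stdev={stdev:.2f}s spread={spread:.2f}s")
```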


Thanks.
          -Kip


On 6/6/07, Bruce Evans <brde@optusnet.com.au> wrote:
> On Wed, 6 Jun 2007, Bruce Evans wrote:
>
> > On Tue, 5 Jun 2007, Jeff Roberson wrote:
>
> >> You should try with kern.sched.pick_pri = 0.  I have changed this to be the
> >> default recently.  This weakens the preemption and speeds up some
> >> workloads.
> >
> > I haven't tried a new SCHED_ULE kernel yet.
>
> Tried now.  In my makeworld benchmark, SCHED_ULE is now only 4% slower
> than SCHED_4BSD (after losing 2% in SCHED_4BSD) (down from about 7%
> slower).  The difference is still from CPUs idling too much.
>
> Best result ever (SCHED_4BSD, June 4 kernel, no PREEMPTION):
> ---
>        827.48 real      1309.26 user       186.86 sys
>     1332122  voluntary context switches
>     1535129  involuntary context switches
> pagezero time 6 seconds
> ---
>
> After thread lock changes (SCHED_4BSD, no PREEMPTION):
> ---
>        847.70 real      1309.83 user       169.39 sys
>     2933415  voluntary context switches
>     1501808  involuntary context switches
> pagezero time 30 seconds.
>
> Unlike what I wrote before, there is a scheduling bug that affects
> pagezero directly.  The bug from last month involving pagezero losing
> its priority of PRI_MAX_IDLE and running at priority PUSER is back.
> This bug seemed to be gone in the June 4 kernel, but actually only
> happens less there.  This bug seems to cost 0.5-1.0% real time.
> ---
>
> After thread lock changes (SCHED_4BSD, now with PREEMPTION):
> ---
>        843.34 real      1304.00 user       168.87 sys
>     1651011  voluntary context switches
>     1630988  involuntary context switches
> pagezero time 27 seconds
>
> The problem with the extra context switches is gone (these context switch
> counts are like the ones in old kernels with PREEMPTION).  This result is
> affected by pagezero getting its priority clobbered.  The best result for
> an old kernel with PREEMPTION was about 840 seconds, before various
> optimizations reduced this to 827 seconds (-0+4 seconds).
> ---
>
> Old run with SCHED_ULE (Mar 18):
>        899.50 real      1311.00 user       187.47 sys
>     1566366  voluntary context switches
>     1959436  involuntary context switches
> pagezero time 19 seconds
> ---
>
> Today with SCHED_ULE:
> ---
>        883.65 real      1290.92 user       188.21 sys
>     1658109  voluntary context switches
>     1708148  involuntary context switches
> pagezero time 7 seconds.
> ---
>
> In all of these, the user + sys decomposition is very inaccurate, but the
> (user + sys + pagezero_time) total is fairly accurate.  It is 1500+-2 for
> SCHED_4BSD and 1500+-17 for SCHED_ULE (old ULE larger, current ULE smaller).
>
> SCHED_ULE now shows interesting behaviour for non-parallel kernel
> builds on a 2-way SMP machine.  It is now slightly faster than SCHED_4BSD
> for this, but still much slower for parallel kernel builds.  This might
> be because it likes to leave 1 CPU idle to wait to find a better CPU to
> run on, and this is actually an optimization when there is >= 1 CPU to
> spare:
>
> RELENG_4 kernel build on nfs, non-parallel make.
> Best ever with SCHED_ULE (~June 4 kernel):
>         62.55 real        55.30 user         3.65 sys
> Current with SCHED_ULE:
>         62.18 real        54.91 user         3.51 sys
>
> RELENG_4 kernel build on nfs, make -j4.
> Best ever for SCHED_ULE (~June 4 kernel):
>         32.00 real        56.98 user         3.90 sys
> Current with SCHED_ULE:
>         33.11 real        56.01 user         4.12 sys
> ULE has been about 1 second slower for this since at least last November.
> It presumably reduces user+sys time by running pagezero more.
>
> The slowdown is much larger for a build on ffs:
>
> Non-parallel results not shown (little difference from above).
>
> RELENG_4 kernel build on ffs, make -j4.
> Best ever for SCHED_ULE (~June 4 kernel):
>         29.94 real        56.03 user         3.12 sys
> Current with SCHED_ULE:
>         32.63 real        55.13 user         3.53 sys
> Now 9% of the real time (= 18% of the cycles on one CPU = almost the
> sys overhead) is apparently wasted by leaving one CPU idle.  This
> benchmark is of course dominated by many instances of 2 gcc hogs which
> should be scheduled to run in parallel with no idle cycles.  (In all
> these kernel benchmarks, everything except disk writes is cached before
> starting.  In other makeworld benchmarks, everything is cached before
> starting on the nfs server, while on the client nothing is cached.)
>
> I don't have context switch counts or pagezero times for the kernel builds.
> stathz is 100 = hz.  Maybe SCHED_ULE doesn't like this.  hz = 100 is
> about 1% faster than hz = 1000 for the makeworld benchmark.
>
> Bruce
>


