Date:      Thu, 15 Dec 2011 17:26:27 +0100
From:      Attilio Rao <attilio@freebsd.org>
To:        Jeremy Chadwick <freebsd@jdc.parodius.com>
Cc:        "O. Hartmann" <ohartman@mail.zedat.fu-berlin.de>, Current FreeBSD <freebsd-current@freebsd.org>, freebsd-stable@freebsd.org, freebsd-performance@freebsd.org
Subject:   Re: SCHED_ULE should not be the default
Message-ID:  <CAJ-FndCoxXV-dOT4QAzt-Qs%2BzUyCGfeFPgbAx%2BpTot8SrVXA7w@mail.gmail.com>
In-Reply-To: <20111213073615.GA69641@icarus.home.lan>
References:  <4EE1EAFE.3070408@m5p.com> <4EE22421.9060707@gmail.com> <4EE6060D.5060201@mail.zedat.fu-berlin.de> <20111213073615.GA69641@icarus.home.lan>

2011/12/13 Jeremy Chadwick <freebsd@jdc.parodius.com>:
> On Mon, Dec 12, 2011 at 02:47:57PM +0100, O. Hartmann wrote:
>> > Not fully right, boinc defaults to run on idprio 31 so this isn't an
>> > issue. And yes, there are cases where SCHED_ULE shows much better
>> > performance than SCHED_4BSD.  [...]
>>
>> Do we have any proof at hand for such cases where SCHED_ULE performs
>> much better than SCHED_4BSD? Whenever the subject comes up, it is
>> mentioned that SCHED_ULE has better performance on boxes with
>> ncpu > 2. But in the end I see contradictory statements here. People
>> complain about poor performance (especially in scientific environments),
>> and others counter that this is not the case.
>>
>> Within our department, we developed a highly scalable code for planetary
>> science purposes on imagery. It utilizes GPUs via OpenCL if
>> present. Otherwise it grabs as many cores as it can.
>> By the end of this year I'll get a new desktop box based on Intel's new
>> Sandy Bridge-E architecture with plenty of memory. If the colleague who
>> developed the code is willing to perform some benchmarks on the same
>> hardware platform, we'll benchmark both FreeBSD 9.0/10.0 and the most
>> recent Suse. For FreeBSD I also intend to look at performance with both
>> available schedulers.
>
> This is in no way shape or form the same kind of benchmark as what
> you're planning to do, but I thought I'd throw it out there for folks to
> take in as they see fit.
>
> I know folks were focused mainly on buildworld.
>
> I personally would find it interesting if someone with a higher-end
> system (e.g. 2 physical CPUs, with 6 or 8 cores per CPU) was to do the
> same test (changing -jX to -j{numofcores} of course).
>
> --
> | Jeremy Chadwick                                jdc at parodius.com |
> | Parodius Networking                       http://www.parodius.com/ |
> | UNIX Systems Administrator                   Mountain View, CA, US |
> | Making life hard for others since 1977.               PGP 4BD6C0CB |
>
>
> sched_ule
> ===========
> - time make -j2 buildworld
>  1689.831u 229.328s 18:46.20 170.4% 6566+2051k 432+4264io 4565pf+0w
> - time make -j2 buildkernel
>  640.542u 87.737s 9:01.38 134.5% 6490+1920k 134+5968io 0pf+0w
>
>
> sched_4bsd
> ============
> - time make -j2 buildworld
>  1662.793u 206.908s 17:12.02 181.1% 6578+2054k 23750+4271io 6451pf+0w
> - time make -j2 buildkernel
>  638.717u 76.146s 8:34.90 138.8% 6530+1927k 6415+5903io 0pf+0w
>
>
> software
> ==========
> * sched_ule test:  FreeBSD 8.2-STABLE, Thu Dec  1 04:37:29 PST 2011
> * sched_4bsd test: FreeBSD 8.2-STABLE, Mon Dec 12 22:42:54 PST 2011

Hi Jeremy,
thanks for the time you spent on this.

However, I wanted to ask about, or point out, 3 things:
1) Did you use 2 different code bases for the tests? (one updated on
December 1 and the other on December 12)
2) Please note that you should repeat this test several times
(basically until ministat shows an acceptable standard deviation)
and report the ministat output.
3) The difference is less than 2%, which I suspect is not
statistically significant, i.e. the results are effectively the same.
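
To make point 2 a bit more concrete, here is a minimal Python sketch of a
pooled-variance Student's t comparison, similar in spirit to what ministat
reports. The function names, the small table of 95% critical values and
the sample timings in the comments are assumptions made for illustration;
none of them are measurements from this thread.

```python
import math
import statistics

# Two-sided 95% critical values of Student's t, keyed by degrees of freedom.
# Only df 1..10 are listed; larger sample sets need a fuller table.
T95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
       6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def elapsed_seconds(field):
    """Convert a csh `time` elapsed field such as '18:46.20' to seconds."""
    total = 0.0
    for part in field.split(":"):
        total = total * 60 + float(part)
    return total


def compare(a, b):
    """Compare two lists of timings (seconds) with a pooled-variance t-test.

    Returns (difference of means b - a, 95% margin, significant?).
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * statistics.variance(a) +
                  (nb - 1) * statistics.variance(b)) / df
    stderr = math.sqrt(pooled_var * (1 / na + 1 / nb))
    diff = statistics.mean(b) - statistics.mean(a)
    margin = T95[df] * stderr
    return diff, margin, abs(diff) > margin


if __name__ == "__main__":
    # Hypothetical elapsed times from five runs per scheduler (not real data).
    runs_a = [1030.0, 1034.0, 1028.0, 1036.0, 1032.0]
    runs_b = [1040.0, 1046.0, 1038.0, 1044.0, 1042.0]
    print(compare(runs_a, runs_b))
```

With a single run per scheduler there is no variance to estimate, so no
such comparison is possible; that is why the repetitions matter.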

I'm not even really surprised that ULE is not faster than 4BSD in this
case, because buildworld/buildkernel tests are usually dominated by I/O
overhead rather than by scheduler capacity. It would be more interesting
to analyze how buildworld behaves while another type of workload is
running.
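
As a rough sketch of that kind of test, the Python snippet below times a
command while a number of CPU-bound processes spin in the background. The
helper name time_under_load and the choice of busy-loop spinners as the
competing workload are assumptions for illustration only; a real test
would pick a background workload that matches the case of interest.

```python
import subprocess
import sys
import time


def time_under_load(cmd, hogs):
    """Run cmd while `hogs` CPU-bound processes spin; return wall-clock seconds."""
    # Each spinner is a separate interpreter stuck in a busy loop.
    spinner = [sys.executable, "-c", "while True: pass"]
    background = [subprocess.Popen(spinner) for _ in range(hogs)]
    try:
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        return time.perf_counter() - start
    finally:
        # Always stop the spinners, even if the command fails.
        for proc in background:
            proc.terminate()
        for proc in background:
            proc.wait()


if __name__ == "__main__":
    # Hypothetical usage: time a trivial command against two spinners.
    print(time_under_load([sys.executable, "-c", "pass"], 2))
```

Running the same command under each scheduler with the same number of
spinners, several times each, would give samples that can then be compared
statistically.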

Thanks,
Attilio


-- 
Peace can only be achieved by understanding - A. Einstein


