Date:      Sun, 28 May 2017 20:16:53 +0000
From:      Rick Macklem <rmacklem@uoguelph.ca>
To:        "freebsd-current@freebsd.org" <freebsd-current@freebsd.org>
Cc:        Andriy Gapon <avg@FreeBSD.org>, "cem@freebsd.org" <cem@freebsd.org>, "jeff@freebsd.org" <jeff@freebsd.org>, Ryan Stone <rstone@FreeBSD.org>, "Colin Percival" <cperciva@freebsd.org>
Subject:   Re: NFS client perf. degradation when SCHED_ULE is used (was when SMP enabled)
Message-ID:  <YTXPR01MB0189CF0DCA909BA23F4A4F7BDDF20@YTXPR01MB0189.CANPRD01.PROD.OUTLOOK.COM>
In-Reply-To: <YTXPR01MB0189FAF118B27C0E6F9B169EDDFD0@YTXPR01MB0189.CANPRD01.PROD.OUTLOOK.COM>
References:  <YTXPR01MB01894DA2879C95E634C792D9DDFC0@YTXPR01MB0189.CANPRD01.PROD.OUTLOOK.COM>, <YTXPR01MB0189FAF118B27C0E6F9B169EDDFD0@YTXPR01MB0189.CANPRD01.PROD.OUTLOOK.COM>

I wrote:
[stuff snipped]
> So, I'd say either reverting the patch or replacing it with the "obvious change" mentioned
> in the commit message will at least mostly fix the problem.
"mostly fix" was probably a bit optimistic. Here are my current #s.
(All cases are the same single threaded kernel build, same hardware, etc. The only
 changes are recent vs 1yr old head kernel and what is noted.)

- 1yr old kernel, SMP, SCHED_ULE                       94 minutes
- 1yr old kernel, no SMP, SCHED_ULE                   111 minutes

- recent kernel, SMP, SCHED_4BSD                      104 minutes
- recent kernel, no SMP, SCHED_ULE                    113 minutes
- recent kernel, SMP, SCHED_ULE, r312426 reverted     122 minutes
- recent kernel, SMP, SCHED_ULE                       148 minutes

So, reverting r312426 only gets rid of about 1/2 of the degradation.
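
As a rough check of the "about 1/2" figure, the arithmetic can be sketched in Python using the elapsed times listed above. The variable names are illustrative only, and the 1yr old SMP/SCHED_ULE run is assumed to be the baseline the degradation is measured against.

```python
# Elapsed times in minutes, copied from the list above.
baseline = 94    # 1yr old kernel, SMP, SCHED_ULE (assumed baseline)
recent = 148     # recent kernel, SMP, SCHED_ULE
reverted = 122   # recent kernel, SMP, SCHED_ULE, r312426 reverted

degradation = recent - baseline      # total slowdown: 54 minutes
recovered = recent - reverted        # gained back by the revert: 26 minutes
fraction = recovered / degradation   # share of the slowdown removed

print(f"degradation: {degradation} min, recovered: {recovered} min, "
      f"fraction: {fraction:.2f}")
# prints "degradation: 54 min, recovered: 26 min, fraction: 0.48"
```

That leaves 28 of the 54 minutes unexplained by r312426 alone.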
One more thing I will note is that the system CPU is higher for the cases that run
with lower/better elapsed times:
- 1yr old kernel, SMP, SCHED_ULE        545s
- 1yr old kernel, no SMP, SCHED_ULE     293s
- recent kernel, no SMP, SCHED_ULE      292s
- recent kernel, SMP, SCHED_ULE         466s

cperciva@ is running a highly parallelized buildworld and he sees
slightly better elapsed times and much lower system CPU for SCHED_ULE.

As such, I suspect it is the single threaded, processes mostly sleeping waiting
for I/O case that is broken.
I suspect this is how many people use NFS, since a highly parallelized make would
not be a typical NFS client task, I think?

There are other changes to sched_ule.c in the last year, but I'm not sure which
would be easy to revert and might make a difference in this case?

rick
ps: I've cc'd cperciva@ and he might wish to report his results. I am hoping he
      does try a make without "-j" at some point.


